issues
search
JonasGeiping
/
cramming
Cramming the training of a (BERT-type) language model into limited compute.
MIT License
1.3k
stars
100
forks
source link
Fix bug in batch size schedule
#20
Closed
JeanKaddour
closed
1 year ago
JonasGeiping
commented
1 year ago
Oh no, I never upstreamed this fix. Thanks!
Oh no, I never upstreamed this fix. Thanks!