issues
search
JonasGeiping
/
cramming
Cramming the training of a (BERT-type) language model into limited compute.
MIT License
1.29k
stars
100
forks
source link
Determinist and update train.scheduler
#47
Closed
BoyuanFeng
closed
3 months ago