ConnorJL / GPT2

An implementation of training for GPT2, supports TPUs
MIT License

Training 1.5B? #33

Open JulesGM opened 3 years ago

JulesGM commented 3 years ago

Hello,

Were you able to train the 1.5B model, or the large model, on TPUs? As far as I know, it's too large to fit. I'd really like to know whether you succeeded. Thanks.
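For context on the "too large to fit" worry, here is a rough back-of-the-envelope sketch, not a measurement. It assumes fp32 values, an Adam-style optimizer (weights, gradients, and two moment buffers, so four copies of the parameters), and it ignores activations entirely. The per-core memory figures (8 GiB for a TPU v2 core, 16 GiB for a v3 core) are my assumptions as well, and the function and variable names are made up for illustration.

```python
def training_memory_gib(n_params, bytes_per_value=4, copies=4):
    """Rough memory in GiB for `copies` full-size buffers of the parameters.

    copies=1 counts the weights alone; copies=4 assumes an Adam-style
    optimizer (weights + gradients + two moment buffers). Activations
    are NOT included, so real usage would be higher.
    """
    return n_params * bytes_per_value * copies / 2**30


N_PARAMS = 1.5e9  # the "1.5B" figure from the issue title

# Assumed per-core memory, not taken from this repo.
ASSUMED_CORE_MEMORY_GIB = {"TPU v2 core": 8, "TPU v3 core": 16}

weights_gib = training_memory_gib(N_PARAMS, copies=1)
adam_gib = training_memory_gib(N_PARAMS, copies=4)

print(f"weights only:        {weights_gib:.1f} GiB")
print(f"with Adam-style opt: {adam_gib:.1f} GiB")
for name, capacity in ASSUMED_CORE_MEMORY_GIB.items():
    verdict = "fits" if adam_gib <= capacity else "does not fit"
    print(f"{name} ({capacity} GiB): {verdict} before activations")
```

Under those assumptions the optimizer state alone is larger than a single core's memory, which is why I'm asking. A more memory-efficient optimizer or splitting the model across cores would change the picture.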