Closed hairzooc closed 4 years ago
Hi, I'm not kamalkraj, but I think training from scratch is difficult because the pretraining data is very large. If you have only one GPU or so, it would still take at least a week to train.
On Tue, Nov 12, 2019 at 10:44 hairzooc wrote:
Hi, thanks for your code :) It's very helpful for me in studying ALBERT. As far as I know, the ALBERT paper uses a batch size of 4096. Have you ever tried pretraining from scratch on GPUs? I've seen your guide for SQuAD fine-tuning but couldn't find any information about pretraining from scratch. Please let me know if you have any info on that.
Thanks for your reply. :) Taking 1 week is not a problem for me and I have 8 x TITAN RTX 24GB for now.
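For reference, here is a rough sketch of how a batch size of 4096 could be approximated on 8 GPUs with gradient accumulation. This is only arithmetic, not something this repo is known to support: the helper `accumulation_steps` is hypothetical, and the per-GPU batch size of 32 is an assumption (it has not been measured against 24 GB of memory).

```python
import math


def accumulation_steps(global_batch: int, num_gpus: int, per_gpu_batch: int) -> int:
    """Micro-batches each GPU accumulates before one optimizer step (hypothetical helper)."""
    samples_per_pass = num_gpus * per_gpu_batch  # samples seen in one forward/backward pass
    return math.ceil(global_batch / samples_per_pass)


GLOBAL_BATCH = 4096  # batch size cited from the ALBERT paper in this thread
NUM_GPUS = 8         # 8 x TITAN RTX, as stated above
PER_GPU_BATCH = 32   # assumption: depends on sequence length and model size

steps = accumulation_steps(GLOBAL_BATCH, NUM_GPUS, PER_GPU_BATCH)
effective_batch = steps * NUM_GPUS * PER_GPU_BATCH
print(f"accumulate {steps} micro-batches per step, effective batch {effective_batch}")
```

If the per-GPU batch does not divide evenly, the effective batch ends up slightly above 4096, so the learning rate schedule may need a matching adjustment.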
Wow!! Great
@hairzooc Hi, were you able to train your model? How long did it take, and how did it perform?