Pretraining from scratch

hairzooc commented 5 years ago

Hi, thanks for your code :) It's very helpful for me in studying ALBERT. As far as I know, the ALBERT batch size is 4096 in the paper. Have you ever tried to pretrain from scratch on GPUs? I've seen your guide for SQuAD fine-tuning but couldn't find any information about pretraining from scratch. Please let me know if you have any info on that.

pohanchi commented 5 years ago

Hi, I am not kamalkraj, but I think training from scratch is difficult because the dataset is very large. If you have just one GPU or so, it would still need at least one week to train.


hairzooc commented 5 years ago

Thanks for your reply. :) Taking 1 week is not a problem for me and I have 8 x TITAN RTX 24GB for now.
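For reference, a rough sketch of the arithmetic behind reaching the paper's batch size of 4096 on 8 GPUs by accumulating gradients over several smaller batches. The helper name `accumulation_steps` and the per-GPU batch of 32 are placeholders picked for illustration, not measured limits for a TITAN RTX, and whether the training script in this repo supports gradient accumulation is not confirmed in this thread.

```python
import math


def accumulation_steps(global_batch: int, num_gpus: int, per_gpu_batch: int) -> int:
    """Number of micro-batches each GPU accumulates before one optimizer step."""
    # Examples processed across all GPUs in a single forward/backward pass.
    examples_per_pass = num_gpus * per_gpu_batch
    # Round up so the effective batch is at least the target size.
    return math.ceil(global_batch / examples_per_pass)


# 4096 is the batch size quoted from the paper; 8 is the GPU count mentioned above.
# per_gpu_batch=32 is an assumed value; the real limit depends on sequence length
# and model size.
steps = accumulation_steps(global_batch=4096, num_gpus=8, per_gpu_batch=32)
print(steps)  # 16
```

If the per-GPU batch does not divide the target evenly, the rounding up means the effective batch ends up slightly larger than 4096.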

pohanchi commented 5 years ago

Wow!! Great


kamalkraj commented 5 years ago

@hairzooc https://github.com/kamalkraj/ALBERT-TF2.0/blob/master/pretraining.md

akashicMarga commented 4 years ago

@hairzooc hi, were you able to train your model? How long did it take, and how did it perform?

kamalkraj / ALBERT-TF2.0

Pretraining from scratch #7