richarddwang / electra_pytorch

Pretrain and finetune ELECTRA with fastai and huggingface. (Results of the paper replicated!)

How do I continue language model training? #25

Closed. PhilipMay closed this issue 3 years ago.

PhilipMay commented 3 years ago

Hi, I have a pretrained ELECTRA generator and discriminator stored on disk, both trained on a large corpus. Now I want to continue training them on a domain-specific corpus.

To do that, I am loading them from disk by adding .from_pretrained() calls here:

https://github.com/richarddwang/electra_pytorch/blob/ab29d03e69c6fb37df238e653c8d1a81240e3dd6/pretrain.py#L364-L365
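For context, a minimal sketch of what that swap could look like (the class names are the huggingface models that pretrain.py builds from configs; the checkpoint paths are placeholders for your own saved models):

```python
from transformers import ElectraForMaskedLM, ElectraForPreTraining

# Placeholder paths -- point these at your own saved checkpoints.
generator = ElectraForMaskedLM.from_pretrained("path/to/my_generator")
discriminator = ElectraForPreTraining.from_pretrained("path/to/my_discriminator")
```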

My question is: why exactly do you do this:

https://github.com/richarddwang/electra_pytorch/blob/ab29d03e69c6fb37df238e653c8d1a81240e3dd6/pretrain.py#L366-L367

and do I still need that in my case, or does it "destroy" my pretrained generator and discriminator?

Many thanks, Philip

richarddwang commented 3 years ago

Nice to see you have done something!

> Why exactly do you do this

I am tying parameters here.

> do I still need that in my case

I recommend keeping it.

> does it "destroy" my pretrained generator and discriminator?

No.

In more detail: [screenshot attached to the original comment]
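
Roughly, the tying amounts to the following (a sketch using the attribute names of huggingface's ElectraForMaskedLM / ElectraForPreTraining; not necessarily the exact lines in pretrain.py):

```python
# The generator reuses the discriminator's embedding module, and the
# generator's LM head output weights are tied to the shared word
# embeddings (as in the ELECTRA paper).
generator.electra.embeddings = discriminator.electra.embeddings
generator.generator_lm_head.weight = (
    generator.electra.embeddings.word_embeddings.weight
)
```

If your two checkpoints were pretrained with this sharing already in place, the embedding weights are identical anyway, so re-running the assignments just re-establishes the sharing rather than overwriting anything you pretrained.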

Feel free to tag me if you still have questions.

PhilipMay commented 3 years ago

Ah, I see. Tying means you do something like in a Siamese network, right?

richarddwang commented 3 years ago

Yeah, both share parameters.
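
As a quick illustration (assuming the tying assignments sketched above), the shared parameters are literally the same tensor object, so an update through one module is an update through the other:

```python
# After tying, both modules reference the same weight tensor.
assert (
    generator.electra.embeddings.word_embeddings.weight
    is discriminator.electra.embeddings.word_embeddings.weight
)
```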