google-research / electra

ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators

Using own data to continue pre-training from the released ELECTRA checkpoints #97

Open · ghost opened this issue 3 years ago

ghost commented 3 years ago

Hello,

I would like to continue pre-training on my own corpus, starting from the released ELECTRA model weights. The README says to download the pretrained weights into `$DATA_DIR/electra_small` when using the small model.

Is this a typo?

I think I should download the weights into `$DATA_DIR/models/electra_small` instead, so that `run_pretraining.py` will look in the `models/` folder and continue training from those weights when I run it.
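For illustration, here is the layout I would expect and the command I plan to run (a sketch only; it assumes the released download is a zip archive named `electra_small.zip` that extracts directly into the target folder, which I have not verified):

```sh
# Assumed layout: run_pretraining.py reads/writes checkpoints under
# $DATA_DIR/models/<model-name>, so the released weights would go there.
mkdir -p $DATA_DIR/models/electra_small
unzip electra_small.zip -d $DATA_DIR/models/electra_small

# Continue pre-training; --model-name must match the folder name above.
python3 run_pretraining.py \
  --data-dir $DATA_DIR \
  --model-name electra_small
```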

If I follow the README and download the weights into `$DATA_DIR/electra_small`, I would actually be pre-training a small model from scratch on my own data, just with the small model architecture. Is my understanding correct?

Thank you in advance for the advice.

hyusterr commented 3 years ago

Same question here. I tried to continue pre-training from the original ELECTRA-Small model weights, but I get `ERROR:tensorflow:Error recorded from training_loop: Restoring from checkpoint failed. This is most likely due to a Variable name or other graph key`
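To debug the name mismatch, I list the variable names the released checkpoint actually contains and compare them with the names in the error message (a sketch; it assumes the checkpoint files sit under `$DATA_DIR/models/electra_small`):

```sh
# Print every variable name and shape stored in the checkpoint.
# tf.train.list_variables accepts a checkpoint directory (resolved via
# its 'checkpoint' index file) or a model.ckpt-* prefix directly.
python3 -c "
import tensorflow as tf
for name, shape in tf.train.list_variables('$DATA_DIR/models/electra_small'):
    print(name, shape)
"
```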

xieexiaotuzi commented 2 years ago

I have the same problem

Joseph-Vineland commented 2 years ago

I am also struggling to pretrain starting from a pre-built model. The instructions in the README are not working.

JiazhaoLi commented 1 year ago

> I am also struggling to pretrain starting from a pre-built model. The instructions in the README are not working.

Agree. I think the instructions in the README only show how to train for more steps from scratch. Any ideas on continuing training from the released checkpoints?
Thank you for any hints and ideas.
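In case it helps, this is what I am planning to try. It is an unverified sketch: it assumes the estimator resumes from whatever checkpoint it finds in `$DATA_DIR/models/<model-name>`, and that `num_train_steps` can be raised via `--hparams`, which per the README takes a JSON dict:

```sh
# Put the released checkpoint where run_pretraining.py keeps its own
# checkpoints (see the layout sketched earlier in this thread), then
# raise num_train_steps above the step count already recorded in the
# checkpoint so the training loop has steps left to run.
python3 run_pretraining.py \
  --data-dir $DATA_DIR \
  --model-name electra_small \
  --hparams '{"num_train_steps": 2000000}'
```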