TahaBinhuraib closed this issue 1 year ago
Looking at the outputs, I can't see an outputs/bert-finetuning/checkpoints directory, only an outputs/bert/checkpoints directory.
Hi, it's hard to say what is going wrong here without more details. What commands did you use for pretraining/finetuning? outputs/bert and outputs/bert-finetuning are both folders that are named based on your commands.
@JonasGeiping I'm using this command for pretrain: python pretrain.py name=bert data=bookcorpus-wikipedia arch=bert-original train=bert-original
Then I'm using this command to finetune/evaluate the model:
python eval.py dryrun=True eval=GLUE name=bert-finetuning
Which I'm assuming should take the latest checkpoint after pretraining?
I was able to evaluate after setting eval.checkpoint='directory-of-pretrained-model'
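For reference, an evaluation call along these lines might look like the following; the checkpoint path here is only a placeholder for wherever your pretrained model landed, and dryrun is left at its default so real GLUE numbers are produced:

```shell
# Sketch only: the checkpoint path is a placeholder, substitute your own.
python eval.py eval=GLUE name=bert-finetuning \
    eval.checkpoint=outputs/bert/checkpoints/<your-latest-checkpoint>
```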
But the scores are pretty low.
Also, when training with this command: python pretrain.py name=bert data=bookcorpus-wikipedia arch=bert-original train=bert-original
What loss value should I expect? I'm not able to go below 6.5.
Thank you very much for your help.
Ok, this is helpful. Here are a few things to consider:
1) The evaluation checks for a pretrained model with a matching name, and then takes the latest checkpoint by default (but this can be changed in the eval settings via eval.checkpoint=...). So, eval.checkpoint does not control which model directory is found, only which checkpoint is selected from it.
2) If you are evaluating with dryrun=True, then the evaluation numbers will not be sensible; this is just a test setting.
3) You are pretraining the original bert model with the original bert training recipe. This won't get very far within the default budget setting, and might not break below 6.5 loss.
Sorry if the documentation is a bit confusing here. The arguments to check a huggingface checkpoint (on which you are basing your test) are different from the layout used to evaluate local models.
Thank you very much 🙏