JonasGeiping / cramming

Cramming the training of a (BERT-type) language model into limited compute.
MIT License
1.29k stars 100 forks

Can't evaluate #16

Closed TahaBinhuraib closed 1 year ago

TahaBinhuraib commented 1 year ago
    tokenizer, cfg_arch, model_file = cramming.utils.find_pretrained_checkpoint(cfg)
  File "/home/tahabinhuraib/cramming/cramming/utils.py", line 177, in find_pretrained_checkpoint
    all_checkpoints = [f for f in os.listdir(local_checkpoint_folder)]
FileNotFoundError: [Errno 2] No such file or directory: '/home/tahabinhuraib/cramming/outputs/bert-finetuning/checkpoints'
TahaBinhuraib commented 1 year ago

Looking at the outputs, I can't see an outputs/bert-finetuning/checkpoints directory. Only an outputs/bert/checkpoints directory

JonasGeiping commented 1 year ago

Hi, it's hard to say what is going wrong here without more details. What are the commands you used for pretraining/finetuning? output/bert and outputs/bert-finetuning are both folders that are named based on your commands.
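The naming scheme described here (output folders derived from the `name=` argument) can be sketched roughly as follows; the exact path layout is an assumption based on the traceback above, not the repo's actual implementation:

```python
import os

def checkpoint_folder(base_dir, run_name):
    """Hypothetical sketch: map a run name to its checkpoint folder.

    e.g. name=bert            -> <base_dir>/outputs/bert/checkpoints
         name=bert-finetuning -> <base_dir>/outputs/bert-finetuning/checkpoints
    """
    return os.path.join(base_dir, "outputs", run_name, "checkpoints")
```

So a pretraining run launched with `name=bert` writes under `outputs/bert/`, and an eval launched with `name=bert-finetuning` looks under `outputs/bert-finetuning/` instead, which explains the FileNotFoundError above.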

TahaBinhuraib commented 1 year ago

@JonasGeiping I'm using this command for pretraining: python pretrain.py name=bert data=bookcorpus-wikipedia arch=bert-original train=bert-original

Then I'm using this command to finetune/evaluate the model: python eval.py dryrun=True eval=GLUE name=bert-finetuning, which I'm assuming should take the latest checkpoint after pretraining? I was able to evaluate after setting eval.checkpoint='directory-of-pretrained-model', but the scores are pretty low.

Also, when training with python pretrain.py name=bert data=bookcorpus-wikipedia arch=bert-original train=bert-original, what loss value should I expect? I'm not able to go below 6.5. Thank you very much for your help.

JonasGeiping commented 1 year ago

Ok, this is helpful. Here are a few things to consider:

1. The evaluation checks for a pretrained model with a matching name, and then takes the latest checkpoint by default (but this can be changed in the eval settings via eval.checkpoint=...). So the name selects the pretraining run, and eval.checkpoint selects which of its checkpoints to load.
2. If you are evaluating with dryrun=True, the evaluation numbers will not be sensible; that is just a test setting.
3. You are pretraining the original BERT model with the original BERT training recipe. This won't get very far within the default budget setting, and might not break below a loss of 6.5.
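The "takes the latest checkpoint by default" behavior from point 1 can be sketched like this; the folder layout and sorting key are assumptions for illustration, not the repo's actual code:

```python
import os

def find_latest_checkpoint(local_checkpoint_folder):
    """Hypothetical sketch: pick the newest checkpoint in a run's folder.

    Raises FileNotFoundError if the folder for the given run name does not
    exist, which is the error seen in the traceback above when no
    pretraining run with a matching name has been completed.
    """
    if not os.path.isdir(local_checkpoint_folder):
        raise FileNotFoundError(
            f"No checkpoints at {local_checkpoint_folder}; "
            "did a pretraining run with a matching name finish?"
        )
    # Assumes checkpoint folder names sort chronologically.
    all_checkpoints = sorted(os.listdir(local_checkpoint_folder))
    return os.path.join(local_checkpoint_folder, all_checkpoints[-1])
```

Under this sketch, evaluating with name=bert-finetuning searches outputs/bert-finetuning/checkpoints, even though pretraining with name=bert wrote to outputs/bert/checkpoints.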

Sorry if the documentation is a bit confusing here. The arguments used to load a huggingface checkpoint (on which you are basing your test) are different from the layout used to evaluate local models.

TahaBinhuraib commented 1 year ago

Thank you very much 🙏