jerryji1993 / DNABERT

DNABERT: pre-trained Bidirectional Encoder Representations from Transformers model for DNA-language in genome
https://doi.org/10.1093/bioinformatics/btab083
Apache License 2.0

Error when loading pretrained model for fine-tuning from a checkpoint of the pretrained model #65


danarte commented 2 years ago

Hi, very simple issue. The following error:

`ValueError: loaded state dict contains a parameter group that doesn't match the size of optimizer's group`

is raised when I try to load a pre-trained model for fine-tuning from a checkpoint folder located inside the pre-trained model's output folder. When the model is loaded from the "root" folder of the pre-trained model (the one that contains the checkpoint subfolders), fine-tuning runs fine. The error is thrown before training starts.

To reproduce, simply follow the steps in the example in README.md, including the pre-training (just set a lower number of epochs), and then for the fine-tuning at step 3.3 set the model path to one of the checkpoint folders. For example, if the pre-training output folder was set with `export OUTPUT_PATH=output$KMER`, then for fine-tuning set `export MODEL_PATH=output$KMER/checkpoint-1800/`, as in the commands below.
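For reference, here are the relevant export commands side by side (I'm assuming `KMER=6` as in the README 6-mer example; `checkpoint-1800` is just one of the checkpoints from my shortened pre-training run, any checkpoint subfolder reproduces the error):

```bash
# Assumed k-mer size, as in the README example
export KMER=6

# Pre-training output folder
export OUTPUT_PATH=output$KMER

# Fine-tuning (README step 3.3) works when MODEL_PATH is the root output folder:
export MODEL_PATH=output$KMER

# ...but raises the ValueError when MODEL_PATH is a checkpoint subfolder:
export MODEL_PATH=output$KMER/checkpoint-1800/
```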

dominiclopez391 commented 10 months ago

Hello,

I'm having this same issue. I think there's a problem fine-tuning DNABERT from a checkpoint rather than from a completed training run. Have you found a solution to this? What do you mean by loading the model from the "root" folder of the pre-trained model? Are you referring to the provided sample pre-trained models?

Update:

Delete optimizer.pt, scheduler.pt, and training_args.bin from the checkpoint folder to fine-tune from a checkpoint; see the commands below.
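A minimal sketch of the workaround, assuming the paths from the reproduction steps above (the optimizer/scheduler state is presumably only needed when resuming the original pre-training run, not when starting a new fine-tuning run):

```bash
# Checkpoint produced during pre-training (path from the example above)
export MODEL_PATH=output$KMER/checkpoint-1800/

# Remove the training-state files so that only the model weights
# (pytorch_model.bin) and config.json are picked up for fine-tuning
rm ${MODEL_PATH}optimizer.pt ${MODEL_PATH}scheduler.pt ${MODEL_PATH}training_args.bin

# Then run the README step 3.3 fine-tuning command with this MODEL_PATH
```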