google-research / bleurt

BLEURT is a metric for Natural Language Generation based on transfer learning.
https://arxiv.org/abs/2004.04696
Apache License 2.0

Error in finetuning BLEURT #45

SaiKeshav closed this issue 2 years ago

SaiKeshav commented 2 years ago

Thank you for the great work and for open-sourcing it!

I am trying to follow the instructions in https://github.com/google-research/bleurt/blob/master/checkpoints.md#from-an-existing-bleurt-checkpoint to fine-tune the BLEURT-20 model on a customized set of ratings.

However, when I run the suggested command,

python -m bleurt.finetune \
  -train_set=../data/ratings_train.jsonl \
  -dev_set=../data/ratings_dev.jsonl \
  -num_train_steps=500 \
  -model_dir=../models/bleurt-20-fine1 \
  -init_bleurt_checkpoint=../models/BLEURT-20/

I get the following error:

ValueError: Shape of variable bert/embeddings/LayerNorm/beta:0 ((1152,)) doesn't match with shape of tensor bert/embeddings/LayerNorm/beta ([256]) from checkpoint reader.

I have checked this with both TensorFlow 2.7 and 1.15.

Any help related to this would be appreciated!
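In case it helps with debugging: one way to confirm where the mismatch comes from is to list the variable shapes stored in the checkpoint and compare them with what the model graph builds. Below is a minimal sketch using `tf.train.list_variables`; the toy checkpoint and its path are just stand-ins for a real BLEURT checkpoint directory such as ../models/BLEURT-20/.

```python
import tensorflow as tf

# Toy checkpoint standing in for a real BLEURT checkpoint directory;
# here we save a single 256-wide variable.
beta = tf.Variable(tf.zeros([256]), name="beta")
prefix = tf.train.Checkpoint(beta=beta).save("/tmp/toy_bleurt_ckpt/ckpt")

# list_variables yields (name, shape) pairs straight from the checkpoint
# reader, so a mismatch like (1152,) vs [256] is visible before training.
for name, shape in tf.train.list_variables(prefix):
    print(name, shape)
```

Running the equivalent against the real checkpoint would show whether the shapes on disk match what the fine-tuning graph expects.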

SaiKeshav commented 2 years ago

The code works perfectly well with the previous checkpoint, bleurt-base-128, but fails with the latest BLEURT-20 checkpoint. Could the error be the result of a compatibility issue between the code and the newly released model?

tsellam commented 2 years ago

Hi, thanks a lot for your feedback! Unfortunately this is correct: the fine-tuning library does not work with the latest checkpoints, because it relies on BERT-specific code (BLEURT-20 uses RemBERT, not BERT). We will update the documentation to reflect this.
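In the meantime, a fail-fast check before launching fine-tuning could read the shape stored in the checkpoint for the variable named in the error and bail out early on a mismatch. A minimal sketch, where the helper name, the toy checkpoint, and the variable names are purely illustrative:

```python
import tensorflow as tf

def stored_shape(ckpt_prefix, var_name):
    """Return the shape recorded in the checkpoint for var_name,
    or None if the variable is absent (illustrative helper)."""
    return dict(tf.train.list_variables(ckpt_prefix)).get(var_name)

# Toy checkpoint for demonstration; against a real checkpoint you would
# query e.g. "bert/embeddings/LayerNorm/beta" and compare it with the
# width the BERT-specific fine-tuning code expects.
v = tf.Variable(tf.zeros([1152]), name="gamma")
prefix = tf.train.Checkpoint(gamma=v).save("/tmp/toy_rembert_ckpt/ckpt")

expected = [256]  # the shape the BERT graph would build
actual = stored_shape(prefix, "gamma/.ATTRIBUTES/VARIABLE_VALUE")
if actual != expected:
    print(f"Incompatible checkpoint: stored {actual}, expected {expected}")
```

This only detects the incompatibility; actually fine-tuning BLEURT-20 would still require RemBERT-aware model-building code.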

powerpuffpomelo commented 2 years ago

Thanks for the great work! I'm trying to fine-tune BLEURT-20 and I'm running into the same problem. Has this issue been solved, or is there something I can modify to fix it? Any help would be appreciated~