Closed juand-r closed 3 years ago
Hey @juand-r,
Thanks for the issue! I think this problem should be solved by now. We have done some major refactoring for MBart and removed the _keys_to_ignore_on_save for MBart. Can you check whether the error persists on current master? We will probably do a release tomorrow, so the fix should be included in the next pip version :-)
Thanks, @patrickvonplaten !
I just checked: the error is gone when using version 4.2.1.
Hey @juand-r ,
I am also trying to fine-tune mBART on a non-English corpus. Is there a sample script that I can follow for this task?
Hi @ozcangundes,
This could be helpful: https://github.com/GEM-benchmark/GEM-baseline-models/blob/main/examples/mbart_large_mlsum_ru.ipynb
Environment info
transformers version: 4.1.1
Who can help
@patrickvonplaten
Information
I am fine-tuning mBART-large on MLSUM (Spanish, and also Russian). However, I noticed two things:
BartLearnedPositionalEmbedding, for both encoder and decoder).
I noticed that the mBART config includes:
and likewise for keys_to_ignore_on_load_missing. I suppose this was done in response to issue #7296. This would be fine if the mBART position embeddings were static, but they seem to be learned. The mBART configuration shows static_position_embeddings = False.
I can load and save the mBART model correctly if I set the following before fine-tuning:
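The snippet that followed here appears to have been lost. Judging from the reproduction steps later in this issue, it is presumably the two assignments wrapped in the sketch below. The helper name clear_ignore_lists is hypothetical; only the two attribute names come from this issue.

```python
def clear_ignore_lists(model):
    """Workaround sketch: stop the model from skipping keys on save or load.

    Assumes `model` exposes the two private attributes named in this issue
    (true for transformers 4.1.1 according to the report).
    """
    model._keys_to_ignore_on_load_missing = None
    model._keys_to_ignore_on_save = None
    return model


# Intended use (not run here, since it downloads the full checkpoint):
# mbart_model = clear_ignore_lists(mbart_model)  # call before fine-tuning
```

Because these are private attributes, this should be treated as a temporary workaround for the affected version rather than a supported setting.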
The problem arises when using:
The task I am working on is:
Abstractive summarization.
To reproduce
Steps to reproduce the behavior:
mbart_model = AutoModelForSeq2SeqLM.from_pretrained("facebook/mbart-large-cc25")
load_best_model_at_end=True.
Setting mbart_model._keys_to_ignore_on_load_missing = None and mbart_model._keys_to_ignore_on_save = None fixes the problem (the full model is saved, and the checkpoints are correct).
Expected behavior
The model's position embeddings and generated outputs should be exactly the same after saving it and loading from disk.
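One way to check this expectation is to compare the parameters before saving with those after reloading. The sketch below is framework-agnostic and runs on plain Python values. The function name diff_state_dicts and the parameter names in the demo are illustrative assumptions, not part of transformers.

```python
def diff_state_dicts(before, after, equal=lambda a, b: a == b):
    """Compare two name -> value mappings.

    Returns (missing, extra, changed): names only in `before`, names only
    in `after`, and shared names whose values differ according to `equal`.
    """
    missing = sorted(k for k in before if k not in after)
    extra = sorted(k for k in after if k not in before)
    changed = sorted(
        k for k in before if k in after and not equal(before[k], after[k])
    )
    return missing, extra, changed


# Toy demo with plain lists standing in for tensors (names are illustrative).
before = {"encoder.embed_positions.weight": [0.1, 0.2], "lm_head.weight": [1.0]}
after = {"encoder.embed_positions.weight": [0.0, 0.0], "lm_head.weight": [1.0]}
print(diff_state_dicts(before, after))
```

With PyTorch models, one would presumably call diff_state_dicts(model.state_dict(), reloaded.state_dict(), equal=torch.equal). If the position embeddings are dropped on save and re-initialized on load, they should show up in the changed list.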