Thanks for the clear reproducer. Looking at the code, it seems that FSMT in general does not properly support the resize_token_embeddings API: it doesn't use the same config names for the vocab size (easily fixable), and the method also resizes both the encoder and decoder embeddings, whereas in this case it should probably only resize the encoder embedding.
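For context, here is a minimal sketch of the config-name mismatch, assuming FSMTConfig's src_vocab_size/tgt_vocab_size attributes on current transformers versions (FSMT splits the vocab size per side instead of exposing the single vocab_size that the generic resize machinery expects):

```python
from transformers import FSMTConfig, T5Config

# FSMT tracks a separate vocab size per side rather than a single
# `vocab_size`, which is what the generic resize_token_embeddings
# bookkeeping reads and writes.
fsmt_cfg = FSMTConfig()
print(fsmt_cfg.src_vocab_size, fsmt_cfg.tgt_vocab_size)
print(hasattr(fsmt_cfg, "vocab_size"))  # False on current versions

# Compare with a single-dictionary model:
t5_cfg = T5Config()
print(t5_cfg.vocab_size)  # one shared vocab size, as resize expects
```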
In any case, I don't know the model as well as @stas00 so let's wait for him to chime in and advise on the best fix!
@alex96k, would you by chance like to tackle that?
The main difficulty with FSMT is that many of its models use two distinct dictionaries (one for the source language, one for the target), so some generic functionality either isn't possible out of the box or requires very careful thinking in order not to break other things. I think it's the only model of this kind among the HF models.
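To illustrate the two-dictionary setup (a sketch assuming the FSMTTokenizer API on recent transformers versions, using the facebook/wmt19-ru-en checkpoint as an example):

```python
from transformers import FSMTTokenizer

# FSMT checkpoints ship two dictionaries: one for the source language
# (fed to the encoder) and one for the target language (decoder side).
tokenizer = FSMTTokenizer.from_pretrained("facebook/wmt19-ru-en")
print(tokenizer.src_vocab_size)  # size of the source dictionary
print(tokenizer.tgt_vocab_size)  # size of the target dictionary
# The two sizes generally differ, so one shared embedding matrix (and a
# single `vocab_size` field) cannot describe both sides.
```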
There is an outstanding PR that tried to bring FSMT in sync with the rest of the models: https://github.com/huggingface/transformers/pull/11218. It proved to cause a speed regression and was never merged, but perhaps it had already resolved this?
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.
System Info
Environment info:
Who can help?
@stas00
Expected behavior / Issue
I am having issues reloading a saved FSMT model after its token embeddings have been resized. This error doesn't appear with other models such as T5 or MT5. A similar error occurred previously for other models as well but has since been fixed (see #9055 or #8706); however, it doesn't seem to be fixed for the FSMT model. Currently I receive the following error:
Any idea how to solve this? Thanks a lot and all the best!
Reproduction
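A minimal sketch of the failing round-trip (the exact checkpoint and error output from the original report are not shown here; facebook/wmt19-en-de is used for illustration):

```python
from transformers import FSMTForConditionalGeneration, FSMTTokenizer

model_name = "facebook/wmt19-en-de"  # any FSMT checkpoint should behave the same
tokenizer = FSMTTokenizer.from_pretrained(model_name)
model = FSMTForConditionalGeneration.from_pretrained(model_name)

# Grow the embeddings, e.g. after adding new tokens to the tokenizer.
model.resize_token_embeddings(len(tokenizer) + 8)

# Saving succeeds, but the src/tgt vocab sizes recorded in the config are
# not kept consistent with the resized weights...
model.save_pretrained("./fsmt-resized")

# ...so reloading fails with a size mismatch on the embedding weights.
# The same resize/save/reload cycle works fine for T5 or MT5.
reloaded = FSMTForConditionalGeneration.from_pretrained("./fsmt-resized")
```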