huggingface / transformers

🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
https://huggingface.co/transformers
Apache License 2.0

Loading mT5 checkpoint will load from UMT5 class #24662

Closed MattYoon closed 1 year ago

MattYoon commented 1 year ago

System Info

Who can help?

@ArthurZucker

Information

Tasks

Reproduction

from transformers import AutoModelForSeq2SeqLM

model = AutoModelForSeq2SeqLM.from_pretrained('google/mt5-small')
print(type(model))
# Prints the UMT5 class instead of the expected MT5 one:
# <class 'transformers.models.umt5.modeling_umt5.UMT5ForConditionalGeneration'>

Expected behavior

@ArthurZucker Thank you for the recent integration of umT5. However, on the latest branch of transformers, loading a regular mT5 checkpoint instantiates the UMT5 class (UMT5ForConditionalGeneration instead of MT5ForConditionalGeneration). This does not happen with 4.30.2.

ydshieh commented 1 year ago

cc @ArthurZucker

ArthurZucker commented 1 year ago

Hey! Indeed, one of our CI tests is failing because of this. Looking into it now!

ArthurZucker commented 1 year ago

Yep, the issue is that in CONFIG_MAPPING_NAMES, umt5 maps to mt5's configuration class (since the two models share the same configuration file). This is breaking the overall mapping. Either a custom config has to be created, or we need to find a way to update the mapping properly! 😉
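
For context, a rough sketch of the conflict (the file and the mapping name are real; the excerpt is reconstructed rather than copied from the source):

from collections import OrderedDict

# Simplified excerpt of transformers/models/auto/configuration_auto.py:
CONFIG_MAPPING_NAMES = OrderedDict(
    [
        # ...
        ("mt5", "MT5Config"),
        # ...
        ("umt5", "MT5Config"),  # shares mt5's config class, so the reverse
        # config -> model-type lookup used to build MODEL_MAPPING is ambiguous
    ]
)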

ydshieh commented 1 year ago

Hmm. The values in CONFIG_MAPPING(_NAMES) are used as keys when creating MODEL_MAPPING. We should remove the umt5 entries from CONFIG_MAPPING_NAMES and the other mappings.

Those models would then have to be loaded in a non-auto way.
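
Non-auto loading would look like this (a sketch; UMT5ForConditionalGeneration comes with the UMT5 integration, and the checkpoint name is an assumption):

from transformers import UMT5ForConditionalGeneration

# Name the class explicitly instead of resolving it through MODEL_MAPPING.
model = UMT5ForConditionalGeneration.from_pretrained("google/umt5-small")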

ArthurZucker commented 1 year ago

We can't just remove every mapping; some of our checks and docs require them. Let's just add a config for UMT5.
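
As a minimal sketch of that direction (illustrative only; a dedicated UMT5Config would likely define its own full set of fields rather than just subclassing):

from transformers import MT5Config

class UMT5Config(MT5Config):
    # A distinct model_type lets CONFIG_MAPPING_NAMES map "umt5" to its own
    # config class, making the config -> model-type lookup unambiguous again.
    model_type = "umt5"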