huggingface / transformers

🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
https://huggingface.co/transformers
Apache License 2.0

Loading mT5 checkpoint will load from UMT5 class #24662

Closed MattYoon closed 1 year ago

MattYoon commented 1 year ago

System Info

Who can help?

@ArthurZucker

Information

Tasks

Reproduction

from transformers import AutoModelForSeq2SeqLM

model = AutoModelForSeq2SeqLM.from_pretrained('google/mt5-small')
print(type(model))
# Prints the UMT5 class instead of the expected MT5 one:
# <class 'transformers.models.umt5.modeling_umt5.UMT5ForConditionalGeneration'>

Expected behavior

@ArthurZucker Thank you for the recent integration of umT5. However, on the latest branch of transformers, loading a regular mT5 checkpoint instantiates the UMT5 class (UMT5ForConditionalGeneration instead of MT5ForConditionalGeneration). This does not happen with 4.30.2.

ydshieh commented 1 year ago

cc @ArthurZucker

ArthurZucker commented 1 year ago

Hey! Indeed, one of our CI tests is failing because of this. Looking into it now!

ArthurZucker commented 1 year ago

Yep, the issue is that in CONFIG_MAPPING_NAMES, umt5 maps to mt5's configuration class (since the two models share the same configuration file). This is breaking the overall mapping. Either a custom config has to be created, or we need to find a way to update the mapping properly! 😉
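
For context, a rough sketch of the conflict (the file and the mapping name are real; the excerpt is reconstructed rather than copied from the source):

from collections import OrderedDict

# Simplified excerpt of transformers/models/auto/configuration_auto.py:
CONFIG_MAPPING_NAMES = OrderedDict(
    [
        # ...
        ("mt5", "MT5Config"),
        # ...
        ("umt5", "MT5Config"),  # shares mt5's config class, so the reverse
        # config -> model-type lookup used to build MODEL_MAPPING is ambiguous
    ]
)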

ydshieh commented 1 year ago

Hmm. The values in CONFIG_MAPPING(_NAMES) are used as keys when creating MODEL_MAPPING. We should remove the umt5 entries from CONFIG_MAPPING_NAMES and the other mappings.

Those models would then have to be loaded in a non-auto way.
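
Non-auto loading would look like this (a sketch; UMT5ForConditionalGeneration comes with the UMT5 integration, and the checkpoint name is an assumption):

from transformers import UMT5ForConditionalGeneration

# Name the class explicitly instead of resolving it through MODEL_MAPPING.
model = UMT5ForConditionalGeneration.from_pretrained("google/umt5-small")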

ArthurZucker commented 1 year ago

We can't just remove every mapping; some of our checks and docs require them. Let's just add a config for UMT5.
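
As a minimal sketch of that direction (illustrative only; a dedicated UMT5Config would likely define its own full set of fields rather than just subclassing):

from transformers import MT5Config

class UMT5Config(MT5Config):
    # A distinct model_type lets CONFIG_MAPPING_NAMES map "umt5" to its own
    # config class, making the config -> model-type lookup unambiguous again.
    model_type = "umt5"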