tanreinama / GPTSAN

General-purpose Switch Transformer based Japanese language model
MIT License

Cannot use the pretrained model on Hugging Face #7

Closed: omihub777 closed this issue 1 year ago

omihub777 commented 1 year ago

Thank you for your great work! I'm trying to use your model Tanrei/GPTSAN-japanese from Hugging Face (link) on Google Colaboratory, but I run into the error below. I'd appreciate it if you could elaborate on how to solve this issue. Thank you in advance!

tanreinama commented 1 year ago

The pull request to Hugging Face is not yet merged, so to use the model you need to download and use the source code from this repository. In addition, the free tier of Google Colab cannot run it due to insufficient main memory. Prepare a high-memory environment.

tanreinama commented 1 year ago

If there are no further questions, I will close this issue.

omihub777 commented 1 year ago

oh, my bad. it's not merged yet. thanks!

younesbelkada commented 1 year ago

As a side note, you can probably decrease the memory requirement of the model by doing:

```python
import torch
from transformers import AutoModel

# ckpt = "Tanrei/GPTSAN-japanese"
model = AutoModel.from_pretrained(ckpt, torch_dtype=torch.float16)
```

for loading in half precision. We can also work on making the model 8-bit compatible so that it can be loaded in 8-bit.
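To see why lower precision helps with the memory problem discussed above, here is a back-of-the-envelope sketch of weight memory at different precisions. The parameter count is a placeholder for illustration, not the actual GPTSAN model size, and this counts weights only (activations and any optimizer state add more):

```python
# Rough estimate of model weight memory at different precisions.
# NOTE: the parameter count below is hypothetical, not GPTSAN's real size.

def weight_memory_gib(num_params: int, bytes_per_param: int) -> float:
    """Approximate memory for the weights alone, in GiB."""
    return num_params * bytes_per_param / 2**30

params = 3_000_000_000  # hypothetical 3B-parameter model

fp32 = weight_memory_gib(params, 4)  # float32: 4 bytes per parameter
fp16 = weight_memory_gib(params, 2)  # float16: 2 bytes per parameter
int8 = weight_memory_gib(params, 1)  # int8:    1 byte per parameter

print(f"fp32: {fp32:.1f} GiB, fp16: {fp16:.1f} GiB, int8: {int8:.1f} GiB")
```

Half precision halves the footprint relative to float32, and 8-bit quantization quarters it, which is why `torch_dtype=torch.float16` (and, once supported, `load_in_8bit=True` via bitsandbytes) can make the difference on a memory-constrained instance.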