huggingface / transformers

🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
https://huggingface.co/transformers
Apache License 2.0

Unable to load weights from pytorch checkpoint file #4336

Closed manueltonneau closed 4 years ago

manueltonneau commented 4 years ago

🐛 Bug

Information

I uploaded two models this morning using the transformers-cli. The models can be found on my huggingface page. The folder I uploaded for both models contained a PyTorch model in bin format, a zip file containing the three TF model files, the config.json and the vocab.txt. The PT model was created from TF checkpoints using this code. I'm able to download the tokenizer using:

tokenizer = AutoTokenizer.from_pretrained("mananeau/clinicalcovid-bert-base-cased")

Yet, when trying to download the model using:

model = AutoModel.from_pretrained("mananeau/clinicalcovid-bert-base-cased")

I am getting the following error:


AttributeError                            Traceback (most recent call last)
~/anaconda3/lib/python3.7/site-packages/torch/serialization.py in _check_seekable(f)
    226     try:
--> 227         f.seek(f.tell())
    228         return True

AttributeError: 'NoneType' object has no attribute 'seek'

During handling of the above exception, another exception occurred:

AttributeError                            Traceback (most recent call last)
~/anaconda3/lib/python3.7/site-packages/transformers/modeling_utils.py in from_pretrained(cls, pretrained_model_name_or_path, *model_args, **kwargs)
    625         try:
--> 626             state_dict = torch.load(resolved_archive_file, map_location="cpu")
    627         except Exception:

~/anaconda3/lib/python3.7/site-packages/torch/serialization.py in load(f, map_location, pickle_module, **pickle_load_args)
    425         pickle_load_args['encoding'] = 'utf-8'
--> 426         return _load(f, map_location, pickle_module, **pickle_load_args)
    427     finally:

~/anaconda3/lib/python3.7/site-packages/torch/serialization.py in _load(f, map_location, pickle_module, **pickle_load_args)
    587
--> 588     _check_seekable(f)
    589     f_should_read_directly = _should_read_directly(f)

~/anaconda3/lib/python3.7/site-packages/torch/serialization.py in _check_seekable(f)
    229     except (io.UnsupportedOperation, AttributeError) as e:
--> 230         raise_err_msg(["seek", "tell"], e)
    231

~/anaconda3/lib/python3.7/site-packages/torch/serialization.py in raise_err_msg(patterns, e)
    222                     " try to load from it instead.")
--> 223             raise type(e)(msg)
    224         raise e

AttributeError: 'NoneType' object has no attribute 'seek'. You can only torch.load from a file that is seekable. Please pre-load the data into a buffer like io.BytesIO and try to load from it instead.

During handling of the above exception, another exception occurred:

OSError                                   Traceback (most recent call last)
<ipython-input> in <module>
----> 1 model = AutoModel.from_pretrained("mananeau/clinicalcovid-bert-base-cased")

~/anaconda3/lib/python3.7/site-packages/transformers/modeling_auto.py in from_pretrained(cls, pretrained_model_name_or_path, *model_args, **kwargs)
    425         for config_class, model_class in MODEL_MAPPING.items():
    426             if isinstance(config, config_class):
--> 427                 return model_class.from_pretrained(pretrained_model_name_or_path, *model_args, config=config, **kwargs)
    428         raise ValueError(
    429             "Unrecognized configuration class {} for this kind of AutoModel: {}.\n"

~/anaconda3/lib/python3.7/site-packages/transformers/modeling_utils.py in from_pretrained(cls, pretrained_model_name_or_path, *model_args, **kwargs)
    627         except Exception:
    628             raise OSError(
--> 629                 "Unable to load weights from pytorch checkpoint file. "
    630                 "If you tried to load a PyTorch model from a TF 2.0 checkpoint, please set from_tf=True. "
    631             )

OSError: Unable to load weights from pytorch checkpoint file. If you tried to load a PyTorch model from a TF 2.0 checkpoint, please set from_tf=True.

Environment info

- `transformers` version: 2.9.0
- Platform: Ubuntu 18.04
- Python version: 3.7.4
- PyTorch version (GPU?): 1.3.1
- Tensorflow version (GPU?): 1.14.0
- Using GPU in script?: No
- Using distributed or parallel set-up in script?: No
wicebing commented 4 years ago

I had this problem today, too. I created a new container as my new GPU environment, but I cannot load any pretrained model there due to this error. The same code runs fine in my old environment and downloads the pretrained weights without issue.

LysandreJik commented 4 years ago

Hi @mananeau, when I look on the website and click on "show all files" for your model, it only lists the configuration and vocabulary. Have you uploaded the model file?

manueltonneau commented 4 years ago

I believe I did. It can be found under this link. Also, when running transformers-cli s3 ls, I get the output shown in the attached screenshot.

elyesmanai commented 4 years ago

I have this problem too, and I do have all my files.

wicebing commented 4 years ago

For this problem, I switched to a pip install of transformers==2.8 from the repository that I had already downloaded in my old environment.

It works normally to download and load any pretrained weights.

I don't know why, but it works.

manueltonneau commented 4 years ago

I switched to a pip install of transformers==2.8 from the repository that I had already downloaded in my old environment.

I cannot confirm this on my end. Tried with transformers==2.8.0 and still getting the same error.

julien-c commented 4 years ago

@mananeau We could make this clearer and better validated, but the upload CLI is meant to be used only for models/tokenizers saved using the .save_pretrained() method.

In particular here, your model file should be named pytorch_model.bin
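
As a minimal sketch (the local folder names here are just placeholders), re-saving the converted model with .save_pretrained() produces correctly named files (pytorch_model.bin, config.json, ...) that can then be uploaded:

from transformers import BertModel, BertTokenizer

# Load the locally converted checkpoint (folder name is a placeholder)...
model = BertModel.from_pretrained("./clinicalcovid-bert-base-cased")
tokenizer = BertTokenizer.from_pretrained("./clinicalcovid-bert-base-cased")

# ...and re-save it: save_pretrained() writes pytorch_model.bin, config.json, etc.
model.save_pretrained("./upload_folder")
tokenizer.save_pretrained("./upload_folder")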

D-i-l-r-u-k-s-h-i commented 4 years ago

For this problem, I switched to a pip install of transformers==2.8 from the repository that I had already downloaded in my old environment.

It works normally to download and load any pretrained weights.

I don't know why, but it works.

This worked for me too

fathia-ghribi commented 3 years ago

Hi! When I try the explore_model.ipynb notebook from https://github.com/sebkim/lda2vec-pytorch, the error shown in the attached screenshot occurs. How can I resolve it? Can someone please help me?

LysandreJik commented 3 years ago

@fathia-ghribi this is unrelated to the issue here. Please open a new issue and fill out the issue template so that we may help you. On a second note, your error does not seem to be related to this library.

brent-lemieux commented 3 years ago

I had this problem when I trained the model with torch==1.6.0 and tried to load it with 1.3.1. The issue was fixed by upgrading to 1.6.0 in the environment where I'm loading the model.
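
For context: PyTorch 1.6 switched to a zip-based serialization format that older versions such as 1.3.x cannot read, which produces exactly this error. If upgrading the loading environment is not an option, one sketch of a workaround (run in the torch>=1.6 environment; file names are placeholders) is to re-save the weights in the legacy format:

import torch

# Re-save the checkpoint in the legacy (non-zip) format so torch < 1.6 can read it
state_dict = torch.load("pytorch_model.bin", map_location="cpu")
torch.save(state_dict, "pytorch_model_legacy.bin", _use_new_zipfile_serialization=False)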

da-head0 commented 3 years ago

I had the same error with torch==1.8.1 and simpletransformers==0.61.4. Downgrading torch or simpletransformers didn't work for me, because the issue was caused by the file not being downloaded properly.

I solved it by git cloning my model locally, or by uploading the model files to Google Drive and pointing to that directory:

from simpletransformers.t5 import T5Model  # model_args is defined elsewhere (e.g. a T5Args instance)

model = T5Model("mt5", "/content/drive/MyDrive/dataset/outputs",
                args=model_args, use_cuda=False, from_tf=False, force_download=True)
mrelmi commented 3 years ago

For this problem, I switched to a pip install of transformers==2.8 from the repository that I had already downloaded in my old environment.

It works normally to download and load any pretrained weights.

I don't know why, but it works.

thank youuuuuu

marcosbodio commented 2 years ago

I had the same problem with:

sentence-transformers              2.2.0
transformers                       4.17.0
torch                              1.8.1
torchvision                        0.4.2

Python 3.7.6

I solved it by upgrading torch with pip install --upgrade torch torchvision. It now works with:

sentence-transformers              2.2.0
transformers                       4.17.0
torch                              1.10.2
torchvision                        0.11.3

Python 3.7.6
BitnaKeum commented 2 years ago

In my case, there was a problem while moving the files, so the pytorch_model.bin file existed but its size was 0 bytes. After replacing it with the correct file, the error went away.
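
A quick sanity check for this case (the path is just an example) is to look at the checkpoint size on disk before loading:

import os

path = "pytorch_model.bin"  # example path to the copied/downloaded checkpoint
print(os.path.getsize(path), "bytes")
# A BERT-base checkpoint is on the order of 400 MB; a 0-byte (or tiny) file means
# the copy or download was incomplete and the file needs to be replaced.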

foton263 commented 2 years ago

Just delete the corrupted cached files and rerun your code; it will work.
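
Alternatively, as a sketch (the model id here is just an example), from_pretrained accepts force_download=True, which re-fetches the files and overwrites a corrupted cache entry:

from transformers import AutoModel

# force_download=True re-downloads the files instead of reusing the (possibly corrupted) cache
model = AutoModel.from_pretrained("bert-base-cased", force_download=True)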

ShineYull commented 1 year ago

Just delete the corrupted cached files and rerun your code; it will work.

Yes, it worked for me.

tothandor commented 1 year ago

I had to downgrade from torch 2.0.1 to 1.13.1.

zzd2001 commented 1 year ago

Does that work? I had the same problem.

JaosonMa commented 1 year ago

Yes, it works. I deleted all the .cache files, then re-downloaded, and the error was gone.