UKPLab / sentence-transformers

State-of-the-Art Text Embeddings
https://www.sbert.net
Apache License 2.0

SentenceTransformer unable to load weights from pytorch checkpoint file #830

Open JoyeBright opened 3 years ago

JoyeBright commented 3 years ago

Hi there, Shout-out to you and your team for creating Sentence-BERT. Good job!

I had no problem using Sentence-BERT on Google Colab.

But when I moved the code to my university's GPU servers, it stopped working and now throws the following OSError:

OSError: Unable to load weights from pytorch checkpoint file for '/home/X/.cache/torch/sentence_transformers/sbert.net_models_xlm-r-100langs-bert-base-nli-stsb-mean-tokens/0_Transformer' at '/home/X/.cache/torch/sentence_transformers/sbert.net_models_xlm-r-100langs-bert-base-nli-stsb-mean-tokens/0_Transformer/pytorch_model.bin'If you tried to load a PyTorch model from a TF 2.0 checkpoint, please set from_tf=True.

You can find logging info below:

INFO:sentence_transformers.SentenceTransformer:Load pretrained SentenceTransformer: xlm-r-100langs-bert-base-nli-stsb-mean-tokens
INFO:sentence_transformers.SentenceTransformer:Did not find folder xlm-r-100langs-bert-base-nli-stsb-mean-tokens
INFO:sentence_transformers.SentenceTransformer:Search model on server: http://sbert.net/models/xlm-r-100langs-bert-base-nli-stsb-mean-tokens.zip
INFO:sentence_transformers.SentenceTransformer:Load SentenceTransformer from folder: /home/X/.cache/torch/sentence_transformers/sbert.net_models_xlm-r-100langs-bert-base-nli-stsb-mean-tokens

As the logs indicate, the problem seems to lie in downloading the pre-trained model into this folder, or in loading it from there: /home/X/.cache/torch/sentence_transformers/sbert.net_models_xlm-r-100langs-bert-base-nli-stsb-mean-tokens/0_Transformer/pytorch_model.bin

I looked into it, but could not work out what is wrong with that file.
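
One way to narrow this down is to inspect the cached file directly. The sketch below is an illustration only and assumes the checkpoint was saved in PyTorch's zip-based container, which (to my knowledge) carries a small record named `version` next to the pickled data. The helper name `checkpoint_format_version` is hypothetical and not part of sentence-transformers or PyTorch; it uses nothing but the standard library, so it also runs where PyTorch itself fails to open the file.

```python
import zipfile
from pathlib import PurePosixPath
from typing import Optional


def checkpoint_format_version(path: str) -> Optional[int]:
    """Return the integer stored in the archive's `version` record.

    Returns None when the file is missing, is not a zip archive (for
    example a legacy-format checkpoint or a truncated download), or has
    no `version` record. Assumption: zip-based checkpoints keep the
    format version in a member whose base name is exactly "version".
    """
    if not zipfile.is_zipfile(path):
        return None
    with zipfile.ZipFile(path) as archive:
        for name in archive.namelist():
            if PurePosixPath(name).name == "version":
                return int(archive.read(name).decode("ascii").strip())
    return None
```

Calling `checkpoint_format_version(".../0_Transformer/pytorch_model.bin")` with the full path from the log above would then show whether the download is a readable archive and which format version it declares. A result of None would point towards a broken or non-zip file rather than a version mismatch.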

Any idea to address the issue?

JoyeBright commented 3 years ago

Here is the full error traceback:


 ---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
/home/X/.local/lib/python3.7/site-packages/transformers/modeling_utils.py in from_pretrained(cls, pretrained_model_name_or_path, *model_args, **kwargs)
   1061             try:
-> 1062                 state_dict = torch.load(resolved_archive_file, map_location="cpu")
   1063             except Exception:

/usr/local/lib/python3.7/dist-packages/torch/serialization.py in load(f, map_location, pickle_module, **pickle_load_args)
    526         if _is_zipfile(opened_file):
--> 527             with _open_zipfile_reader(f) as opened_zipfile:
    528                 return _load(opened_zipfile, map_location, pickle_module, **pickle_load_args)

/usr/local/lib/python3.7/dist-packages/torch/serialization.py in __init__(self, name_or_buffer)
    223     def __init__(self, name_or_buffer):
--> 224         super(_open_zipfile_reader, self).__init__(torch._C.PyTorchFileReader(name_or_buffer))
    225 

RuntimeError: version_ <= kMaxSupportedFileFormatVersion INTERNAL ASSERT FAILED at /pytorch/caffe2/serialize/inline_container.cc:132, please report a bug to PyTorch. Attempted to read a PyTorch file with version 3, but the maximum supported version for reading is 2. Your PyTorch installation may be too old. (init at /pytorch/caffe2/serialize/inline_container.cc:132)
frame #0: c10::Error::Error(c10::SourceLocation, std::string const&) + 0x33 (0x7fcf63dda193 in /usr/local/lib/python3.7/dist-packages/torch/lib/libc10.so)
frame #1: caffe2::serialize::PyTorchStreamReader::init() + 0x1f5b (0x7fcf147949eb in /usr/local/lib/python3.7/dist-packages/torch/lib/libtorch.so)
frame #2: caffe2::serialize::PyTorchStreamReader::PyTorchStreamReader(std::string const&) + 0x64 (0x7fcf14795c04 in /usr/local/lib/python3.7/dist-packages/torch/lib/libtorch.so)
frame #3: <unknown function> + 0x6c6536 (0x7fcf6ca75536 in /usr/local/lib/python3.7/dist-packages/torch/lib/libtorch_python.so)
frame #4: <unknown function> + 0x295a74 (0x7fcf6c644a74 in /usr/local/lib/python3.7/dist-packages/torch/lib/libtorch_python.so)
frame #5: _PyMethodDef_RawFastCallDict + 0x12b (0x5ccffb in /usr/bin/python3)
frame #6: /usr/bin/python3() [0x4c9167]
frame #7: PyObject_Call + 0x56 (0x5d0986 in /usr/bin/python3)
frame #8: /usr/bin/python3() [0x5856d8]
frame #9: _PyObject_FastCallKeywords + 0x129 (0x5ce0e9 in /usr/bin/python3)
frame #10: /usr/bin/python3() [0x53ed81]
frame #11: _PyEval_EvalFrameDefault + 0x49ba (0x54653a in /usr/bin/python3)
frame #12: _PyEval_EvalCodeWithName + 0x252 (0x53f732 in /usr/bin/python3)
frame #13: _PyFunction_FastCallDict + 0x34e (0x5ceb8e in /usr/bin/python3)
frame #14: /usr/bin/python3() [0x585543]
frame #15: _PyObject_FastCallKeywords + 0x129 (0x5ce0e9 in /usr/bin/python3)
frame #16: _PyEval_EvalFrameDefault + 0x4c3b (0x5467bb in /usr/bin/python3)
frame #17: _PyEval_EvalCodeWithName + 0x252 (0x53f732 in /usr/bin/python3)
frame #18: _PyFunction_FastCallKeywords + 0x482 (0x5cd982 in /usr/bin/python3)
frame #19: /usr/bin/python3() [0x53ebb0]
frame #20: _PyEval_EvalFrameDefault + 0x13a9 (0x542f29 in /usr/bin/python3)
frame #21: _PyEval_EvalCodeWithName + 0x252 (0x53f732 in /usr/bin/python3)
frame #22: _PyFunction_FastCallDict + 0x34e (0x5ceb8e in /usr/bin/python3)
frame #23: /usr/bin/python3() [0x4c9202]
frame #24: PyObject_Call + 0x56 (0x5d0986 in /usr/bin/python3)
frame #25: _PyEval_EvalFrameDefault + 0x18d2 (0x543452 in /usr/bin/python3)
frame #26: _PyEval_EvalCodeWithName + 0x252 (0x53f732 in /usr/bin/python3)
frame #27: _PyFunction_FastCallKeywords + 0x482 (0x5cd982 in /usr/bin/python3)
frame #28: /usr/bin/python3() [0x53ebb0]
frame #29: _PyEval_EvalFrameDefault + 0x13a9 (0x542f29 in /usr/bin/python3)
frame #30: _PyEval_EvalCodeWithName + 0x252 (0x53f732 in /usr/bin/python3)
frame #31: _PyFunction_FastCallDict + 0x34e (0x5ceb8e in /usr/bin/python3)
frame #32: /usr/bin/python3() [0x585543]
frame #33: /usr/bin/python3() [0x58d581]
frame #34: PyObject_Call + 0x56 (0x5d0986 in /usr/bin/python3)
frame #35: _PyEval_EvalFrameDefault + 0x18d2 (0x543452 in /usr/bin/python3)
frame #36: _PyFunction_FastCallKeywords + 0x18c (0x5cd68c in /usr/bin/python3)
frame #37: /usr/bin/python3() [0x53ebb0]
frame #38: _PyEval_EvalFrameDefault + 0x49ba (0x54653a in /usr/bin/python3)
frame #39: _PyEval_EvalCodeWithName + 0x252 (0x53f732 in /usr/bin/python3)
frame #40: _PyFunction_FastCallDict + 0x34e (0x5ceb8e in /usr/bin/python3)
frame #41: /usr/bin/python3() [0x585543]
frame #42: _PyObject_FastCallKeywords + 0x129 (0x5ce0e9 in /usr/bin/python3)
frame #43: /usr/bin/python3() [0x53ed81]
frame #44: _PyEval_EvalFrameDefault + 0x13a9 (0x542f29 in /usr/bin/python3)
frame #45: _PyEval_EvalCodeWithName + 0x252 (0x53f732 in /usr/bin/python3)
frame #46: /usr/bin/python3() [0x54d517]
frame #47: _PyMethodDef_RawFastCallKeywords + 0x1b3 (0x5cccc3 in /usr/bin/python3)
frame #48: _PyEval_EvalFrameDefault + 0x4863 (0x5463e3 in /usr/bin/python3)
frame #49: _PyEval_EvalCodeWithName + 0x252 (0x53f732 in /usr/bin/python3)
frame #50: _PyFunction_FastCallKeywords + 0x482 (0x5cd982 in /usr/bin/python3)
frame #51: _PyEval_EvalFrameDefault + 0x732 (0x5422b2 in /usr/bin/python3)
frame #52: _PyEval_EvalCodeWithName + 0x252 (0x53f732 in /usr/bin/python3)
frame #53: _PyFunction_FastCallKeywords + 0x482 (0x5cd982 in /usr/bin/python3)
frame #54: /usr/bin/python3() [0x53ebb0]
frame #55: _PyEval_EvalFrameDefault + 0x13a9 (0x542f29 in /usr/bin/python3)
frame #56: _PyEval_EvalCodeWithName + 0x252 (0x53f732 in /usr/bin/python3)
frame #57: _PyFunction_FastCallDict + 0x34e (0x5ceb8e in /usr/bin/python3)
frame #58: /usr/bin/python3() [0x4c9202]
frame #59: PyObject_Call + 0x56 (0x5d0986 in /usr/bin/python3)
frame #60: _PyEval_EvalFrameDefault + 0x18d2 (0x543452 in /usr/bin/python3)
frame #61: _PyEval_EvalCodeWithName + 0x252 (0x53f732 in /usr/bin/python3)
frame #62: _PyFunction_FastCallKeywords + 0x482 (0x5cd982 in /usr/bin/python3)
frame #63: /usr/bin/python3() [0x53ebb0]

During handling of the above exception, another exception occurred:

OSError                                   Traceback (most recent call last)
<ipython-input-39-7ed38c718236> in <module>()
----> 1 model = SentenceTransformer(model_name_or_path="xlm-r-100langs-bert-base-nli-stsb-mean-tokens", device = 'cuda')

/home/X/.local/lib/python3.7/site-packages/sentence_transformers/SentenceTransformer.py in __init__(self, model_name_or_path, modules, device)
    119                 for module_config in contained_modules:
    120                     module_class = import_from_string(module_config['type'])
--> 121                     module = module_class.load(os.path.join(model_path, module_config['path']))
    122                     modules[module_config['name']] = module
    123 

/home/X/.local/lib/python3.7/site-packages/sentence_transformers/models/Transformer.py in load(input_path)
    109         with open(sbert_config_path) as fIn:
    110             config = json.load(fIn)
--> 111         return Transformer(model_name_or_path=input_path, **config)
    112 
    113 

/home/X/.local/lib/python3.7/site-packages/sentence_transformers/models/Transformer.py in __init__(self, model_name_or_path, max_seq_length, model_args, cache_dir, tokenizer_args, do_lower_case)
     26 
     27         config = AutoConfig.from_pretrained(model_name_or_path, **model_args, cache_dir=cache_dir)
---> 28         self.auto_model = AutoModel.from_pretrained(model_name_or_path, config=config, cache_dir=cache_dir)
     29         self.tokenizer = AutoTokenizer.from_pretrained(model_name_or_path, cache_dir=cache_dir, **tokenizer_args)
     30 

/home/X/.local/lib/python3.7/site-packages/transformers/models/auto/modeling_auto.py in from_pretrained(cls, pretrained_model_name_or_path, *model_args, **kwargs)
    813         if type(config) in MODEL_MAPPING.keys():
    814             return MODEL_MAPPING[type(config)].from_pretrained(
--> 815                 pretrained_model_name_or_path, *model_args, config=config, **kwargs
    816             )
    817         raise ValueError(

/home/X/.local/lib/python3.7/site-packages/transformers/modeling_utils.py in from_pretrained(cls, pretrained_model_name_or_path, *model_args, **kwargs)
   1063             except Exception:
   1064                 raise OSError(
-> 1065                     f"Unable to load weights from pytorch checkpoint file for '{pretrained_model_name_or_path}' "
   1066                     f"at '{resolved_archive_file}'"
   1067                     "If you tried to load a PyTorch model from a TF 2.0 checkpoint, please set from_tf=True. "

OSError: Unable to load weights from pytorch checkpoint file for '/home/X/.cache/torch/sentence_transformers/sbert.net_models_xlm-r-100langs-bert-base-nli-stsb-mean-tokens/0_Transformer' at '/home/X/.cache/torch/sentence_transformers/sbert.net_models_xlm-r-100langs-bert-base-nli-stsb-mean-tokens/0_Transformer/pytorch_model.bin'If you tried to load a PyTorch model from a TF 2.0 checkpoint, please set from_tf=True.

nreimers commented 3 years ago

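Your PyTorch version is too old for the selected model. The RuntimeError in your traceback says the checkpoint was written in file format version 3, while your installation can read at most version 2. Update to PyTorch 1.6 or newer.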
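
The 1.6 minimum mentioned above can be checked up front, before the model is loaded. The following is a minimal sketch, not part of sentence-transformers: `parse_version` and `torch_is_new_enough` are hypothetical helper names, and the only PyTorch attribute used is `torch.__version__`.

```python
import re
from typing import Tuple

# Minimum PyTorch version suggested in the reply above.
MIN_TORCH: Tuple[int, int] = (1, 6)


def parse_version(version: str) -> Tuple[int, int]:
    """Extract (major, minor) from a string such as '1.5.1+cu101'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"Unrecognised version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def torch_is_new_enough(version: str,
                        minimum: Tuple[int, int] = MIN_TORCH) -> bool:
    """Compare numerically, so that '1.10' counts as newer than '1.6'."""
    return parse_version(version) >= minimum


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        print("PyTorch is not installed in this environment")
    else:
        if torch_is_new_enough(torch.__version__):
            print(f"PyTorch {torch.__version__} meets the minimum")
        else:
            print(f"PyTorch {torch.__version__} is older than 1.6; upgrade first")
```

Comparing tuples of integers rather than raw strings avoids the usual pitfall where "1.10" sorts before "1.6" as text.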

JoyeBright commented 3 years ago

Thanks, I did. Now I'm wondering why it only downloads small pre-trained models, such as stsb-distilbert-base!