microsoft / DeBERTa

The implementation of DeBERTa
MIT License

DeBERTa v2: loading example code from huggingface: TypeError: stat: path should be string, bytes, os.PathLike or integer, not NoneType #32

Closed. youssefavx closed this issue 3 years ago

youssefavx commented 3 years ago

On Colab, I did:

!pip install transformers

Then:

from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("microsoft/deberta-xxlarge-v2")

model = AutoModel.from_pretrained("microsoft/deberta-xxlarge-v2")

But I got this error:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-2-21a010bd2ed3> in <module>()
      1 from transformers import AutoTokenizer, AutoModel
      2 
----> 3 tokenizer = AutoTokenizer.from_pretrained("microsoft/deberta-xxlarge-v2")
      4 
      5 model = AutoModel.from_pretrained("microsoft/deberta-xxlarge-v2")

4 frames
/usr/local/lib/python3.6/dist-packages/transformers/models/auto/tokenization_auto.py in from_pretrained(cls, pretrained_model_name_or_path, *inputs, **kwargs)
    386             else:
    387                 if tokenizer_class_py is not None:
--> 388                     return tokenizer_class_py.from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)
    389                 else:
    390                     raise ValueError(

/usr/local/lib/python3.6/dist-packages/transformers/tokenization_utils_base.py in from_pretrained(cls, pretrained_model_name_or_path, *init_inputs, **kwargs)
   1767 
   1768         return cls._from_pretrained(
-> 1769             resolved_vocab_files, pretrained_model_name_or_path, init_configuration, *init_inputs, **kwargs
   1770         )
   1771 

/usr/local/lib/python3.6/dist-packages/transformers/tokenization_utils_base.py in _from_pretrained(cls, resolved_vocab_files, pretrained_model_name_or_path, init_configuration, *init_inputs, **kwargs)
   1839         # Instantiate tokenizer.
   1840         try:
-> 1841             tokenizer = cls(*init_inputs, **init_kwargs)
   1842         except OSError:
   1843             raise OSError(

/usr/local/lib/python3.6/dist-packages/transformers/models/deberta/tokenization_deberta.py in __init__(self, vocab_file, do_lower_case, unk_token, sep_token, pad_token, cls_token, mask_token, **kwargs)
    540         )
    541 
--> 542         if not os.path.isfile(vocab_file):
    543             raise ValueError(
    544                 "Can't find a vocabulary file at path '{}'. To load the vocabulary from a Google pretrained "

/usr/lib/python3.6/genericpath.py in isfile(path)
     28     """Test whether a path is a regular file"""
     29     try:
---> 30         st = os.stat(path)
     31     except OSError:
     32         return False

TypeError: stat: path should be string, bytes, os.PathLike or integer, not NoneType

BigBird01 commented 3 years ago

DeBERTa v2 is not supported by HF transformers yet. We are working on it and hope to have it ready next week.
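
For context on the traceback above: `vocab_file` reaches the tokenizer's `__init__` as `None`, so `os.path.isfile` raises a `TypeError` before the intended "Can't find a vocabulary file" `ValueError` can fire. A likely reason, stated here as an assumption, is that the installed tokenizer class cannot resolve any vocabulary file for the v2 checkpoint. Below is a minimal sketch of an explicit guard for that case. The helper name `check_vocab_file` and its messages are hypothetical and are not part of transformers.

```python
import os
from typing import Optional


def check_vocab_file(vocab_file: Optional[str]) -> str:
    """Hypothetical helper (not a transformers API): validate a vocab path.

    Raises ValueError with a readable message instead of letting
    os.path.isfile(None) fail with a TypeError.
    """
    if vocab_file is None:
        # The case seen in the traceback: no vocabulary file was resolved.
        raise ValueError(
            "No vocabulary file was resolved for this checkpoint; "
            "the installed tokenizer class may not support it."
        )
    if not os.path.isfile(vocab_file):
        # A path was given, but nothing exists there.
        raise ValueError(
            "Can't find a vocabulary file at path '{}'.".format(vocab_file)
        )
    return vocab_file
```

Calling `check_vocab_file(None)` raises a `ValueError` that names the missing file as the problem, which is easier to diagnose than the `TypeError` from `os.stat`.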

youssefavx commented 3 years ago

Awesome! Thank you.