I am running the example code on the homepage. However, I met this problem.
This happened when model_class reached XLMModel. I do not quite understand why this happens, because the problem only occurs when the model is XLMModel.
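For reference, the homepage example loops over several (model_class, tokenizer_class, checkpoint) triples. The sketch below is a reconstruction trimmed to the classes mentioned in this thread; the exact list of triples is an assumption, not the verbatim README code:

from transformers import BertModel, BertTokenizer, XLMModel, XLMTokenizer

# Reconstructed sketch of the README loop: each iteration loads a tokenizer
# and a model from the same pretrained checkpoint.
MODELS = [(BertModel, BertTokenizer, 'bert-base-uncased'),
          (XLMModel, XLMTokenizer, 'xlm-mlm-enfr-1024')]

for model_class, tokenizer_class, pretrained_weights in MODELS:
    tokenizer = tokenizer_class.from_pretrained(pretrained_weights)
    model = model_class.from_pretrained(pretrained_weights)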
Plus: I have seen a similar issue in this project; however, the problem in that issue was that he did not pass the right pretrained_weights, and I do not think that is the solution here.
Similarly, I also tried DistilBert, Roberta, and XLMRoberta; these three models do not work for me either, and the error message is the same as the one I described above.
I just tried this and cannot reproduce the behaviour that you indicate. Are you running this from a notebook? Try restarting your kernel and running it again.
I run this program on a Linux GPU server. I tried restarting the Python program; however, the problem still exists. Could this be a problem with downloading the model?
No. UnboundLocalError simply means that Python hasn't seen this variable before, which cannot occur in your code snippet. If the models were downloaded incorrectly, you'd get another error. Even if the tokenizer was initialized as None, you'd get another error.
Are you sure that is your only code that is running? Please post the full trace.
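To illustrate the explanation above, here is a minimal, hypothetical sketch (not the library's actual code) of how this kind of UnboundLocalError arises: if the assignment inside a try block fails, the local name is never bound, so a later read of it raises exactly this error. The _from_pretrained frames in the tracebacks in this thread match that pattern.

def load_tokenizer():
    try:
        tokenizer = int("not a number")  # raises ValueError; the assignment never happens
    except ValueError:
        pass
    # 'tokenizer' was never bound, so reading it raises
    # UnboundLocalError: local variable 'tokenizer' referenced before assignment
    return tokenizer

load_tokenizer()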
Traceback (most recent call last):
  File "<stdin>", line 3, in <module>
  File "/users4/bwchen/anaconda3/lib/python3.7/site-packages/transformers/tokenization_utils.py", line 302, in from_pretrained
    return cls._from_pretrained(*inputs, **kwargs)
  File "/users4/bwchen/anaconda3/lib/python3.7/site-packages/transformers/tokenization_utils.py", line 438, in _from_pretrained
    tokenizer = cls(*init_inputs, **init_kwargs)
  File "/users4/bwchen/anaconda3/lib/python3.7/site-packages/transformers/tokenization_bert.py", line 164, in __init__
    "model use `tokenizer = BertTokenizer.from_pretrained(PRETRAINED_MODEL_NAME)`".format(vocab_file))
ValueError: Can't find a vocabulary file at path '/users4/bwchen/.cache/torch/transformers/37cc1eaaea18a456726fc28ecb438852f0ca1d9e7d259e6e3747ee33065936f6'. To load the vocabulary from a Google pretrained model use `tokenizer = BertTokenizer.from_pretrained(PRETRAINED_MODEL_NAME)`
I am sure that is the only code I was running at that time; I am trying to reproduce this error. This time it works properly when model_class reaches the aforementioned 'wrong' model, XLMModel. However, as the loop continues, I hit another problem when the model is DistilBert. Does this error mean that I have to use BertTokenizer instead of DistilBertTokenizer?
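As a hedged side note: that ValueError points at a file in the local cache, so one thing worth trying is forcing a fresh download to rule out a corrupted cache entry. force_download is an existing from_pretrained keyword argument; the checkpoint name here is just an example.

from transformers import DistilBertTokenizer

# Assuming the cached vocab file is corrupt, re-fetch it instead of
# reusing the cache; force_download is a standard from_pretrained kwarg.
tokenizer = DistilBertTokenizer.from_pretrained('distilbert-base-uncased',
                                                force_download=True)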
I can also attest to this error.
I am using a Kaggle notebook, and I get this error after running this in my first cell. Most of it is default code; the bottom two lines are the key ones.
# This Python 3 environment comes with many helpful analytics libraries installed
# It is defined by the kaggle/python docker image: https://github.com/kaggle/docker-python
# For example, here's several helpful packages to load in
import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)
# Input data files are available in the "../input/" directory.
# For example, running this (by clicking run or pressing Shift+Enter) will list all files under the input directory
import os
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))
# Any results you write to the current directory are saved as output.
print(os.getcwd(), os.listdir())
from transformers import RobertaTokenizer
tknzr = RobertaTokenizer.from_pretrained('roberta-large')
Error thrown:
UnboundLocalError                         Traceback (most recent call last)
<ipython-input-1-7957db35f110> in <module>
     19 from transformers import RobertaTokenizer
     20
---> 21 tknzr = RobertaTokenizer.from_pretrained('roberta-large')

/opt/conda/lib/python3.6/site-packages/transformers/tokenization_utils.py in from_pretrained(cls, *inputs, **kwargs)
    300
    301     """
--> 302     return cls._from_pretrained(*inputs, **kwargs)
    303
    304

/opt/conda/lib/python3.6/site-packages/transformers/tokenization_utils.py in _from_pretrained(cls, pretrained_model_name_or_path, *init_inputs, **kwargs)
    442
    443     # Save inputs and kwargs for saving and re-loading with ``save_pretrained``
--> 444     tokenizer.init_inputs = init_inputs
    445     tokenizer.init_kwargs = init_kwargs
    446

UnboundLocalError: local variable 'tokenizer' referenced before assignment
Kaggle runs transformers version 2.3.0 by default. After updating to 2.5.1 it worked just fine. To update on Kaggle, turn the internet option on in the settings on the right side, then run !pip install -U transformers
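After upgrading (and restarting the kernel), a quick sanity check, added here as a hypothetical snippet, confirms that the running kernel actually picked up the new version:

import transformers
print(transformers.__version__)  # expect 2.5.1 or newer, not 2.3.0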
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.