mikeizbicki / cmc-csci181-deeplearning

deep learning course materials

Running transformers_tutorial.py #40

Open n8stringham opened 4 years ago

n8stringham commented 4 years ago

I've copied the transformers_tutorial.py file to my computer, but I can't get it to run. After commenting out line 38, where the first crash occurs, and running the script again, I get the following error message:

  File "transformers_tutorial.py", line 50, in <module>
    return_tensors = 'pt',
  File "/anaconda3/envs/dl/lib/python3.7/site-packages/transformers/tokenization_utils.py", line 786, in encode_plus
    first_ids = get_input_ids(text)
  File "/anaconda3/envs/dl/lib/python3.7/site-packages/transformers/tokenization_utils.py", line 778, in get_input_ids
    return self.convert_tokens_to_ids(self.tokenize(text, **kwargs))
  File "/anaconda3/envs/dl/lib/python3.7/site-packages/transformers/tokenization_utils.py", line 649, in tokenize
    tokenized_text = split_on_tokens(added_tokens, text)
  File "/anaconda3/envs/dl/lib/python3.7/site-packages/transformers/tokenization_utils.py", line 646, in split_on_tokens
    else [token] for token in tokenized_text), [])
  File "/anaconda3/envs/dl/lib/python3.7/site-packages/transformers/tokenization_utils.py", line 646, in <genexpr>
    else [token] for token in tokenized_text), [])
TypeError: _tokenize() got an unexpected keyword argument 'pad_to_max_length' 

It looks like the problem is with the parameters passed to tokenizer.encode_plus(), but I'm not sure how to fix this error.
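
The traceback shows keyword arguments being forwarded down a chain of calls (`self.tokenize(text, **kwargs)`) until they reach `_tokenize`, which does not accept `pad_to_max_length`. A minimal sketch of that mechanism follows. The three functions are hypothetical stand-ins written for illustration, not the actual transformers source:

```python
# Hypothetical stand-ins for the tokenizer methods named in the traceback.
# They only imitate the call chain; they are NOT the real library code.

def _tokenize(text):
    """Innermost step: accepts the text and nothing else."""
    return text.split()

def tokenize(text, **kwargs):
    """Forwards every keyword argument it receives down to _tokenize."""
    return _tokenize(text, **kwargs)

def encode_plus(text, **kwargs):
    """Entry point that does not recognise pad_to_max_length,
    so the option is passed along instead of being consumed here."""
    return tokenize(text, **kwargs)

message = None
try:
    encode_plus("hello world", pad_to_max_length=True)
except TypeError as err:
    # The error surfaces in the innermost function, far from the call site.
    message = str(err)
    print(message)
```

The point of the sketch is that the error appears in the deepest function even though the unsupported option was supplied at the top-level call, which is why the message names `_tokenize` rather than `encode_plus`.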

mikeizbicki commented 4 years ago

My guess is that you have an older version of transformers installed that uses a different interface for that function. Here's the version info I used to run the code:

$ python3
>>> import transformers
>>> import torch
>>> import tensorboard
>>> transformers.__version__
'2.8.0'
>>> torch.__version__
'1.5.0'
>>> tensorboard.__version__
'2.2.1'

n8stringham commented 4 years ago

Yep, I just had an older version of transformers. Thanks!
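
For anyone hitting the same error, the version check above can be scripted. The sketch below compares the installed transformers version against 2.8.0, the version reported as working in this thread. The helper names (`version_tuple`, `is_at_least`, `check_transformers`) are made up for this example, and the parsing is deliberately simple: it reads only the leading numeric parts of the version string.

```python
import importlib

REQUIRED = "2.8.0"  # version reported as working in this thread

def version_tuple(version):
    """Turn '2.8.0' into (2, 8, 0), stopping at the first non-numeric part."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)

def is_at_least(installed, required=REQUIRED):
    """Return True if `installed` is the same as or newer than `required`."""
    a, b = version_tuple(installed), version_tuple(required)
    width = max(len(a), len(b))
    # Pad with zeros so that '2.8' and '2.8.0' compare as equal.
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b

def check_transformers():
    """Report whether the installed transformers meets the assumed minimum."""
    try:
        transformers = importlib.import_module("transformers")
    except ImportError:
        return "transformers is not installed"
    installed = transformers.__version__
    if is_at_least(installed):
        return f"transformers {installed} is at least {REQUIRED}"
    return f"transformers {installed} is older than {REQUIRED}; consider upgrading"

if __name__ == "__main__":
    print(check_transformers())
```

Treating 2.8.0 as a minimum is an assumption; the thread only confirms that 2.8.0 works and that an unspecified older version does not.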