allenai / scifact

Data and models for the SciFact verification task.

TypeError: batch_text_or_text_pairs has to be a list (got <class 'zip'>) #22

Closed pritamdeka closed 1 year ago

pritamdeka commented 2 years ago

Hi, I tried to train a roberta-base model using the provided train file, but I get the following error: TypeError: batch_text_or_text_pairs has to be a list (got <class 'zip'>).

Traceback (most recent call last):
  File "/content/scifact/verisci/training/label_prediction/transformer_scifact.py", line 144, in <module>
    encoded_dict = encode(batch['claim'], batch['rationale'])
  File "/content/scifact/verisci/training/label_prediction/transformer_scifact.py", line 109, in encode
    return_tensors='pt')
  File "/usr/local/lib/python3.7/dist-packages/transformers/tokenization_utils_base.py", line 2570, in batch_encode_plus
    **kwargs,
  File "/usr/local/lib/python3.7/dist-packages/transformers/models/gpt2/tokenization_gpt2_fast.py", line 163, in _batch_encode_plus
    return super()._batch_encode_plus(*args, **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/transformers/tokenization_utils_fast.py", line 394, in _batch_encode_plus
    raise TypeError(f"batch_text_or_text_pairs has to be a list (got {type(batch_text_or_text_pairs)})")
TypeError: batch_text_or_text_pairs has to be a list (got <class 'zip'>)

I would be glad if you could suggest how to fix this. Thanks.
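For readers hitting the same error, here is a minimal sketch of what the message means and one possible workaround. The traceback shows the fast tokenizer rejecting a `zip` object because it expects a list. The body of `encode` is not shown in this thread, so the sketch assumes it passes `zip(claims, rationales)` to `batch_encode_plus`. `require_list` and `make_pairs` are hypothetical helpers written for illustration and are not part of the repository or of transformers.

```python
from typing import List, Tuple


def require_list(batch_text_or_text_pairs):
    # Hypothetical stand-in that mirrors the type check seen in the traceback;
    # it is NOT the transformers implementation.
    if not isinstance(batch_text_or_text_pairs, list):
        raise TypeError(
            "batch_text_or_text_pairs has to be a list "
            f"(got {type(batch_text_or_text_pairs)})"
        )
    return batch_text_or_text_pairs


def make_pairs(claims: List[str], rationales: List[str]) -> List[Tuple[str, str]]:
    # zip() returns a lazy iterator in Python 3; list() materializes it
    # into a list of (claim, rationale) tuples.
    return list(zip(claims, rationales))


claims = ["Claim A", "Claim B"]
rationales = ["Rationale A", "Rationale B"]

# Passing the bare zip object reproduces the same kind of failure.
try:
    require_list(zip(claims, rationales))
    zip_rejected = False
except TypeError:
    zip_rejected = True

# Wrapping the zip in list() satisfies the check.
pairs = require_list(make_pairs(claims, rationales))
```

Under that assumption, the equivalent change in the training script would be to pass `list(zip(...))` instead of `zip(...)` to `tokenizer.batch_encode_plus`. Whether this matches the maintainers' intended fix is not established in this thread.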

dwadden commented 2 years ago

Hi,

Can you paste in the command that you're issuing to generate this error? Also, can you confirm that you followed the README in terms of setting up a virtual environment and installing dependencies?