huggingface / transformers

🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
https://huggingface.co/transformers
Apache License 2.0

Pegasus example not working #8691

Closed greenstars closed 3 years ago

greenstars commented 3 years ago

@patrickvonplaten

Hi,

I am trying to run the Pegasus example on Colab:

```python
!pip install git+https://github.com/huggingface/transformers.git
!pip install sentencepiece

from transformers import PegasusForConditionalGeneration, PegasusTokenizer
import torch

src_text = [
    """ PG&E stated it scheduled the blackouts in response to forecasts for high winds amid dry conditions. The aim is to reduce the risk of wildfires. Nearly 800 thousand customers were scheduled to be affected by the shutoffs which were expected to last through at least midday tomorrow."""
]

model_name = 'google/pegasus-xsum'
torch_device = 'cuda' if torch.cuda.is_available() else 'cpu'
tokenizer = PegasusTokenizer.from_pretrained(model_name)
model = PegasusForConditionalGeneration.from_pretrained(model_name).to(torch_device)
batch = tokenizer.prepare_seq2seq_batch(src_text, truncation=True, padding='longest').to(torch_device)
translated = model.generate(**batch)
tgt_text = tokenizer.batch_decode(translated, skip_special_tokens=True)
assert tgt_text[0] == "California's largest electricity provider has turned off power to hundreds of thousands of customers."
```

The install output is:

```
Collecting git+https://github.com/huggingface/transformers.git
  Cloning https://github.com/huggingface/transformers.git to /tmp/pip-req-build-gvb7jrr9
  Running command git clone -q https://github.com/huggingface/transformers.git /tmp/pip-req-build-gvb7jrr9
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing wheel metadata ... done
Requirement already satisfied (use --upgrade to upgrade): transformers==4.0.0rc1 from git+https://github.com/huggingface/transformers.git in /usr/local/lib/python3.6/dist-packages
Requirement already satisfied: requests in /usr/local/lib/python3.6/dist-packages (from transformers==4.0.0rc1) (2.23.0)
Requirement already satisfied: tokenizers==0.9.4 in /usr/local/lib/python3.6/dist-packages (from transformers==4.0.0rc1) (0.9.4)
Requirement already satisfied: tqdm>=4.27 in /usr/local/lib/python3.6/dist-packages (from transformers==4.0.0rc1) (4.41.1)
Requirement already satisfied: numpy in /usr/local/lib/python3.6/dist-packages (from transformers==4.0.0rc1) (1.18.5)
Requirement already satisfied: filelock in /usr/local/lib/python3.6/dist-packages (from transformers==4.0.0rc1) (3.0.12)
Requirement already satisfied: dataclasses; python_version < "3.7" in /usr/local/lib/python3.6/dist-packages (from transformers==4.0.0rc1) (0.7)
Requirement already satisfied: packaging in /usr/local/lib/python3.6/dist-packages (from transformers==4.0.0rc1) (20.4)
Requirement already satisfied: sacremoses in /usr/local/lib/python3.6/dist-packages (from transformers==4.0.0rc1) (0.0.43)
Requirement already satisfied: regex!=2019.12.17 in /usr/local/lib/python3.6/dist-packages (from transformers==4.0.0rc1) (2019.12.20)
Requirement already satisfied: chardet<4,>=3.0.2 in /usr/local/lib/python3.6/dist-packages (from requests->transformers==4.0.0rc1) (3.0.4)
Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.6/dist-packages (from requests->transformers==4.0.0rc1) (2020.6.20)
Requirement already satisfied: urllib3!=1.25.0,!=1.25.1,<1.26,>=1.21.1 in /usr/local/lib/python3.6/dist-packages (from requests->transformers==4.0.0rc1) (1.24.3)
Requirement already satisfied: idna<3,>=2.5 in /usr/local/lib/python3.6/dist-packages (from requests->transformers==4.0.0rc1) (2.10)
Requirement already satisfied: six in /usr/local/lib/python3.6/dist-packages (from packaging->transformers==4.0.0rc1) (1.15.0)
Requirement already satisfied: pyparsing>=2.0.2 in /usr/local/lib/python3.6/dist-packages (from packaging->transformers==4.0.0rc1) (2.4.7)
Requirement already satisfied: click in /usr/local/lib/python3.6/dist-packages (from sacremoses->transformers==4.0.0rc1) (7.1.2)
Requirement already satisfied: joblib in /usr/local/lib/python3.6/dist-packages (from sacremoses->transformers==4.0.0rc1) (0.17.0)
Building wheels for collected packages: transformers
  Building wheel for transformers (PEP 517) ... done
  Created wheel for transformers: filename=transformers-4.0.0rc1-cp36-none-any.whl size=1349475 sha256=8f08b76fc03d4cd0c1532e37462b5f1682fc58ad7f92ed533533b276fc4ecaf5
  Stored in directory: /tmp/pip-ephem-wheel-cache-8gbsru65/wheels/33/eb/3b/4bf5dd835e865e472d4fc0754f35ac0edb08fe852e8f21655f
Successfully built transformers
Requirement already satisfied: sentencepiece in /usr/local/lib/python3.6/dist-packages (0.1.94)
```

```
AttributeError                            Traceback (most recent call last)
<ipython-input> in <module>()
     12 tokenizer = PegasusTokenizer.from_pretrained(model_name)
     13 model = PegasusForConditionalGeneration.from_pretrained(model_name).to(torch_device)
---> 14 batch = tokenizer.prepare_seq2seq_batch(src_text, truncation=True, padding='longest').to(torch_device)
     15 translated = model.generate(**batch)
     16 tgt_text = tokenizer.batch_decode(translated, skip_special_tokens=True)

2 frames

/usr/local/lib/python3.6/dist-packages/transformers/file_utils.py in wrapper(*args, **kwargs)
   1236     def wrapper(*args, **kwargs):
   1237         if is_torch_available():
-> 1238             return func(*args, **kwargs)
   1239         else:
   1240             raise ImportError(f"Method `{func.__name__}` requires PyTorch.")

/usr/local/lib/python3.6/dist-packages/transformers/tokenization_utils_base.py in to(self, device)
    777         modification.
    778         """
--> 779         self.data = {k: v.to(device) for k, v in self.data.items()}
    780         return self
    781

/usr/local/lib/python3.6/dist-packages/transformers/tokenization_utils_base.py in <dictcomp>(.0)
    777         modification.
    778         """
--> 779         self.data = {k: v.to(device) for k, v in self.data.items()}
    780         return self
    781

AttributeError: 'list' object has no attribute 'to'
```

Please help.

Thanks,
Akila

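For readers landing here: the traceback shows that `BatchEncoding.to(device)` simply calls `.to(device)` on every value it holds, and without `return_tensors` the tokenizer stores plain Python lists, which have no `.to()` method. A minimal illustration of the difference (a sketch using the same `google/pegasus-xsum` checkpoint as above):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained('google/pegasus-xsum')

enc = tokenizer(["some text"])
print(type(enc['input_ids']))   # <class 'list'>: calling .to(device) fails

enc = tokenizer(["some text"], return_tensors='pt')
print(type(enc['input_ids']))   # <class 'torch.Tensor'>: .to(device) works
```
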
EliaKunz commented 3 years ago

@greenstars I'm having the same issue. How did you resolve this?

greenstars commented 3 years ago

> @greenstars I'm having the same issue. How did you resolve this?

@EliaKunz I changed `!pip install git+https://github.com/huggingface/transformers.git` to `!pip install transformers`.

EliaKunz commented 3 years ago

Thx! I was on Datalore with the latest transformers 4; I downgraded to 3.5 and everything is working now.
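For anyone else downgrading: pinning an explicit release in the notebook makes this reproducible (3.5.1 here is illustrative, as the last 3.5.x patch release):

```
# Pin a known-good release instead of installing from the master branch.
!pip install transformers==3.5.1 sentencepiece
```
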

massanishi commented 3 years ago

I had the same issue with the latest transformers 4.1 (installed via pip). It was fixed by adding the `return_tensors` argument.

From

```python
batch = tokenizer.prepare_seq2seq_batch(src_text, truncation=True, padding='longest').to(torch_device)
```

to

```python
batch = tokenizer.prepare_seq2seq_batch(src_text, truncation=True, padding='longest', return_tensors='pt').to(torch_device)
```

did the job for me.
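The same fix applies to the tokenizer's plain call syntax, which later releases recommend over the deprecated `prepare_seq2seq_batch` (a one-line sketch reusing the thread's variables):

```python
# Equivalent on newer versions, where prepare_seq2seq_batch is deprecated:
batch = tokenizer(src_text, truncation=True, padding='longest', return_tensors='pt').to(torch_device)
```
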

YatinKapoor commented 3 years ago

On running `batch = tokenizer(src_text, truncation=True, padding='longest', return_tensors="pt").to(device)` I am getting the error

```
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-49-e6e55e18a32c> in <module>()
----> 1 batch = tokenizer(src_text, truncation=True, padding='longest', return_tensors="pt").to(device)
      2 translated = model.generate(**batch)
      3 tgt_text = tokenizer.batch_decode(translated, skip_special_tokens=True)

TypeError: 'NoneType' object is not callable
```

and on running `batch = tokenizer.prepare_seq2seq_batch(src_text, truncation=True, padding='longest', return_tensors='pt').to(torch_device)` I am getting the error

```
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-50-b7183fa2a37c> in <module>()
----> 1 batch = tokenizer.prepare_seq2seq_batch(src_text, truncation=True, padding='longest', return_tensors='pt').to(torch_device)
      2 translated = model.generate(**batch)
      3 tgt_text = tokenizer.batch_decode(translated, skip_special_tokens=True)

AttributeError: 'NoneType' object has no attribute 'prepare_seq2seq_batch'
```

Any help would be greatly appreciated.

LysandreJik commented 3 years ago

@YatinKapoor your tokenizer seems to be `None`.
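Both tracebacks above mean the `tokenizer` variable is `None` at call time, so the earlier assignment failed or was skipped (a missing `sentencepiece` install is a plausible culprit here, though the thread does not confirm it). A quick diagnostic, assuming the `tokenizer` variable from the snippet above:

```python
# Verify the tokenizer actually loaded before calling it.
print(tokenizer)        # should not be None
print(type(tokenizer))  # expect a Pegasus tokenizer class
```
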

maifeng commented 3 years ago

You need to replace `PegasusTokenizer` with `AutoTokenizer`:

```python
from transformers import PegasusForConditionalGeneration, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(model_name)
```

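Putting the thread's fixes together, a complete sketch (not verified against every transformers version; `AutoTokenizer` picks the appropriate tokenizer class for the checkpoint automatically):

```python
import torch
from transformers import PegasusForConditionalGeneration, AutoTokenizer

model_name = 'google/pegasus-xsum'
torch_device = 'cuda' if torch.cuda.is_available() else 'cpu'

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = PegasusForConditionalGeneration.from_pretrained(model_name).to(torch_device)

src_text = ["PG&E stated it scheduled the blackouts in response to forecasts for high winds amid dry conditions."]

# return_tensors='pt' yields PyTorch tensors, so .to(torch_device) works.
batch = tokenizer(src_text, truncation=True, padding='longest', return_tensors='pt').to(torch_device)
translated = model.generate(**batch)
print(tokenizer.batch_decode(translated, skip_special_tokens=True))
```
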
smahm094 commented 2 years ago

@maifeng thanks! AutoTokenizer did the job for me!

fahnub commented 1 year ago

> I had the same issue with the latest transformers 4.1 (installed via pip). It was fixed by adding the `return_tensors` argument.
>
> From
>
> ```python
> batch = tokenizer.prepare_seq2seq_batch(src_text, truncation=True, padding='longest').to(torch_device)
> ```
>
> to
>
> ```python
> batch = tokenizer.prepare_seq2seq_batch(src_text, truncation=True, padding='longest', return_tensors='pt').to(torch_device)
> ```
>
> did the job for me.

Worked for me