huggingface / transformers

🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
https://huggingface.co/transformers
Apache License 2.0

AttributeError: 'torch.Size' object has no attribute 'as_list' #9914

Closed. hiteshsom closed this issue 3 years ago.

hiteshsom commented 3 years ago

Hello,

I ran the following official example script from the LongformerForQuestionAnswering documentation:

import torch
from transformers import LongformerTokenizer, TFLongformerForQuestionAnswering

# Tokenizer
tokenizer = LongformerTokenizer.from_pretrained('allenai/longformer-large-4096-finetuned-triviaqa')

# Model
model = TFLongformerForQuestionAnswering.from_pretrained('allenai/longformer-large-4096-finetuned-triviaqa')

question, text = "Who was Jim Henson?", "Jim Henson was a nice puppet"
encoding = tokenizer(question, text, return_tensors="pt")
input_ids = encoding["input_ids"]

# default is local attention everywhere
# the forward method will automatically set global attention on question tokens
attention_mask = encoding["attention_mask"]

outputs = model(input_ids, attention_mask=attention_mask)
start_logits = outputs.start_logits
end_logits = outputs.end_logits
all_tokens = tokenizer.convert_ids_to_tokens(input_ids[0].tolist())

answer_tokens = all_tokens[torch.argmax(start_logits) : torch.argmax(end_logits) + 1]
answer = tokenizer.decode(tokenizer.convert_tokens_to_ids(answer_tokens))  # remove the space prepended to the token

But I got the following error:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-18-4bf253125151> in <module>
      7 attention_mask = encoding["attention_mask"]
      8 
----> 9 outputs = model(input_ids, attention_mask=attention_mask)
     10 start_logits = outputs.start_logits
     11 end_logits = outputs.end_logits

~\Documents\env\lib\site-packages\tensorflow\python\keras\engine\base_layer.py in __call__(self, *args, **kwargs)
    983 
    984         with ops.enable_auto_cast_variables(self._compute_dtype_object):
--> 985           outputs = call_fn(inputs, *args, **kwargs)
    986 
    987         if self._activity_regularizer:

~\Documents\env\lib\site-packages\transformers\modeling_tf_longformer.py in call(self, inputs, attention_mask, global_attention_mask, token_type_ids, position_ids, inputs_embeds, output_attentions, output_hidden_states, return_dict, start_positions, end_positions, training)
   1492                 # put global attention on all tokens until `config.sep_token_id` is reached
   1493                 sep_token_indices = tf.where(input_ids == self.config.sep_token_id)
-> 1494                 global_attention_mask = _compute_global_attention_mask(shape_list(input_ids), sep_token_indices)
   1495 
   1496         outputs = self.longformer(

~\Documents\env\lib\site-packages\transformers\modeling_tf_utils.py in shape_list(x)
    924         :obj:`List[int]`: The shape of the tensor as a list.
    925     """
--> 926     static = x.shape.as_list()
    927     dynamic = tf.shape(x)
    928     return [dynamic[i] if s is None else s for i, s in enumerate(static)]

AttributeError: 'torch.Size' object has no attribute 'as_list'
github-actions[bot] commented 3 years ago

This issue has been automatically marked as stale and been closed because it has not had recent activity. Thank you for your contributions.

If you think this still needs to be addressed please comment on this thread.

brienna commented 3 years ago

I have the same question.

LysandreJik commented 3 years ago

Hello! You're using a TensorFlow model (see the TF prefix) but you're asking the tokenizer to return PyTorch tensors. You should either stick to full PyTorch (remove the TF prefix) or full TF (ask the tokenizer to return TensorFlow tensors).
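For the example above, the all-PyTorch variant would look roughly like this (a sketch, not tied to a specific transformers version):

import torch
from transformers import LongformerTokenizer, LongformerForQuestionAnswering

tokenizer = LongformerTokenizer.from_pretrained('allenai/longformer-large-4096-finetuned-triviaqa')
model = LongformerForQuestionAnswering.from_pretrained('allenai/longformer-large-4096-finetuned-triviaqa')

question, text = "Who was Jim Henson?", "Jim Henson was a nice puppet"
encoding = tokenizer(question, text, return_tensors="pt")  # PyTorch tensors for a PyTorch model

outputs = model(encoding["input_ids"], attention_mask=encoding["attention_mask"])
start_logits = outputs.start_logits
end_logits = outputs.end_logits

all_tokens = tokenizer.convert_ids_to_tokens(encoding["input_ids"][0].tolist())
answer_tokens = all_tokens[torch.argmax(start_logits) : torch.argmax(end_logits) + 1]
answer = tokenizer.decode(tokenizer.convert_tokens_to_ids(answer_tokens))

The all-TensorFlow variant keeps TFLongformerForQuestionAnswering and instead passes return_tensors="tf" to the tokenizer, using tf.math.argmax in place of torch.argmax.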

HoaNguyen55 commented 2 years ago

I ran into the same issue and did not know how to fix it:

tensor([[    0, 24948,  5357,    88,    14,   397,  1176,  6724,     7, 35297,
         18109,  5814,    16,    43,   167,  4446, 37361,   381,     2,     1,
             1,     1,     1,     1,     1,     1,     1,     1,     1,     1,
             1,     1,     1,     1,     1,     1,     1,     1,     1,     1,
             1,     1,     1,     1,     1,     1,     1,     1,     1,     1,
             1,     1,     1,     1,     1,     1,     1,     1,     1,     1,
             1,     1,     1,     1,     1,     1,     1,     1,     1,     1,
             1,     1,     1,     1,     1,     1,     1,     1,     1,     1,
             1,     1,     1,     1,     1,     1,     1,     1,     1,     1,
             1,     1,     1,     1,     1,     1,     1,     1,     1,     1,
             1,     1,     1,     1,     1,     1,     1,     1,     1,     1,
             1,     1,     1,     1,     1,     1,     1,     1,     1,     1]],
       device='cuda:0')
tensor([[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0,
         0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
         0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
         0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
         0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]],
       device='cuda:0')
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-13-1309f9063eea> in <module>
      1 _, tokenizer = load_pho_bert()
----> 2 infer('Cảm ơn bạn đã chạy thử model của mình. Chúc một ngày tốt lành nha!', tokenizer)

2 frames
/usr/local/lib/python3.7/dist-packages/keras/engine/input_spec.py in display_shape(shape)
    269 
    270 def display_shape(shape):
--> 271   return str(tuple(shape.as_list()))
    272 
    273 

AttributeError: 'torch.Size' object has no attribute 'as_list'
HoaNguyen55 commented 2 years ago

Hello! You're using a TensorFlow model (see the TF prefix) but you're asking the tokenizer to return PyTorch tensors. You should either stick to full PyTorch (remove the TF prefix) or full TF (ask the tokenizer to return TensorFlow tensors).

Please help me fix this problem. How should I change my code?

def infer(text, tokenizer, max_len=120):
    device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')
    print(device)
    class_names = ['thế giới', 'thể thao', 'văn hóa', 'vi tính']

    model = tf.keras.models.load_model('./models/cnn_nlp_text_classification_4_classer.h5')

    encoded_review = tokenizer.encode_plus(
        text,
        max_length=max_len,
        truncation=True,
        add_special_tokens=True,
        padding='max_length',
        return_attention_mask=True,
        return_token_type_ids=False,
        return_tensors='pt',
    )

    input_ids = encoded_review['input_ids'].to(device)
    print(input_ids.shape)
    attention_mask = encoded_review['attention_mask'].to(device)
    print(attention_mask.shape)

    output = model(input_ids, attention_mask)  # ==> the error happens here
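For reference, applying the quoted advice to this snippet would look roughly like the sketch below: the loaded model is a Keras model, so the tokenizer should return TensorFlow tensors (return_tensors='tf') rather than PyTorch tensors, and the .to(device) calls are not needed. This is only a sketch; how the inputs must be passed (a single tensor, a list, or a dict) depends on how the Keras model was defined.

# Ask the tokenizer for TensorFlow tensors instead of PyTorch tensors
encoded_review = tokenizer.encode_plus(
    text,
    max_length=max_len,
    truncation=True,
    add_special_tokens=True,
    padding='max_length',
    return_attention_mask=True,
    return_token_type_ids=False,
    return_tensors='tf',  # 'tf' instead of 'pt'
)

# tf.Tensor inputs need no .to(device); TensorFlow places them itself
input_ids = encoded_review['input_ids']
attention_mask = encoded_review['attention_mask']

# If the model was built with two inputs, pass them together as a list
output = model([input_ids, attention_mask])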
demiurg03 commented 1 week ago

Hello, did you manage to solve this error, and if so, how? I ran into the same error.