UKPLab / sentence-transformers

Multilingual Sentence & Image Embeddings with BERT
https://www.SBERT.net
Apache License 2.0

AttributeError: 'Tensor' object has no attribute 'ndim' #316

Open thehayat opened 3 years ago

thehayat commented 3 years ago

I was trying the following example, but I get this error with transformers version 3.0.2:


AttributeError                            Traceback (most recent call last)
d:\developement\pythonista\python learning\datagen\py368\py368\lib\site-packages\transformers\tokenization_utils_base.py in convert_to_tensors(self, tensor_type, prepend_batch_axis)
    506                 # at-least2d
--> 507                 if tensor.ndim > 2:
    508                     tensor = tensor.squeeze(0)

AttributeError: 'Tensor' object has no attribute 'ndim'

During handling of the above exception, another exception occurred:

ValueError                                Traceback (most recent call last)
<ipython-input-6-eec25ef89f3d> in <module>
      1 sentences = ['This framework generates embeddings for each input sentence']
----> 2 sentence_embeddings = model.encode(sentences)

d:\developement\pythonista\python learning\datagen\py368\py368\lib\site-packages\sentence_transformers\SentenceTransformer.py in encode(self, sentences, batch_size, show_progress_bar, output_value, convert_to_numpy, is_pretokenized)
    145             features = {}
    146             for text in batch_tokens:
--> 147                 sentence_features = self.get_sentence_features(text, longest_seq)
    148 
    149                 for feature_name in sentence_features:

d:\developement\pythonista\python learning\datagen\py368\py368\lib\site-packages\sentence_transformers\SentenceTransformer.py in get_sentence_features(self, *features)
    185 
    186     def get_sentence_features(self, *features):
--> 187         return self._first_module().get_sentence_features(*features)
    188 
    189     def get_sentence_embedding_dimension(self):

d:\developement\pythonista\python learning\datagen\py368\py368\lib\site-packages\sentence_transformers\models\BERT.py in get_sentence_features(self, tokens, pad_seq_length)
     62         pad_seq_length = min(pad_seq_length, self.max_seq_length) + 2  ##Add Space for CLS + SEP token
     63 
---> 64         return self.tokenizer.prepare_for_model(tokens, max_length=pad_seq_length, pad_to_max_length=True, return_tensors='pt', truncation=True)
     65 
     66 

d:\developement\pythonista\python learning\datagen\py368\py368\lib\site-packages\transformers\tokenization_utils_base.py in prepare_for_model(self, ids, pair_ids, add_special_tokens, padding, truncation, max_length, stride, pad_to_multiple_of, return_tensors, return_token_type_ids, return_attention_mask, return_overflowing_tokens, return_special_tokens_mask, return_offsets_mapping, return_length, verbose, prepend_batch_axis, **kwargs)
   2096 
   2097         batch_outputs = BatchEncoding(
-> 2098             encoded_inputs, tensor_type=return_tensors, prepend_batch_axis=prepend_batch_axis
   2099         )
   2100 

d:\developement\pythonista\python learning\datagen\py368\py368\lib\site-packages\transformers\tokenization_utils_base.py in __init__(self, data, encoding, tensor_type, prepend_batch_axis)
    157         self._encodings = encoding
    158 
--> 159         self.convert_to_tensors(tensor_type=tensor_type, prepend_batch_axis=prepend_batch_axis)
    160 
    161     @property

d:\developement\pythonista\python learning\datagen\py368\py368\lib\site-packages\transformers\tokenization_utils_base.py in convert_to_tensors(self, tensor_type, prepend_batch_axis)
    513             except:  # noqa E722
    514                 raise ValueError(
--> 515                     "Unable to create tensor, you should probably activate truncation and/or padding "
    516                     "with 'padding=True' 'truncation=True' to have batched tensors with the same length."
    517                 )

ValueError: Unable to create tensor, you should probably activate truncation and/or padding with 'padding=True' 'truncation=True' to have batched tensors with the same length.

yhongyu commented 3 years ago

Me too. I would like to know how to solve it.

nreimers commented 3 years ago

What are your versions of pytorch, transformers and sentence-transformers?

It might be due to an old, outdated pytorch version.

Edit: See: https://github.com/huggingface/transformers/issues/5593

It appears to be an issue in transformers when used with an outdated pytorch version. The easiest solution is to update pytorch.
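In the traceback above, the ValueError about padding and truncation is raised from inside an `except:` block, so it hides the original AttributeError about `ndim`. The following is a minimal pure-Python sketch of that pattern, not the transformers code itself: `OldTensor`, `convert`, and `root_cause` are hypothetical names used only for illustration.

```python
class OldTensor:
    """Hypothetical stand-in for a tensor type that offers dim() but no ndim attribute."""

    def __init__(self, shape):
        self.shape = shape

    def dim(self):
        return len(self.shape)


def convert(tensor):
    """Mimic the failing code path: a broad except replaces the real error."""
    try:
        if tensor.ndim > 2:  # raises AttributeError on OldTensor
            pass
        return tensor
    except Exception:
        # Raising here records the AttributeError as __context__ of the ValueError.
        raise ValueError("Unable to create tensor")


def root_cause(exc):
    """Follow __cause__ / __context__ back to the first exception in the chain."""
    while exc.__cause__ is not None or exc.__context__ is not None:
        exc = exc.__cause__ or exc.__context__
    return exc


try:
    convert(OldTensor((1, 8)))
except ValueError as err:
    first = root_cause(err)
    print(type(first).__name__, "-", first)
```

Reading the chain from the top, as `root_cause` does, points at the missing attribute rather than at padding or truncation settings.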

thehayat commented 3 years ago

My versions are torch 1.1.0, sentence-transformers 0.3.0, and transformers 3.0.2.

I believe these versions are fine according to this library's documentation. Also, the error is coming from transformers. Do you still suggest updating pytorch?
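The three version numbers asked for in this thread can be collected in one step. The sketch below assumes Python 3.8 or later, where `importlib.metadata` is part of the standard library; the helper name `installed_versions` is made up for this example. On older interpreters, `pip show torch` reports the same information.

```python
import sys
from importlib import metadata  # standard library from Python 3.8


def installed_versions(packages):
    """Map each package name to its installed version string, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


print("python:", sys.version.split()[0])
report = installed_versions(["torch", "transformers", "sentence-transformers"])
for name, version in report.items():
    print(f"{name}: {version or 'not installed'}")
```

Run it with the same interpreter (or notebook kernel) that produced the error, since a different environment may report different versions.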

nreimers commented 3 years ago

Hi @thehayat, torch 1.1.0 is quite outdated. Transformers tested the example with torch 1.3.1+.

I recommend updating to a more recent version of torch, ideally the most recent one.
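To check a torch version against the 1.3.1 minimum mentioned above, comparing version strings directly is unreliable: as strings, '1.10.0' sorts before '1.3.1'. A minimal sketch that compares numeric tuples instead; `parse_version`, `is_recent_enough`, and `MIN_TORCH` are names invented here, and the threshold is simply the figure quoted in this thread, not an official requirement.

```python
import re

MIN_TORCH = (1, 3, 1)  # figure quoted in this thread; an assumption, not an official minimum


def parse_version(version):
    """Turn '1.6.0+cu101' or '1.1.0' into a tuple of ints such as (1, 6, 0)."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    # A missing patch component (e.g. '1.6') is treated as 0.
    return tuple(int(part) if part is not None else 0 for part in match.groups())


def is_recent_enough(version, minimum=MIN_TORCH):
    """True if the version is at least the minimum, compared numerically."""
    return parse_version(version) >= minimum


for candidate in ["1.1.0", "1.3.1", "1.6.0+cu101"]:
    print(candidate, is_recent_enough(candidate))
```

With torch installed, `is_recent_enough(torch.__version__)` gives the answer for the running environment.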

thehayat commented 3 years ago

Hello, could you provide a link to download the torch 1.3.1+ whl file directly to my local machine? I couldn't find one.

nreimers commented 3 years ago

Hi @thehayat Just follow the installation guidelines on the pytorch website: https://pytorch.org/get-started/locally/

I recommend using the most recent version of pytorch.

parthplc commented 3 years ago

I was able to solve it using the stable 1.6.0 version.

pip install torch==1.6.0

Jess0-0 commented 3 years ago

I tried the newest version of PyTorch (1.6.0) but still got the error AttributeError: 'Tensor' object has no attribute 'ndim'
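The thread does not say why the error persisted after upgrading. One possibility, offered only as a guess, is that the notebook kernel runs a different interpreter from the one pip installed into, so an older torch is still the one being imported. A small sketch for checking this from inside the failing session; `locate_module` is a hypothetical helper name.

```python
import importlib.util
import sys


def locate_module(name):
    """Return the file the current interpreter would import `name` from, or None."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


# The interpreter running this code, and where it would load torch from.
print("interpreter:", sys.executable)
print("torch loaded from:", locate_module("torch"))
```

If the printed path sits in a different environment from the one that was upgraded, installing with that interpreter's own pip (for example by running `python -m pip install` with the interpreter shown) is worth trying.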