pdrm83 / sent2vec

How to encode sentences in a high-dimensional vector space, a.k.a., sentence embedding.
MIT License

RuntimeError: Expected tensor for argument #1 'indices' to have scalar type Long; but got torch.IntTensor instead (while checking arguments for embedding) #1

Open lukemao opened 3 years ago

lukemao commented 3 years ago

Thanks for developing sent2vec

I just installed it.

I was trying to run the example and got the following error:

vectors = vectorizer.sent2vec_bert()
HBox(children=(FloatProgress(value=0.0, description='Downloading', max=231508.0, style=ProgressStyle(descripti…

HBox(children=(FloatProgress(value=0.0, description='Downloading', max=442.0, style=ProgressStyle(description_…

HBox(children=(FloatProgress(value=0.0, description='Downloading', max=267967963.0, style=ProgressStyle(descri…

Traceback (most recent call last):

  File "<ipython-input-7-d7f140e4a0b3>", line 1, in <module>
    vectors = vectorizer.sent2vec_bert()

  File "C:\Users\User\anaconda3\envs\py376\lib\site-packages\sent2vec\vectorizer.py", line 34, in sent2vec_bert
    last_hidden_states = model(input_ids)

  File "C:\Users\User\anaconda3\envs\py376\lib\site-packages\torch\nn\modules\module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)

  File "C:\Users\User\anaconda3\envs\py376\lib\site-packages\transformers\modeling_distilbert.py", line 463, in forward
    inputs_embeds = self.embeddings(input_ids)  # (bs, seq_length, dim)

  File "C:\Users\User\anaconda3\envs\py376\lib\site-packages\torch\nn\modules\module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)

  File "C:\Users\User\anaconda3\envs\py376\lib\site-packages\transformers\modeling_distilbert.py", line 93, in forward
    word_embeddings = self.word_embeddings(input_ids)  # (bs, max_seq_length, dim)

  File "C:\Users\User\anaconda3\envs\py376\lib\site-packages\torch\nn\modules\module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)

  File "C:\Users\User\anaconda3\envs\py376\lib\site-packages\torch\nn\modules\sparse.py", line 114, in forward
    self.norm_type, self.scale_grad_by_freq, self.sparse)

  File "C:\Users\User\anaconda3\envs\py376\lib\site-packages\torch\nn\functional.py", line 1484, in embedding
    return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)

RuntimeError: Expected tensor for argument #1 'indices' to have scalar type Long; but got torch.IntTensor instead (while checking arguments for embedding)

pdrm83 commented 3 years ago

Can you specify your environment? CPU or GPU, Python 3 or 2, Windows or Linux? That would help me reproduce the problem.

lukemao commented 3 years ago

Hope this helps:

The code runs on CPU

sklearn.show_versions()

System:
    python: 3.7.6 (default, Jan  8 2020, 20:23:39) [MSC v.1916 64 bit (AMD64)]
executable: C:\Users\User\anaconda3\envs\py376\pythonw.exe
   machine: Windows-10-10.0.19041-SP0

Python dependencies:
       pip: 20.0.2
setuptools: 45.2.0.post20200210
   sklearn: 0.22.1
     numpy: 1.18.1
     scipy: 1.4.1
    Cython: 0.29.14
    pandas: 1.0.1
matplotlib: 3.1.3
    joblib: 0.14.1

Built with OpenMP: True