UKPLab / sentence-transformers

Multilingual Sentence & Image Embeddings with BERT
https://www.SBERT.net
Apache License 2.0

Question about TypeError: 'InputExample' object is not subscriptable #718

Closed JinzhuLuo closed 3 years ago

JinzhuLuo commented 3 years ago

Hi guys. I just tried to follow training_stsbenchmark_continue_training.py to build my own model, but both my script and the unmodified example show the error below. I did not make any changes to the training_stsbenchmark_continue_training.py file. Any idea how to fix it? Thanks!

The error message:

2021-01-26 22:05:48 - Load SentenceTransformer from folder: /home/jinzhu/.cache/torch/sentence_transformers/sbert.net_models_bert-base-nli-mean-tokens
2021-01-26 22:05:50 - Use pytorch device: cuda
2021-01-26 22:05:50 - Read STSbenchmark train dataset
2021-01-26 22:05:50 - Read STSbenchmark dev dataset
2021-01-26 22:05:50 - Warmup-steps: 144
Epoch:   0%|          | 0/4 [00:00<?, ?it/s]
Iteration:   0%|          | 0/360 [00:00<?, ?it/s]
Epoch:   0%|          | 0/4 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/home/jinzhu/PycharmProjects/Sentence/batch_test.py", line 85, in <module>
    output_path=model_save_path)
  File "/home/jinzhu/anaconda3/envs/deep/lib/python3.7/site-packages/sentence_transformers/SentenceTransformer.py", line 543, in fit
    data = next(data_iterator)
  File "/home/jinzhu/anaconda3/envs/deep/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 345, in __next__
    data = self._next_data()
  File "/home/jinzhu/anaconda3/envs/deep/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 385, in _next_data
    data = self._dataset_fetcher.fetch(index)  # may raise StopIteration
  File "/home/jinzhu/anaconda3/envs/deep/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 47, in fetch
    return self.collate_fn(data)
  File "/home/jinzhu/anaconda3/envs/deep/lib/python3.7/site-packages/sentence_transformers/SentenceTransformer.py", line 365, in smart_batching_collate
    num_texts = len(batch[0][0])
TypeError: 'InputExample' object is not subscriptable

nreimers commented 3 years ago

Which version of sentence transformers have you installed?
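For reference, one quick way to check the installed version from Python:

import sentence_transformers
print(sentence_transformers.__version__)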

JinzhuLuo commented 3 years ago

sentence-transformers 0.3.6

nreimers commented 3 years ago

Version 0.4.1 had major changes to how data is loaded for training: https://github.com/UKPLab/sentence-transformers/releases

Update to the most recent version; then the examples will work.
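For reference, since 0.4.1 the examples pass a list of InputExample objects straight to a standard PyTorch DataLoader, and model.fit installs its own smart batching collate function. A minimal sketch following the current training docs (the sentence pairs and labels here are placeholders):

from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, InputExample, losses

model = SentenceTransformer('bert-base-nli-mean-tokens')

# Placeholder training pairs with similarity labels in [0, 1]
train_examples = [
    InputExample(texts=['My first sentence', 'My second sentence'], label=0.8),
    InputExample(texts=['Another pair', 'Unrelated sentence'], label=0.3),
]

# Since 0.4.1, the InputExamples go directly into a DataLoader;
# model.fit sets the collate function itself.
train_dataloader = DataLoader(train_examples, shuffle=True, batch_size=16)
train_loss = losses.CosineSimilarityLoss(model)

model.fit(train_objectives=[(train_dataloader, train_loss)], epochs=1, warmup_steps=100)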

JinzhuLuo commented 3 years ago

Thank you! It solved my problem.

shainaraza commented 2 years ago

I found that the error is caused by the batch size; let me explain:

data = dataframe['title'].sample(5000).values.tolist()

from sentence_transformers import SentenceTransformer
model = SentenceTransformer('distilbert-base-nli-mean-tokens')
embeddings = model.encode(data, show_progress_bar=True, batch_size=25)

If I keep the default batch size, i.e. 32, it gives me a float error. I think this should be fixed within the library.
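A note on the above: the batch size is probably not the root cause. If the 'title' column contains NaN values, .tolist() yields Python floats, and model.encode() then fails on the non-string items; changing batch_size only changes which batch happens to hit one. A minimal sketch of a more robust preparation step, assuming dataframe is a pandas DataFrame as above:

# Assumption: NaN titles become Python floats after .tolist().
# Drop them and coerce the rest to str before encoding.
data = dataframe['title'].dropna().astype(str).sample(5000).values.tolist()
embeddings = model.encode(data, show_progress_bar=True)  # default batch_size=32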

rhajou commented 1 year ago

It worked with batch_size=25, thank you! It is weird that it gives the error on some of the sample data and not all of it.
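If the failure only shows up for some samples, a quick check on the same data list as above (an assumption, not part of the original comment) can confirm whether non-string entries are the trigger:

non_strings = [x for x in data if not isinstance(x, str)]
print(len(non_strings), non_strings[:5])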

vjagannath786 commented 1 year ago

Still facing the issue. Is this fixed?