Closed JinzhuLuo closed 3 years ago
Which version of sentence transformers have you installed?
sentence-transformers 0.3.6
Version 0.4.1. had major changes on how to load data for training: https://github.com/UKPLab/sentence-transformers/releases
Update to the most recent version. Then the examples will work
Thank you! It solved my problem.
I found that the error is related to the batch size; let me explain:

```python
from sentence_transformers import SentenceTransformer

data = dataframe['title'].sample(5000).values.tolist()

model = SentenceTransformer('distilbert-base-nli-mean-tokens')
embeddings = model.encode(data, show_progress_bar=True, batch_size=25)
```

If I keep the default batch size (32), it gives me a float error. I think this should be fixed within the library.
It worked with batch_size=25, thank you! It is strange that it gives an error on some samples of the data and not on all of them.
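A hedged guess, since the error appears only for some samples: when a pandas column contains missing values, `.tolist()` produces `float('nan')` entries mixed in with the strings, and passing a float to the tokenizer fails. Whether that happens then depends on which rows the random `.sample(5000)` picked, not on the batch size itself. I can't confirm this is the cause here, but it is easy to rule out by filtering non-strings before encoding (the `data` list below is a made-up illustration):

```python
# Illustrative data: one NaN hiding among the titles,
# as produced by .tolist() on a column with missing values.
data = ['a good title', float('nan'), 'another title']

# Keep only real strings before calling model.encode(...).
clean = [t for t in data if isinstance(t, str)]
print(clean)  # ['a good title', 'another title']
```

Equivalently, `dataframe['title'].dropna()` before sampling would avoid the NaN entries at the source.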
Still facing the issue. Has this been fixed?
Hi guys. I just tried to follow training_stsbenchmark_continue_training.py to build my own model, but both my model and these examples show the error. I did not make any changes to the training_stsbenchmark_continue_training.py file. Any idea how to fix it? Thanks!

The error message:

```
2021-01-26 22:05:48 - Load SentenceTransformer from folder: /home/jinzhu/.cache/torch/sentence_transformers/sbert.net_models_bert-base-nli-mean-tokens
2021-01-26 22:05:50 - Use pytorch device: cuda
2021-01-26 22:05:50 - Read STSbenchmark train dataset
2021-01-26 22:05:50 - Read STSbenchmark dev dataset
2021-01-26 22:05:50 - Warmup-steps: 144
Epoch:   0%|          | 0/4 [00:00<?, ?it/s]
Iteration:   0%|          | 0/360 [00:00<?, ?it/s]
Epoch:   0%|          | 0/4 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/home/jinzhu/PycharmProjects/Sentence/batch_test.py", line 85, in
    output_path=model_save_path)
  File "/home/jinzhu/anaconda3/envs/deep/lib/python3.7/site-packages/sentence_transformers/SentenceTransformer.py", line 543, in fit
    data = next(data_iterator)
  File "/home/jinzhu/anaconda3/envs/deep/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 345, in __next__
    data = self._next_data()
  File "/home/jinzhu/anaconda3/envs/deep/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 385, in _next_data
    data = self._dataset_fetcher.fetch(index)  # may raise StopIteration
  File "/home/jinzhu/anaconda3/envs/deep/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 47, in fetch
    return self.collate_fn(data)
  File "/home/jinzhu/anaconda3/envs/deep/lib/python3.7/site-packages/sentence_transformers/SentenceTransformer.py", line 365, in smart_batching_collate
    num_texts = len(batch[0][0])
TypeError: 'InputExample' object is not subscriptable
```
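The traceback itself hints at the root cause: `smart_batching_collate` is indexing the batch with `batch[0][0]`, but the batch contains `InputExample` objects, whose texts and label are exposed as attributes rather than by subscripting. That is exactly the kind of mismatch you get when the installed sentence-transformers version and the example script come from different sides of the 0.4.1 data-loading change, which is why upgrading fixes it. A minimal sketch of the mismatch, using a hypothetical stand-in class rather than the real library:

```python
class InputExample:
    """Hypothetical stand-in for sentence_transformers.InputExample
    (the real class also stores a guid and defaults the label)."""
    def __init__(self, texts, label):
        self.texts = texts
        self.label = label

batch = [InputExample(texts=['sent a', 'sent b'], label=0.9)]

# Old-style collate code subscripts the batch entry, which fails here:
try:
    num_texts = len(batch[0][0])
except TypeError as e:
    print(e)  # 'InputExample' object is not subscriptable

# New-style code reads the attribute instead:
num_texts = len(batch[0].texts)
print(num_texts)  # 2
```

So the fix is to make the installed library version match the examples you are following, i.e. upgrade to the current release as suggested above.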