ajinkya2903 opened 3 years ago
+1 Same issue/error when calling the evaluator on a dataset where each entry is a single text sentence.
cemodel = CrossEncoder(my_pretrained_sentenceTransformerModel_path, num_labels=1, device="cuda")
evaluator = CESoftmaxAccuracyEvaluator(sentence_pairs=[x[0] for x in X_test], labels=y_test, write_csv=True)
cemodel.fit(train_dataloader=train_dataloader,
            evaluator=evaluator,
            epochs=num_epochs,
            warmup_steps=warmup_steps,
            output_path=model_save_path,
            show_progress_bar=True,
            use_amp=True)
TypeError Traceback (most recent call last)
<ipython-input-42-993535beba66> in <module>()
----> 1 evaluator(cemodel)
5 frames
/usr/local/lib/python3.7/dist-packages/sentence_transformers/cross_encoder/CrossEncoder.py in smart_batching_collate_text_only(self, batch)
93 texts[idx].append(text.strip())
94
---> 95 tokenized = self.tokenizer(*texts, padding=True, truncation='longest_first', return_tensors="pt", max_length=self.max_length)
96
97 for name in tokenized:
TypeError: __call__() got multiple values for argument 'padding'
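For what it's worth, the error seems to come from how the collate function unpacks the batch: in the traceback above, smart_batching_collate_text_only builds one list per "column" of each example and then calls self.tokenizer(*texts, ...). If each element of sentence_pairs is a plain string instead of a list, iterating over it yields individual characters, so far more than two positional arguments get unpacked into the tokenizer call and one of them collides with the padding keyword. A minimal sketch of that mechanism (the function body below is a simplified reconstruction for illustration, not the exact library code):

# Simplified reconstruction of smart_batching_collate_text_only to show the failure mode.
# The real implementation lives in sentence_transformers/cross_encoder/CrossEncoder.py.
def collate_text_only(batch):
    # One bucket per "column" of the first example.
    # If the example is the string "hello", len(batch[0]) == 5, not 1 or 2.
    texts = [[] for _ in range(len(batch[0]))]
    for example in batch:
        for idx, text in enumerate(example):  # iterating a string yields characters
            texts[idx].append(text.strip())
    return texts

bad_batch = ["hello", "world"]         # shape produced by sentence_pairs=[x[0] for x in X_test]
good_batch = [["hello"], ["world"]]    # shape produced by sentence_pairs=[[x[0]] for x in X_test]

print(len(collate_text_only(bad_batch)))   # 5 lists -> tokenizer(*texts, padding=...) gets padding twice
print(len(collate_text_only(good_batch)))  # 1 list  -> tokenizer receives a single batch of texts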
EDIT: I can confirm that the same issue happens when loading a "default" SentenceTransformer model:
cemodel = CrossEncoder("sentence-transformers/all-MiniLM-L6-v2", num_labels=1, device="cuda")
sentence_pairs must be a list of lists containing the sentences, i.e., the following should fix it:
evaluator = CESoftmaxAccuracyEvaluator(sentence_pairs=[[x[0]] for x in X_test], labels=y_test, write_csv=True)
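To make the expected shape explicit: for an actual pair-classification task, each entry of sentence_pairs should itself be a [sentence_a, sentence_b] list, with one label per pair. A small illustrative example (the toy sentences, labels, and num_labels=2 are placeholders, not taken from this issue):

from sentence_transformers import CrossEncoder
from sentence_transformers.cross_encoder.evaluation import CESoftmaxAccuracyEvaluator

# Hypothetical toy data: each entry in sentence_pairs is the list of texts
# fed to the model for that example (two sentences for a pair task).
sentence_pairs = [
    ["A man is eating food.", "A man is eating a piece of bread."],
    ["A man is eating food.", "The girl is carrying a baby."],
]
labels = [1, 0]  # one class index per pair

model = CrossEncoder("sentence-transformers/all-MiniLM-L6-v2", num_labels=2)
evaluator = CESoftmaxAccuracyEvaluator(sentence_pairs, labels, name="dev", write_csv=True)
accuracy = evaluator(model)  # returns the accuracy as a float
print(accuracy)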
@nreimers While training the cross encoder I am getting this error. Training completes, but the error pops up as soon as evaluation starts. What is the solution to this?