Closed iknoorjobs closed 3 years ago
Hi @iknoorjobs
1) Inference speed mainly depends on the number of layers. Sadly, I don't know which base model they use (DistilBERT, bert-base, or bert-large), but this has the biggest impact on speed. If you compare models of the same type (distil, base, large), you should get roughly the same inference time.
2) This indicates that the input is not truncated. Some older config files on Hugging Face do not specify the max length for the input. In that case, a text of 535 word pieces is passed to the model, but the model only supports inputs of up to 512 word pieces.
When you install sentence-transformers from source, you can load the model like this:
model = CrossEncoder('model_name', max_length=512)
The max_length parameter will be part of the next release (0.3.9).
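To illustrate what truncating a (query, passage) pair to max_length means, here is a minimal pure-Python sketch. It is not the tokenizer's actual implementation: the helper name truncate_pair, the "trim the longer sequence first" strategy, and the count of three special tokens ([CLS], [SEP], [SEP] for a BERT-style pair) are assumptions made for illustration.

```python
def truncate_pair(tokens_a, tokens_b, max_length=512, num_special_tokens=3):
    """Trim two token lists so that, together with the special tokens,
    they fit into max_length. Hypothetical helper, for illustration only."""
    tokens_a, tokens_b = list(tokens_a), list(tokens_b)  # do not modify the inputs
    budget = max_length - num_special_tokens
    if budget < 0:
        raise ValueError("max_length is smaller than the number of special tokens")
    while len(tokens_a) + len(tokens_b) > budget:
        # Drop the last token of whichever sequence is currently longer
        if len(tokens_a) > len(tokens_b):
            tokens_a.pop()
        else:
            tokens_b.pop()
    return tokens_a, tokens_b


# A pair that would be 535 word pieces long including special tokens
query = ["q"] * 100
passage = ["p"] * 432
short_query, short_passage = truncate_pair(query, passage, max_length=512)
total_length = len(short_query) + len(short_passage) + 3
```

In this sketch only the longer passage is shortened, so the pair ends up at exactly the 512 word pieces the model supports.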
Hi @nreimers
Many thanks for your response.
The previous error is gone, but if I try to load an nboost model for training a cross-encoder, it shows the following error when evaluating the model on the dev set.
Code for loading model:
model = CrossEncoder('nboost/pt-biobert-base-msmarco', max_length=512)
Error during training when evaluating on dev set:
2020-11-17 19:16:55 - CECorrelationEvaluator: Evaluating the model on sts-dev dataset after epoch 0:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-8-a97b57ce0904> in <module>
13
14 # Train the model
---> 15 model.fit(train_dataloader=train_dataloader,
16 evaluator=evaluator,
17 epochs=num_epochs,
~/sentence-transformers/sentence_transformers/cross_encoder/CrossEncoder.py in fit(self, train_dataloader, evaluator, epochs, loss_fct, acitvation_fct, scheduler, warmup_steps, optimizer_class, optimizer_params, weight_decay, evaluation_steps, output_path, save_best_model, max_grad_norm, use_amp, callback)
206
207 if evaluator is not None:
--> 208 self._eval_during_training(evaluator, output_path, save_best_model, epoch, -1, callback)
209
210
~/sentence-transformers/sentence_transformers/cross_encoder/CrossEncoder.py in _eval_during_training(self, evaluator, output_path, save_best_model, epoch, steps, callback)
278 """Runs evaluation during the training"""
279 if evaluator is not None:
--> 280 score = evaluator(self, output_path=output_path, epoch=epoch, steps=steps)
281 if callback is not None:
282 callback(score, epoch, steps)
~/sentence-transformers/sentence_transformers/cross_encoder/evaluation/CECorrelationEvaluator.py in __call__(self, model, output_path, epoch, steps)
43
44
---> 45 eval_pearson, _ = pearsonr(self.scores, pred_scores)
46 eval_spearman, _ = spearmanr(self.scores, pred_scores)
47
~/anaconda3/envs/sbert/lib/python3.8/site-packages/scipy/stats/stats.py in pearsonr(x, y)
3854 return dtype(np.sign(x[1] - x[0])*np.sign(y[1] - y[0])), 1.0
3855
-> 3856 xmean = x.mean(dtype=dtype)
3857 ymean = y.mean(dtype=dtype)
3858
~/anaconda3/envs/sbert/lib/python3.8/site-packages/numpy/core/_methods.py in _mean(a, axis, dtype, out, keepdims)
149 is_float16_result = True
150
--> 151 ret = umr_sum(arr, axis, dtype, out, keepdims)
152 if isinstance(ret, mu.ndarray):
153 ret = um.true_divide(
TypeError: No loop matching the specified signature and casting was found for ufunc add
Hi @iknoorjobs
nboost/pt-biobert-base-msmarco is outputting two scores, first score for "not_relevant", the second for "relevant". Sadly this is not compatible with CECorrelationEvaluator. CECorrelationEvaluator expects that the model outputs only a single score and compares the single score with the gold score using Spearman rank correlation.
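One possible way to collapse two outputs into a single score, assuming the two values are raw logits in the order [not_relevant, relevant], is to apply a softmax and keep the probability of the "relevant" class. The sketch below is pure Python with made-up logits; the helper names are hypothetical, and whether such scores are useful with CECorrelationEvaluator for this particular model has not been verified.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    largest = max(logits)
    exps = [math.exp(value - largest) for value in logits]
    total = sum(exps)
    return [value / total for value in exps]


def relevance_score(two_logits, relevant_index=1):
    """Collapse [not_relevant, relevant] logits into one score between 0 and 1.
    Hypothetical helper; the label order is an assumption."""
    return softmax(two_logits)[relevant_index]


# Made-up logits for three (query, passage) pairs
example_logits = [[2.0, -1.0], [0.0, 0.0], [-3.0, 4.0]]
single_scores = [relevance_score(pair) for pair in example_logits]
```

Raw logits generally do not sum to one; the softmax step is what normalizes them into probabilities.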
Hi @nreimers Ah, ok. Thanks. I also checked the output of the model to see if it can be converted to a single output, but the two outputs are very different and don't add up to one.
Closing this now. Thanks.
Hi @nreimers
Is it possible to train the nboost or other passage reranking models (which give two scores, "not_relevant" and "relevant") using the latest cross-encoder scripts with CERerankingEvaluator? I have a dataset in the format given below (score from 0 to 1), and I want to fine-tune these passage reranking models on this dataset.
["Query", "passage", score]
Many thanks.
Hi @iknoorjobs Yes, you can find an example here: https://github.com/UKPLab/sentence-transformers/blob/master/examples/training/ms_marco/train_cross-encoder.py
It is based on the MS Marco dataset, where you have relevant passages [query, passage, 1] and irrelevant passages [query, passage, 0].
The score could also be somewhere between 0 and 1
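As a rough sketch of how rows in the [query, passage, score] format can be prepared for training, the following pure-Python helper splits each row into a text pair and a float label. The helper name rows_to_samples and the example rows are made up for illustration; in sentence-transformers, each pair would typically be wrapped in an InputExample, as noted in the comment.

```python
def rows_to_samples(rows):
    """Turn [query, passage, score] rows into (texts, label) pairs.
    Hypothetical helper, for illustration only."""
    samples = []
    for query, passage, score in rows:
        label = float(score)
        if not 0.0 <= label <= 1.0:
            raise ValueError(f"score {score!r} is outside the range 0 to 1")
        # With sentence-transformers, this pair would typically become
        # InputExample(texts=[query, passage], label=label)
        samples.append(([query, passage], label))
    return samples


# Made-up rows: relevant, irrelevant, and partially relevant
rows = [
    ["what is bert", "BERT is a transformer-based language model.", 1],
    ["what is bert", "The weather was sunny on Tuesday.", 0],
    ["what is bert", "Transformers process tokens in parallel.", 0.5],
]
samples = rows_to_samples(rows)
```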
Hi @nreimers
Thank you for your response. Still, if I try to load the nboost model, it shows the following error.
model = CrossEncoder("nboost/pt-bert-base-uncased-msmarco", num_labels=1, max_length=512)
RuntimeError: Error(s) in loading state_dict for BertForSequenceClassification:
size mismatch for classifier.weight: copying a param with shape torch.Size([2, 768]) from checkpoint, the shape in current model is torch.Size([1, 768]).
size mismatch for classifier.bias: copying a param with shape torch.Size([2]) from checkpoint, the shape in current model is torch.Size([1]).
When I load this model with num_labels changed to 2, it works. But after training for 1 epoch on the above data format (["Query", "passage", score]), it shows an error during evaluation, as we only have a single label in the data. Does it make sense to convert the data to the format ["Query", "passage", 1-score, score], because now we have 2 labels for "not_relevant" and "relevant"?
Thanks
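The size mismatch in the error above follows from how a linear classification head is shaped: its weight has one row per label and its bias has one entry per label. The sketch below reproduces the comparison in pure Python; the helper names are hypothetical, and the hidden size of 768 is taken from the error message.

```python
def classifier_shapes(num_labels, hidden_size=768):
    """Shapes of the weight and bias of a linear classification head
    that maps hidden_size features to num_labels outputs."""
    return {"weight": (num_labels, hidden_size), "bias": (num_labels,)}


def find_mismatches(checkpoint_shapes, model_shapes):
    """Return the parameter names whose shapes differ."""
    return [name for name in checkpoint_shapes
            if checkpoint_shapes[name] != model_shapes.get(name)]


checkpoint = classifier_shapes(num_labels=2)  # head stored in the checkpoint
requested = classifier_shapes(num_labels=1)   # head built for num_labels=1
mismatches = find_mismatches(checkpoint, requested)
```

Both the weight and the bias differ, which matches the two "size mismatch" lines in the error; with num_labels=2 the shapes agree and loading succeeds.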
Hi @iknoorjobs The Nboost models have the issue that they use 2 labels (relevant and not relevant). If you want to use this model as base, you need binary labels, i.e. int(0) and int(1) as labels.
I will release today improved cross-encoder models for MS Marco that 1) Are quicker than the nboost models 2) Achieve a better performance on MS Marco & TREC DL 2019 dataset 3) And only use a single output to indicate if query and passage are relevant
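For the two-label case described above, graded scores would first have to be turned into integer labels. A minimal sketch follows, assuming a simple cut-off is acceptable for the data; the helper name to_binary_label and the threshold of 0.5 are arbitrary assumptions, not something the thread prescribes.

```python
def to_binary_label(score, threshold=0.5):
    """Map a graded score in [0, 1] to int(0) or int(1).
    The threshold of 0.5 is an arbitrary assumption."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score {score!r} is outside the range 0 to 1")
    return int(score >= threshold)


# Made-up graded scores
graded_scores = [0.0, 0.2, 0.5, 0.8, 1.0]
binary_labels = [to_binary_label(score) for score in graded_scores]
```

Thresholding discards the fine-grained information in the scores, which is one reason a single-output model may be the better fit for graded data.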
I will release today improved cross-encoder models for MS Marco that
@nreimers Fantastic news! Very much looking forward to the models today.
If you want to use this model as base, you need binary labels, i.e. int(0) and int(1) as labels.
Can your cross encoder training scripts be used to train the model if I have dataset with binary labels?
Many thanks
@iknoorjobs The models are now online: https://github.com/UKPLab/sentence-transformers/tree/master/examples/applications/information-retrieval
Can your cross encoder training scripts be used to train the model if I have dataset with binary labels?
Yes. The MS Marco dataset had only binary labels (relevant or not relevant), which were encoded as 1 and 0.
It will also work if you have more fine-grained labels, like 0, 0.5, 0.8 and 1
@nreimers Thanks a lot. And I must say, your work is great. Also, looking forward to the paper.
Hi
I fine-tuned a cross-encoder model based on one of the Hugging Face models (link) on the STS dataset using your training script. I loaded the model using the command below, and it shows the following warning.
model = CrossEncoder('lordtt13/COVID-SciBERT', num_labels=1)
Now, when I use the model after training: 1) It is comparatively slow at inference time compared to the cross-encoder models provided by sentence-transformers. 2) It gives the following error for some longer input pairs.
RuntimeError: The size of tensor a (535) must match the size of tensor b (512) at non-singleton dimension 1
Could you please tell me why this is happening, or whether I am missing something?
Many thanks, Iknoor