UKPLab / sentence-transformers

State-of-the-Art Text Embeddings
https://www.sbert.net
Apache License 2.0

Using pretrained SBERT model in cross-encoder #726

Open datistiquo opened 3 years ago

datistiquo commented 3 years ago

Hey,

it seems straightforwardly possible to use some of the pretrained SBERT models with the cross-encoders? It seems that they all have a BertForSequenceClassification model available which can be used in the cross-encoder?

So I just tried out of curiosity:

from sentence_transformers import CrossEncoder

model = CrossEncoder('T-Systems-onsite/cross-en-de-roberta-sentence-transformer', num_labels=1, max_length=60)
....
model.fit(train_dataloader=train_dataloader,
          evaluator=seq_evaluator,
          epochs=num_epochs,
          # scheduler="constantlr",
          # optimizer_class=torch.optim.Adam,
          # optimizer_params={'lr': 3e-5},  # , 'eps': 1e-6, 'correct_bias': False},
          # evaluation_steps=10,
          warmup_steps=warmup_steps,
          # output_path=model_save_path
          )

and it seems to work, and it even gives some decent results without fine-tuning?

So would it be a good idea to fine-tune an SBERT model on a cross-encoder task?

nreimers commented 3 years ago

Yes, the SBERT models are regular transformer models and hence can be used as a base for cross-encoders.

Sometimes it can be helpful; otherwise it is better to use the original models. There is no clear pattern: sometimes it helps, sometimes not.
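For a quick comparison, one possible minimal sketch (the plain base checkpoint name and the toy training pairs below are only placeholders): initialize the CrossEncoder once from a plain pretrained checkpoint and once from the SBERT checkpoint used above, train briefly on the same data, and compare the scores.

from torch.utils.data import DataLoader
from sentence_transformers import CrossEncoder, InputExample

# Tiny toy training set; a real experiment needs a proper pair-labelled dataset.
train_samples = [
    InputExample(texts=["How do I reset my password?", "Steps to change your account password"], label=1.0),
    InputExample(texts=["How do I reset my password?", "Weather forecast for tomorrow"], label=0.0),
]
train_dataloader = DataLoader(train_samples, shuffle=True, batch_size=2)

# Same CrossEncoder setup, once with a plain pretrained checkpoint
# and once with the SBERT checkpoint as base.
for base in ["xlm-roberta-base",
             "T-Systems-onsite/cross-en-de-roberta-sentence-transformer"]:
    model = CrossEncoder(base, num_labels=1, max_length=60)
    model.fit(train_dataloader=train_dataloader, epochs=1, warmup_steps=10)
    print(base, model.predict([["How do I reset my password?", "Steps to change your account password"]]))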

datistiquo commented 3 years ago

@nreimers Thank you!

Is the syntax above right for using a fine-tuned SBERT model in a cross-encoder? I saw that it then downloads the proper BertForSequenceClassification model from Hugging Face, instead of a plain BERT model (as used for the Sentence Transformer?). So it seems that any SBERT model on Hugging Face "automatically" has a corresponding cross-encoder (BertForSequenceClassification) model?

nreimers commented 3 years ago

Correct
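A minimal sketch of this mechanism, assuming CrossEncoder builds on Hugging Face's Auto classes (the exact internals may differ between versions): any encoder checkpoint on the Hub can be loaded as a sequence-classification model, with the encoder weights taken from the SBERT checkpoint and a freshly initialized classification head on top, which is why that head still needs fine-tuning before the scores mean anything.

from transformers import AutoConfig, AutoModelForSequenceClassification, AutoTokenizer

model_name = "T-Systems-onsite/cross-en-de-roberta-sentence-transformer"

config = AutoConfig.from_pretrained(model_name)
config.num_labels = 1  # one output score, as with CrossEncoder(..., num_labels=1)

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, config=config)
# The encoder weights come from the SBERT checkpoint; the classification head on top
# is newly initialized (transformers warns about this) and has to be trained.

inputs = tokenizer("How do I reset my password?",
                   "Steps to change your account password",
                   return_tensors="pt", truncation=True, max_length=60)
score = model(**inputs).logits  # shape (1, 1); head is untrained, so not meaningful yet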