UKPLab / sentence-transformers

State-of-the-Art Text Embeddings
https://www.sbert.net
Apache License 2.0

crossencoder with one sentence input for text classification #1272

Open ironllamagirl opened 2 years ago

ironllamagirl commented 2 years ago

I have a text classification problem where I need to classify text into one of 4 categories. I would like to use SBERT, but I read that the CrossEncoder only takes sentence pairs as input.

How do I go about doing this? Is there any example code? Thank you so much.

timpal0l commented 2 years ago

Since this is a classification problem rather than a semantic search problem, you could just use a pretrained BERT model as a bi-encoder.
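
A minimal sketch of what this suggestion might look like, assuming the Hugging Face `transformers` library with a sequence-classification head on a pretrained BERT checkpoint. The checkpoint name `bert-base-uncased`, the `classify` function, and the `softmax` helper are illustrative choices, not something specified in this thread. The classification head is randomly initialised, so the probabilities are meaningless until the model is fine-tuned on labelled data.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Turn raw logits into probabilities (max is subtracted for numerical stability)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(texts: Sequence[str],
             model_name: str = "bert-base-uncased",  # assumed checkpoint
             num_labels: int = 4) -> List[List[float]]:
    """Return one probability vector per input text (hypothetical helper)."""
    # Imported lazily: requires `pip install torch transformers`.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSequenceClassification.from_pretrained(
        model_name, num_labels=num_labels
    )
    model.eval()

    # Each text is encoded on its own; no sentence pair is involved.
    enc = tokenizer(list(texts), padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        logits = model(**enc).logits  # shape: (len(texts), num_labels)
    return [softmax(row) for row in logits.tolist()]
```

Calling `classify(["some text"])` downloads the checkpoint on first use and returns a list with one 4-element probability vector.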

nreimers commented 2 years ago

You can also use the CrossEncoder for individual texts: pass a list of individual sentences instead of sentence pairs.
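
A rough sketch of how single-sentence classification with `CrossEncoder` might look, assuming a 2.x-era release of sentence-transformers (the `fit` training loop with `InputExample`). The base model `distilroberta-base`, the category names, and the helper functions are hypothetical. In releases from around that time, `predict` appears to treat a flat list of strings as one sentence pair, so the sketch wraps each sentence in its own one-element list; that is an assumption about library behaviour, so check it against your installed version.

```python
from typing import List, Sequence

LABELS = ["sports", "politics", "tech", "health"]  # hypothetical category names


def to_single_inputs(sentences: Sequence[str]) -> List[List[str]]:
    """Wrap each sentence in a one-element list so it is one input, not half of a pair."""
    return [[s] for s in sentences]


def argmax_labels(scores: Sequence[Sequence[float]],
                  labels: Sequence[str] = LABELS) -> List[str]:
    """Pick the highest-scoring category for each row of scores."""
    return [labels[max(range(len(row)), key=row.__getitem__)] for row in scores]


def train_and_predict(train_texts: Sequence[str],
                      train_label_ids: Sequence[int],
                      test_texts: Sequence[str]) -> List[str]:
    # Imported lazily: requires `pip install sentence-transformers`.
    from torch.utils.data import DataLoader
    from sentence_transformers import InputExample
    from sentence_transformers.cross_encoder import CrossEncoder

    # num_labels > 1 gives a multi-class head with one score per category.
    model = CrossEncoder("distilroberta-base", num_labels=len(LABELS))

    # One text per example instead of a pair; labels are integer class ids.
    examples = [InputExample(texts=[t], label=y)
                for t, y in zip(train_texts, train_label_ids)]
    loader = DataLoader(examples, shuffle=True, batch_size=16)
    model.fit(train_dataloader=loader, epochs=1, warmup_steps=0)

    scores = model.predict(to_single_inputs(test_texts))  # one row per sentence
    return argmax_labels(scores.tolist())
```

`train_and_predict` expects integer label ids in the range 0 to 3 that index into `LABELS`.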

ironllamagirl commented 2 years ago

@nreimers Oh, I did not know that. I will try it out. Thanks!!

ddofer commented 2 years ago

I'll note that the examples and documentation do not make it clear that one can use just a single input sentence. (Also, I'm getting a tokenizer/padding error when using the CE + softmax evaluator, but that's a separate issue.)