UKPLab / sentence-transformers

State-of-the-Art Text Embeddings
https://www.sbert.net
Apache License 2.0

Best way to do Domain Adaptation #804

Open lematmat opened 3 years ago

lematmat commented 3 years ago

Hi,

As I don't have any labeled dataset, I'm wondering what the best way is to adapt the models trained on NLI and Quora to my application domain (legal text).

nreimers commented 3 years ago

If you have a sufficient number of legal documents, you can continue pre-training BERT on them using masked language modeling (MLM).
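A minimal sketch of continued MLM pre-training with the Hugging Face `transformers` and `datasets` libraries. The helper names `clean_passages` and `continue_mlm_pretraining`, the base checkpoint `bert-base-uncased`, the output directory `bert-legal-mlm`, and all hyperparameters are assumptions for illustration, not settings recommended in this thread.

```python
def clean_passages(lines, min_chars=20):
    """Strip whitespace and drop blank or very short lines (hypothetical helper)."""
    return [s for s in (ln.strip() for ln in lines) if len(s) >= min_chars]


def continue_mlm_pretraining(passages, base_model="bert-base-uncased",
                             output_dir="bert-legal-mlm", epochs=1):
    """Continue masked-language-model training of `base_model` on `passages`."""
    # Heavy dependencies are imported here so clean_passages stays lightweight.
    from datasets import Dataset
    from transformers import (AutoModelForMaskedLM, AutoTokenizer,
                              DataCollatorForLanguageModeling, Trainer,
                              TrainingArguments)

    tokenizer = AutoTokenizer.from_pretrained(base_model)
    model = AutoModelForMaskedLM.from_pretrained(base_model)

    # Tokenize the raw text; max_length=256 is an assumed value.
    dataset = Dataset.from_dict({"text": passages})
    tokenized = dataset.map(
        lambda batch: tokenizer(batch["text"], truncation=True, max_length=256),
        batched=True,
        remove_columns=["text"],
    )

    # The collator randomly masks 15% of the tokens in every batch.
    collator = DataCollatorForLanguageModeling(
        tokenizer=tokenizer, mlm=True, mlm_probability=0.15
    )

    args = TrainingArguments(
        output_dir=output_dir,
        num_train_epochs=epochs,
        per_device_train_batch_size=16,
    )
    trainer = Trainer(model=model, args=args,
                      train_dataset=tokenized, data_collator=collator)
    trainer.train()

    # Save model and tokenizer together so the directory can be reloaded later.
    trainer.save_model(output_dir)
    tokenizer.save_pretrained(output_dir)
    return output_dir
```

With the legal documents split into one passage per line, calling `continue_mlm_pretraining(clean_passages(lines))` would write the adapted checkpoint to the output directory.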

Then, you can fine-tune this model on the labeled data you have (or, for example, on Quora data if your task is similar).
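A minimal sketch of the fine-tuning step with sentence-transformers, assuming Quora-style rows of `(question1, question2, is_duplicate)`. The helper names, the checkpoint directory `bert-legal-mlm` (a hypothetical folder holding a domain-adapted BERT), the output directory `sbert-legal`, and the choice of `MultipleNegativesRankingLoss` are assumptions for illustration. Other losses may suit your task better.

```python
def duplicate_pairs(rows):
    """Keep only the question pairs labeled as duplicates (positive pairs)."""
    return [(q1, q2) for q1, q2, is_dup in rows if int(is_dup) == 1]


def fine_tune_on_pairs(pairs, mlm_model_dir="bert-legal-mlm",
                       output_dir="sbert-legal", epochs=1, batch_size=32):
    """Fine-tune a sentence embedding model on positive text pairs."""
    # Heavy dependencies are imported here so duplicate_pairs stays lightweight.
    from torch.utils.data import DataLoader
    from sentence_transformers import (InputExample, SentenceTransformer,
                                       losses, models)

    # Wrap the adapted BERT checkpoint with mean pooling to get sentence vectors.
    word_embedding = models.Transformer(mlm_model_dir, max_seq_length=256)
    pooling = models.Pooling(
        word_embedding.get_word_embedding_dimension(), pooling_mode="mean"
    )
    model = SentenceTransformer(modules=[word_embedding, pooling])

    # Each example is a positive pair; the other pairs in the batch
    # serve as negatives for MultipleNegativesRankingLoss.
    examples = [InputExample(texts=[a, b]) for a, b in pairs]
    loader = DataLoader(examples, shuffle=True, batch_size=batch_size)
    loss = losses.MultipleNegativesRankingLoss(model)

    model.fit(train_objectives=[(loader, loss)], epochs=epochs,
              warmup_steps=100, output_path=output_dir)
    return model
```

Because this loss treats the other pairs in a batch as negatives, larger batch sizes generally give a stronger training signal.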

lematmat commented 3 years ago

OK, thank you very much, nreimers.