UKPLab / sentence-transformers

State-of-the-Art Text Embeddings
https://www.sbert.net
Apache License 2.0
14.83k stars · 2.44k forks

Pre-trained model for XLNet (especially for semantic search) #210

Open shivprasad94 opened 4 years ago

shivprasad94 commented 4 years ago

Hello,

nreimers commented 4 years ago

Hi @shivprasad94 When I was using XLNet about a year ago, it did not yield any good results. The performance was usually below that of BERT and RoBERTa. Hence, I didn't upload a model.

I just re-started the XLNet training to see if it has changed with the latest version. But I would not necessarily expect that it outperforms BERT.

The conclusion "better language model / better supervised model" ===> "better sentence embedding model" sadly does not hold. For sentence embeddings used in unsupervised scenarios, every dimension is weighted equally. Some of the better BERT-like models are sadly far worse at producing sentence embeddings.
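To make this concrete: in unsupervised use, sentence embeddings are typically compared with cosine similarity, which gives every dimension the same weight; there is no trained head that can downweight noisy dimensions. A toy illustration in plain NumPy (not from this thread, purely to show the point):

```python
import numpy as np

def cos_sim(a, b):
    """Cosine similarity: every dimension contributes with equal weight."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# A single noisy dimension can dominate the score, since nothing
# downweights it -- unlike a supervised model with a task-specific layer.
clean = cos_sim([1.0, 1.0, 0.0], [1.0, 1.0, 0.0])  # identical vectors
noisy = cos_sim([1.0, 1.0, 0.0], [1.0, 1.0, 5.0])  # one bad dimension
```

A model that encodes useful information in a few dimensions but noise in many others can therefore score well on supervised tasks yet poorly as a sentence encoder.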

Best Nils Reimers

shivprasad94 commented 4 years ago

hi @nreimers

I was also looking at the pre-trained models listed at https://huggingface.co/transformers/pretrained_models.html

(screenshot: table of pre-trained XLNet models from the Hugging Face documentation)

I already see two pre-trained XLNet models. Do you think it would be feasible to use them for semantic search?

embedder = SentenceTransformer('xlnet-base-cased')
embedder = SentenceTransformer('xlnet-large-cased')

What's your take on the initial research on XLNet for semantic search?

nreimers commented 4 years ago

Models from huggingface have to be loaded with the models.Transformer() method, as shown here: https://github.com/UKPLab/sentence-transformers/blob/master/examples/training_transformers/training_nli.py

As mentioned, in my experiments XLNet does not produce good sentence embeddings. In general, I have the impression that XLNet is not that robust, i.e., training is quite difficult and often breaks for unknown reasons. BERT and RoBERTa are, in my opinion, much better models: they are easier to train and do not break as often.

For supervised tasks, ALBERT and ELECTRA also work quite well (according to my colleagues' experience). For sentence embedding tasks, ALBERT and ELECTRA sadly do not yield an improvement.