facebookresearch / InferSent

InferSent sentence embeddings

What is the significance of the infersent1.pkl file generated in the SentEval implementation here, if the word embeddings from GloVe are fed into the model? #77

Closed (shivamakhauri04 closed this issue 6 years ago)

shivamakhauri04 commented 6 years ago

I read the paper but could not work out the role of infersent1.pkl, which is imported in demo.ipynb. Can you help me understand it?

Also, if I want to feed in my own word embeddings (created on a small dataset) instead of GloVe, what should I take care of?

Thanks and regards

aconneau commented 6 years ago

Hi,

infersent1.pkl is the BiLSTM-max model trained on AllNLI (the one that gives the best results in the paper). The file holds the trained encoder; the GloVe vectors are only its input. Because it was trained with GloVe vectors, you have to use those same word embeddings with it. infersent2.pkl was trained with the latest fastText-word2vec common-crawl embeddings. If you really want to use your own word embeddings, you can retrain the model using train_nli.py.

Thanks, Alexis
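
For anyone landing here with the same question, below is a minimal sketch of the two checks the answer implies: which embedding family goes with which released checkpoint, and whether a custom embedding file is internally consistent before retraining with train_nli.py. It is not part of the InferSent code base. The names `CHECKPOINT_EMBEDDINGS`, `expected_embeddings`, and `embedding_dim` are hypothetical, and the file layout (one word per line followed by space-separated floats, with an optional fastText-style `count dim` header) is an assumption about how the custom vectors were saved.

```python
# Pairing taken from the answer above; this dictionary is illustrative,
# not something defined in the InferSent repository.
CHECKPOINT_EMBEDDINGS = {
    "infersent1.pkl": "GloVe",
    "infersent2.pkl": "fastText",
}


def expected_embeddings(checkpoint_name):
    """Return the embedding family a released checkpoint was trained with."""
    if checkpoint_name not in CHECKPOINT_EMBEDDINGS:
        raise ValueError(f"unknown checkpoint: {checkpoint_name!r}")
    return CHECKPOINT_EMBEDDINGS[checkpoint_name]


def embedding_dim(path, max_lines=1000):
    """Return the vector dimension of a text embedding file.

    Assumes one 'word v1 v2 ... vd' entry per line, space-separated.
    Raises ValueError if the inspected lines disagree on the dimension.
    """
    dims = set()
    with open(path, encoding="utf-8") as f:
        for i, line in enumerate(f):
            if i >= max_lines:
                break
            parts = line.rstrip().split(" ")
            # Assumption: a two-field first line is a fastText-style
            # "count dim" header rather than a 1-dimensional vector.
            if i == 0 and len(parts) == 2:
                continue
            if len(parts) < 2:
                continue  # skip blank or malformed lines
            dims.add(len(parts) - 1)
    if len(dims) != 1:
        raise ValueError(f"inconsistent or missing dimensions: {sorted(dims)}")
    return dims.pop()
```

The dimension matters because the encoder's input size is fixed when the model is built, so a model retrained on custom vectors has to be configured with the value `embedding_dim` reports. A very small training corpus will also leave many NLI words without a vector, which is worth checking before retraining.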