facebookresearch / InferSent

InferSent sentence embeddings

plan to support fasttext? #83

Closed zarzen closed 6 years ago

zarzen commented 6 years ago

Hi, awesome project! Thanks for the high-quality sentence encoding model! I have read the discussions in #46 and #10, but since both are closed, I am opening this one. Do you plan to support fastText later? If not, what is the current difficulty?

aconneau commented 6 years ago

Hi, thanks for the kind words! infersent2.pkl is now trained with the latest fastText common-crawl word embeddings (infersent1.pkl is trained with GloVe). Please pull the latest version of InferSent if you haven't done so recently. Note, however, that these fastText common-crawl embeddings were not trained with character n-grams, so out-of-vocabulary (OOV) words cannot be handled, although the vocabulary does cover 2M words. See https://github.com/facebookresearch/fastText/issues/428#issuecomment-365046063 for more details. Best, Alexis
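
To see which words of a sentence would fall outside a fixed embedding vocabulary, a small check along these lines may help. It is only a sketch: it assumes the embeddings are in the fastText text format (a header line with word count and dimension, then one word and its vector per line), and the helper names `read_vec_vocab` and `split_oov` are made up for illustration and are not part of InferSent or fastText.

```python
import io


def read_vec_vocab(handle, max_words=None):
    """Collect the words listed in a fastText-style .vec text stream.

    Hypothetical helper: assumes the first line is "<n_words> <dim>" and
    every following line is "<word> <v1> <v2> ...".
    """
    header = handle.readline().split()
    dim = int(header[1])
    vocab = set()
    for i, line in enumerate(handle):
        if max_words is not None and i >= max_words:
            break
        # Only the first field (the word) is needed for a vocabulary check.
        vocab.add(line.rstrip("\n").split(" ", 1)[0])
    return vocab, dim


def split_oov(tokens, vocab):
    """Split tokens into those with a vector and those without (OOV)."""
    known = [t for t in tokens if t in vocab]
    unknown = [t for t in tokens if t not in vocab]
    return known, unknown


# Toy stand-in for a real .vec file; a real one would be opened with open(path).
sample = io.StringIO("3 2\nthe 0.1 0.2\ncat 0.3 0.4\nsat 0.5 0.6\n")
vocab, dim = read_vec_vocab(sample)
known, unknown = split_oov("the cat sat on the mat".split(), vocab)
print("known:", known)
print("OOV:", unknown)
```

Because there are no character n-grams to fall back on, the words reported as OOV here simply have no vector in the file; with the 2M-word vocabulary this should be rare for ordinary English text.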