Closed: spate141 closed this issue 7 years ago
The main difference is that our vectors are trained such that they can be added over variable-length sentences/documents; this should make a large difference. We hope to have more comparisons soon. (The approach could potentially even be combined with subword vectors, as you mentioned.) In the meantime, feel free to run a benchmark directly, for example SentEval or STS 2017.
Hi, the fastText sentence embeddings perform very similarly to averaged CBOW and Skipgram embeddings, since the word vectors fastText learns are trained on a CBOW-like objective (while exploiting morphology). Our word embeddings, in contrast, are trained specifically so that they compose into proper sentence embeddings, rather than for the sake of the word embeddings themselves. As you can see in the paper, our embeddings significantly outperform the CBOW and Skipgram sentence embeddings.
I was looking at the paper and code. Great work, first of all! I have a question: since the new fastText library can generate sentence embeddings by averaging word vectors, is there any comparison between fastText and sent2vec on any supervised/unsupervised task?
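The averaging scheme discussed in this thread can be sketched in a few lines. This is a minimal illustration, not code from either library: the toy `word_vecs` table and the `sentence_embedding` helper are hypothetical, standing in for vectors produced by a trained fastText or sent2vec model.

```python
import numpy as np

# Hypothetical toy word-vector table; in practice these vectors would
# come from a trained fastText or sent2vec model.
word_vecs = {
    "the": np.array([0.1, 0.3]),
    "cat": np.array([0.4, -0.2]),
    "sat": np.array([-0.1, 0.5]),
}

def sentence_embedding(tokens, vecs):
    """Average the vectors of all in-vocabulary tokens.

    Out-of-vocabulary tokens are simply skipped; an all-OOV sentence
    maps to the zero vector.
    """
    found = [vecs[t] for t in tokens if t in vecs]
    if not found:
        dim = next(iter(vecs.values())).shape
        return np.zeros(dim)
    return np.mean(found, axis=0)

emb = sentence_embedding("the cat sat".split(), word_vecs)
```

The point of sent2vec is that its word vectors are trained so that exactly this kind of composition over a variable-length sentence yields a useful sentence representation, whereas averaging vectors trained on a word-level objective is only a post-hoc heuristic.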