Kyubyong / wordvectors

Pre-trained word vectors of 30+ languages
MIT License

Details on word2vec model #10

Open PhilKuhnke opened 6 years ago

PhilKuhnke commented 6 years ago

Dear Kyubyong, great work - thank you very much for providing these word vectors! One question: which model did you use to train your word vectors with word2vec - skip-gram or CBOW? Is this the standard model as reported in Mikolov et al. (2013) or a modified variant? And which parameters did you use to train the model for each language - always the default parameters in make_wordvectors.sh?

Pzoom522 commented 5 years ago

Given make_wordvectors.sh and make_wordvectors.py, it seems that @Kyubyong used the gensim implementation of word2vec. Thus, by default, I believe he chose the CBOW model (see the gensim docs).