pujols / zero-shot-learning


Word2vec embeddings #2

Open pbordes opened 5 years ago

pbordes commented 5 years ago

Hello,

Could you share the whole Word2vec model (dimension 500) that you trained, i.e., embeddings for the entire Wikipedia vocabulary rather than just the 21K synsets? It's for a research project. Thank you very much!

pujols commented 5 years ago

Hi,

I'm sorry, but I'm not sure we still have it. Also, it is quite large (>40 GB if I recall correctly), so it would be hard to share. You can actually build on the word2vec code yourself: in the corpus, turn every multi-word term in the synsets into a single-word term (e.g., giant panda --> giant-panda), and then train word2vec on the modified corpus.
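The preprocessing step suggested above could be sketched roughly as follows. This is only an illustration, not the authors' actual pipeline: the term list here is a made-up placeholder (the real one would come from the 21K ImageNet synset names), and the function name `join_multiword_terms` is invented for this example.

```python
import re

# Hypothetical list of multi-word synset terms; in practice this would be
# extracted from the synset vocabulary used in the paper.
MULTIWORD_TERMS = ["giant panda", "polar bear"]

def join_multiword_terms(text, terms):
    """Replace each multi-word term with a single hyphenated token
    (e.g. 'giant panda' -> 'giant-panda') so that word2vec learns
    one embedding for the whole term."""
    # Substitute longer terms first so overlapping phrases are handled first.
    for term in sorted(terms, key=len, reverse=True):
        pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
        text = pattern.sub(term.replace(" ", "-"), text)
    return text

line = "A giant panda was seen near the polar bear exhibit."
print(join_multiword_terms(line, MULTIWORD_TERMS))
# -> A giant-panda was seen near the polar-bear exhibit.
```

The rewritten corpus could then be fed to any word2vec trainer (for instance gensim's `Word2Vec` with a 500-dimensional vector size) to obtain single vectors for the multi-word synset terms.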