jmexe / word2vec_training

word2vec training model

Error Can't import word2vec #1

Open Rahulvks opened 8 years ago

Rahulvks commented 8 years ago

I can't import word2vec in Python. How do I correct this error?

RuntimeError: you must first build vocabulary before training the model

Traceback (most recent call last):
  File "<pyshell#0>", line 1, in <module>
    import word2vec
  File "word2vec.py", line 14, in <module>
    model = word2vec.Word2Vec(sentences, size=100, window=4, min_count=1, workers=4)
  File "/usr/local/lib/python2.7/dist-packages/gensim-0.12.3-py2.7-linux-x86_64.egg/gensim/models/word2vec.py", line 432, in __init__
    self.train(sentences)
  File "/usr/local/lib/python2.7/dist-packages/gensim-0.12.3-py2.7-linux-x86_64.egg/gensim/models/word2vec.py", line 690, in train
    raise RuntimeError("you must first build vocabulary before training the model")
RuntimeError: you must first build vocabulary before training the model

jmexe commented 8 years ago

It seems that if you load a pre-trained model in the C format, you cannot continue training it.

Rahulvks commented 8 years ago

Can you tell me how to do that? I am a beginner in NLP.

jmexe commented 8 years ago

You can train the model either on your own text or on a public dataset such as this one: http://mattmahoney.net/dc/enwik9.zip. Remember to pre-process the data before training: http://mattmahoney.net/dc/textdata.html. I have added a sample to my code; I hope it helps.
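
For a rough idea of what the pre-processing step can look like, here is a minimal sketch using only the Python standard library. It assumes plain-text input and is far simpler than a real Wikipedia-dump filter. The names `tokenize_sentences` and `build_vocab` are hypothetical helpers made up for illustration; they are not part of gensim or of the sample in this repo.

```python
import re
from collections import Counter


def tokenize_sentences(raw_text):
    """Split raw text into sentences, each a list of lowercase word tokens."""
    sentences = []
    # Treat ., !, ? and newlines as sentence boundaries (a crude assumption).
    for chunk in re.split(r"[.!?\n]+", raw_text):
        # Keep only runs of ASCII letters; digits and punctuation are dropped.
        tokens = re.findall(r"[a-z]+", chunk.lower())
        if tokens:
            sentences.append(tokens)
    return sentences


def build_vocab(sentences, min_count=1):
    """Count tokens and keep those seen at least min_count times."""
    counts = Counter(token for sentence in sentences for token in sentence)
    return {word: n for word, n in counts.items() if n >= min_count}


sample_text = "The quick brown fox jumps. The lazy dog sleeps!\nThe fox runs."
sentences = tokenize_sentences(sample_text)
vocab = build_vocab(sentences, min_count=1)

# Fail early if nothing survived, instead of handing empty input to a trainer.
if not vocab:
    raise ValueError("no words survived pre-processing; check the input and min_count")

print(sentences)
print(vocab)
```

A list shaped like `sentences` is what gets passed to `word2vec.Word2Vec(sentences, ...)` in the traceback above. If that list is empty, or every word falls below `min_count`, the vocabulary ends up empty, which is one plausible way to hit the "you must first build vocabulary" error.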

Rahulvks commented 8 years ago

Thank you so much, jmexe. Is the sample code on GitHub?