Open VuceWillis opened 8 years ago
Set min_count = 1 and try again.
model = Word2Vec_Supervised(sentences, hs=0, workers=num_workers, size=num_features, min_count=min_word_count, window=context, sample=downsampling, label_dict=label_dict)
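The thread does not spell out why lowering min_count helps. One plausible reading, stated here as an assumption rather than confirmed behavior of this fork, is that vocabulary pruning drops entries the supervised training step later looks up. The sketch below uses only the standard library to list which label_dict keys occur fewer than min_count times in a corpus. The helper name labels_at_risk and the toy corpus are hypothetical and not part of the library.

```python
from collections import Counter


def labels_at_risk(sentences, label_dict, min_count):
    """Return {word: frequency} for label_dict keys seen fewer than min_count times.

    Hypothetical helper for illustration; it does not call the library.
    """
    counts = Counter(token for sentence in sentences for token in sentence)
    return {word: counts[word] for word in label_dict if counts[word] < min_count}


# Toy corpus (made-up data, for illustration only)
sentences = [["good", "movie"], ["bad", "plot"], ["good", "cast"]]
label_dict = {"good": ["pos_word"], "bad": ["neg_word"], "terrible": ["neg_word"]}

# Words listed here would fall below the chosen threshold.
print(labels_at_risk(sentences, label_dict, min_count=2))
```

Running a check like this before training shows whether a threshold such as 50 is larger than the frequency of any labeled word in the corpus.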
On Fri, Jun 10, 2016 at 2:07 PM, VuceWillis notifications@github.com wrote:
So, I was trying out my own implementation, using the following label_dict:
label_dict = {"good": ["pos_word"], "awesome": ["pos_word"], "great": ["pos_word"], "bad": ["neg_word"], "horrible": ["neg_word"], "terrible": ["neg_word"]}
Running:
# Set values for various parameters
num_features = 300 # Word vector dimensionality
min_word_count = 50 # Minimum word count
num_workers = 4 # Number of threads to run in parallel
context = 8 # Context window size
downsampling = 1e-3 # Downsample setting for frequent words
# Initialize and train the model
model = Word2Vec_Supervised(sentences, hs=0, workers=num_workers, size=num_features, min_count=min_word_count, window=context, sample=downsampling, label_dict=label_dict)
Gave me the following KeyError:
Exception in thread Thread-4:
Traceback (most recent call last):
  File "/home/vuk/anaconda2/lib/python2.7/threading.py", line 801, in __bootstrap_inner
    self.run()
  File "/home/vuk/anaconda2/lib/python2.7/threading.py", line 754, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/home/vuk/anaconda2/lib/python2.7/site-packages/gensim-0.12.4-py2.7-linux-x86_64.egg/gensim/models/word2vec_supervised.py", line 529, in worker_train
    job_words = self._get_job_words(alpha, work, job, neu1)
  File "/home/vuk/anaconda2/lib/python2.7/site-packages/gensim-0.12.4-py2.7-linux-x86_64.egg/gensim/models/word2vec_supervised.py", line 488, in _get_job_words
    a1 = sum(train_sentence_sg_categ_nogil(self, sentence, alpha, work) for sentence in job)
  File "/home/vuk/anaconda2/lib/python2.7/site-packages/gensim-0.12.4-py2.7-linux-x86_64.egg/gensim/models/word2vec_supervised.py", line 488, in <genexpr>
    a1 = sum(train_sentence_sg_categ_nogil(self, sentence, alpha, work) for sentence in job)
  File "word2vec_inner_supervised.pyx", line 903, in word2vec_inner_supervised.train_sentence_sg_categ_nogil (./gensim/models/word2vec_inner_supervised.c:8797)
KeyError: 'pos_word'
Any idea why this would happen?
That indeed solves the KeyError. However, it now prints "Inside the Cython categ Function" so often that my notebook nearly crashes (I can hardly interrupt the kernel). Maybe that print statement could be left out?
Sorry for the late reply. I will comment out that print statement.
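Until the print statement is removed from the source, one possible stopgap is to silence standard output around the training call. The sketch below assumes the message goes through Python's sys.stdout. If it is written by a C-level printf instead, this would have no effect. The name suppress_stdout is a hypothetical helper, not part of the library.

```python
import os
import sys
from contextlib import contextmanager


@contextmanager
def suppress_stdout():
    """Temporarily point sys.stdout at the null device, then restore it."""
    saved = sys.stdout
    devnull = open(os.devnull, "w")
    sys.stdout = devnull
    try:
        yield
    finally:
        # Always restore the original stream, even if training raises.
        sys.stdout = saved
        devnull.close()


with suppress_stdout():
    # Stand-in for the training call; anything printed here is discarded.
    print("Inside the Cython categ Function")

print("training finished")
```

Because sys.stdout is process-wide, the swap also affects output from worker threads started inside the with block.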