Could you please share the exact code or commands you are trying to run? Otherwise it is difficult to understand and reproduce the error.
On 17 Jun 2018, at 11:10, Abhishek Ranjan notifications@github.com wrote:
I was training this model on 500 words and the 500 most similar words of each word, i.e. 250,000 words in total. I picked those words and their vectors randomly from a pre-trained word2vec file, and I was getting only one sense for each word. So does the training depend on context too?
I just used the command python train.py model/word_embedd.txt. I wasn't getting any error; the code ran fine, but the results weren't satisfactory. word_embedd.txt contained 250,000 randomly chosen words and their vectors, not generated from any corpus but picked randomly from a pre-trained vector space. When I trained it on word embeddings generated from a corpus, it gave very good results. So I wanted to know whether sentences/context are also needed to get good results.
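For reference, the embeddings file was built roughly like this (a minimal sketch, assuming gensim 4's KeyedVectors API; pretrained.bin and the output path are placeholders, not the exact files used):

```python
import random

from gensim.models import KeyedVectors

# Load a pre-trained word2vec model (path and binary flag are placeholders).
kv = KeyedVectors.load_word2vec_format("pretrained.bin", binary=True)

# Pick 500 random seed words and, for each, its 500 most similar words.
seed_words = random.sample(list(kv.key_to_index), 500)
vocab = set(seed_words)
for w in seed_words:
    vocab.update(neighbor for neighbor, _ in kv.most_similar(w, topn=500))

# Write the subset in word2vec text format: a header line "count dim",
# then one line per word: "word v1 v2 ... vd".
with open("model/word_embedd.txt", "w", encoding="utf-8") as out:
    out.write(f"{len(vocab)} {kv.vector_size}\n")
    for w in vocab:
        out.write(w + " " + " ".join(f"{x:.6f}" for x in kv[w]) + "\n")
```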
I think that the problem is that you did not follow the instructions here:
Sorry, they are a bit hidden; maybe I should make them more prominent. In fact, you need to name your embeddings model corpus.word_vectors, where corpus is the name of a (possibly non-existent) corpus file.
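Concretely, something like the following (a sketch only; model/corpus.txt is a placeholder corpus path, and the exact naming convention, e.g. whether .word_vectors is appended to the full corpus path, is described in the instructions above):

```bash
# Rename the pre-computed embeddings so train.py can find them,
# then pass the corpus path (the corpus file itself need not exist).
mv model/word_embedd.txt model/corpus.txt.word_vectors
python train.py model/corpus.txt
```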
I was training this model on 500 words and the 500 most similar words of each word, i.e. 250,000 words in total. I picked those words and their vectors randomly from a pre-trained word2vec file, and I was getting only one sense for each word. So does the training depend on the context of the word too? Because I was getting satisfactory results when training on a corpus.