stanfordnlp / GloVe

Software in C and data files for the popular GloVe model for distributed word representations, a.k.a. word vectors or embeddings
Apache License 2.0
6.86k stars 1.51k forks

Skipping updateCaught NaN in diff for kdiff for thread. #64

Open dchaplinsky opened 7 years ago

dchaplinsky commented 7 years ago

Any ideas or suggestions on how to overcome this? I'm using the latest GloVe from GitHub; here are the params:
time build/glove -save-file gigamega.cased.tokenized.glove.600d -threads 32 -input-file /mnt/cooccurrence.shuf.bin -x-max 40 -iter 50 -vector-size 600 -vocab-file /mnt/vocab.txt -verbose 3

ghost commented 7 years ago

Did you look at this:

https://github.com/stanfordnlp/GloVe/issues/50

kosloot commented 7 years ago

We had a lot of NaN problems too, probably because an array was not fully initialized. I fixed this in glove.c and opened a pull request, #68, which also includes a fix for an out-of-bounds problem in cooccur.c.

Feel free to test it :)
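To illustrate the kind of bug described above, here is a minimal sketch in C. It is not the code from the pull request or from glove.c; the names `init_params` and `count_non_finite` and the `real` typedef are hypothetical and used only for illustration. The idea is that memory returned by `malloc` is indeterminate, so an initialization loop that stops short of the full array length can leave garbage (possibly NaN or Inf) in the tail, which then spreads through gradient updates. A check with `isfinite` right after initialization can help rule this out.

```c
#include <math.h>
#include <stdio.h>
#include <stdlib.h>

typedef double real; /* assumption: parameters are stored as doubles */

/* Allocate n parameters and give EVERY slot a defined starting value.
 * The loop bound must equal the allocated length; if it covered only
 * part of the array, the remaining slots would hold indeterminate data. */
real *init_params(size_t n, unsigned int seed) {
    real *w = malloc(n * sizeof *w);
    if (w == NULL) {
        return NULL;
    }
    srand(seed);
    for (size_t i = 0; i < n; i++) {
        /* small random value in [-0.5, 0.5) */
        w[i] = (rand() / ((real)RAND_MAX + 1.0)) - 0.5;
    }
    return w;
}

/* Count entries that are NaN or infinite. */
size_t count_non_finite(const real *w, size_t n) {
    size_t bad = 0;
    for (size_t i = 0; i < n; i++) {
        if (!isfinite(w[i])) {
            bad++;
        }
    }
    return bad;
}
```

Calling `count_non_finite` on the parameter array before training starts would show whether NaN values are already present at initialization or only appear later, for example from the input data or from diverging updates.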