tricao / word2vec

Automatically exported from code.google.com/p/word2vec
Apache License 2.0

Patch for /trunk/word2vec.c #16

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
Patch for a bug that discarded the last word of the vocabulary after sorting
when the input file contained no newline character.

If the input file contains no newline, vocab[0].cn == 0. That entry is skipped
by the sort, but not by the for loop that follows, which decrements vocab_size
and frees the memory of the last word. The loop nevertheless still computes the
hash for that last word if its count is greater than min_count. In addition,
the realloc only needs to allocate vocab_size * sizeof(struct vocab_word).
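
The attached patch is not reproduced here. The following is only a minimal
sketch of one way the pruning step could avoid the problem, written as a
standalone model rather than as the actual change to /trunk/word2vec.c. The
names prune_vocab and dup_word are hypothetical, the struct is reduced to two
fields, and hash recomputation is left out. It assumes slot 0 is the reserved
entry and slots 1..n-1 are already sorted by descending count, so every
discarded word sits at the tail.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Reduced stand-in for the vocabulary entry (assumption: the real struct
   carries more fields than these two). */
struct vocab_word {
  long long cn;  /* occurrence count */
  char *word;    /* heap-allocated, NUL-terminated */
};

/* Hypothetical helper: heap copy of a string, so entries can be freed. */
static char *dup_word(const char *s) {
  char *copy = malloc(strlen(s) + 1);
  if (copy != NULL) strcpy(copy, s);
  return copy;
}

/* Hypothetical model of the pruning loop. Slot 0 is always kept, even when
   its count is 0 (the no-newline case). Each discarded entry frees its own
   word, not the word at the shrinking end of the array. Returns the new
   vocabulary size. */
static int prune_vocab(struct vocab_word **vocab_ptr, int vocab_size,
                       long long min_count) {
  struct vocab_word *vocab = *vocab_ptr;
  int kept = vocab_size;
  assert(vocab_size >= 1);
  for (int a = 0; a < vocab_size; a++) {
    if (vocab[a].cn < min_count && a != 0) {
      free(vocab[a].word);
      vocab[a].word = NULL;
      kept--;
    }
  }
  /* Survivors occupy slots 0..kept-1, so kept entries are enough.
     kept is at least 1 because slot 0 is never discarded. */
  struct vocab_word *shrunk =
      realloc(vocab, (size_t)kept * sizeof(struct vocab_word));
  if (shrunk != NULL) *vocab_ptr = shrunk;
  return kept;
}
```

With a three-entry vocabulary whose slot 0 has a count of 0 and min_count set
to 5, this model keeps all three entries as long as the other two counts are
5 or more, so the last word is no longer dropped.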

Original issue reported on code.google.com by FerroMrkva on 5 Feb 2014 at 11:24

Attachments: