teaglin / word2vec

Automatically exported from code.google.com/p/word2vec
Apache License 2.0

Patch for word2vec.c: Fix memory leaks #2

Status: Open. Opened by GoogleCodeExporter 9 years ago

GoogleCodeExporter commented 9 years ago
word2vec does not free its allocated objects correctly. It also reads objects after they have been freed.
The attached patch fixes both issues.

Original issue reported on code.google.com by tetsuo.s...@gmail.com on 17 Aug 2013 at 6:23

GoogleCodeExporter commented 9 years ago
FYI, I had to roll back this patch because, when using demo-word.sh, it was
producing a vectors.bin file that was unreadable by the distance utility.

Possibly related (but I doubt it): this was after also changing include headers 
for Mac OS X compilation (see 
https://github.com/dav/word2vec/commit/94bc3d7860670053d4785a5aa07737235ae4e272)

btw, I think you meant DestroyVocab, not DestoryVocab.

Original comment by dav.yagi...@gmail.com on 17 Aug 2013 at 8:01

GoogleCodeExporter commented 9 years ago
Thanks, dav.yagi...@gmail.com, for the comments and for catching the typo. The fixed patch is attached.

There was a bug in SortVocab in which the special word "</s>" was freed.
On Ubuntu 12.04.2, demo-word.sh works fine. I think it works fine on OS X as 
well.

I checked your commits on the GitHub repository, and I don't think those changes
are related to this issue. They address compilation failures on OS X caused by
the missing header "malloc.h".

Original comment by tetsuo.s...@gmail.com on 17 Aug 2013 at 10:37
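
For readers without the attachment, here is a minimal sketch of the kind of fix described above: pruning rare words from a vocabulary while never freeing the special entry at index 0 ("</s>"). This is not the attached patch. The struct and the names prune_vocab, destroy_vocab and dup_word are hypothetical, simplified stand-ins, and the real SortVocab in word2vec.c also sorts entries and rebuilds a hash table, which is omitted here.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a word2vec vocabulary entry. */
struct vocab_entry {
  char *word;    /* heap-allocated, owned by the vocabulary */
  long long cn;  /* occurrence count */
};

/* Duplicate a string on the heap (portable stand-in for strdup). */
char *dup_word(const char *s) {
  size_t n = strlen(s) + 1;
  char *p = malloc(n);
  if (p != NULL) memcpy(p, s, n);
  return p;
}

/* Drop entries rarer than min_count, but always keep index 0 ("</s>").
 * A dropped word is freed exactly once and its pointer is cleared; a moved
 * entry leaves NULL behind. Every slot at or past the returned size is
 * therefore NULL, so later cleanup cannot double-free or read freed memory.
 * Returns the new number of entries. */
size_t prune_vocab(struct vocab_entry *v, size_t size, long long min_count) {
  size_t kept = 0;
  assert(v != NULL);
  for (size_t a = 0; a < size; a++) {
    if (a == 0 || v[a].cn >= min_count) {
      if (kept != a) {
        v[kept] = v[a];    /* compact the survivor towards the front */
        v[a].word = NULL;  /* ownership has moved to v[kept] */
      }
      kept++;
    } else {
      free(v[a].word);
      v[a].word = NULL;
    }
  }
  return kept;
}

/* Free every word and then the array itself. free(NULL) is a no-op, so
 * passing the original capacity is safe even after pruning. */
void destroy_vocab(struct vocab_entry *v, size_t capacity) {
  if (v == NULL) return;
  for (size_t a = 0; a < capacity; a++) free(v[a].word);
  free(v);
}
```

Clearing each pointer after it is freed or moved is the design choice that matters here: it makes the cleanup pass safe regardless of how many entries were pruned, at the cost of one extra store per entry.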

GoogleCodeExporter commented 9 years ago
New patch seems to work fine, thanks.

Original comment by dav.yagi...@gmail.com on 18 Aug 2013 at 6:45

GoogleCodeExporter commented 9 years ago
Where does it read freed memory?
The added error messages assume a specific cause, e.g. "no such file or
directory", when the call may have failed for another reason.
Is it worth the bother of freeing all this memory when the OS will reclaim it
much more efficiently once the program exits soon afterwards?

Original comment by ralph.co...@gmail.com on 23 Aug 2013 at 9:30
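
On the point about error messages that assume the cause: one common alternative is to report whatever reason the C library recorded in errno. The sketch below is only an illustration, not code from the patch. The helper name open_for_reading is hypothetical, and it relies on fopen setting errno on failure, which POSIX specifies but ISO C alone does not guarantee.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: open a file for reading and, on failure, print the
 * reason recorded in errno instead of a hard-coded guess such as
 * "no such file or directory". Returns NULL on failure.
 * Assumption: fopen sets errno on failure (true on POSIX systems). */
FILE *open_for_reading(const char *path) {
  FILE *f;
  assert(path != NULL);
  errno = 0;
  f = fopen(path, "rb");
  if (f == NULL) {
    int saved = errno;  /* save it before another call can overwrite it */
    fprintf(stderr, "ERROR: cannot open '%s': %s\n", path, strerror(saved));
    errno = saved;      /* leave the original reason visible to the caller */
  }
  return f;
}
```

With this, a missing file and a permission problem produce different messages, and the caller can still inspect errno after a NULL return.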