aerinkim opened this issue 6 years ago
I am not sure it is a memory error; I am getting the exact same error on a cluster with 3 TB of RAM at my disposal. That said, I agree that any advice on how to solve this would be highly appreciated!
I hope my observations help. In my case, a small dataset (1.6 MB) triggers this problem, but a larger one (600 MB) doesn't. So I don't think memory is the problem; it seems related to the content of the data, which would explain why cutting your data works. @byorxyz
Just download the latest version from GitHub and the problem will be solved. Do not download GloVe from the Stanford GloVe homepage.
I have the same problem with my 150 GB corpus when running glove.c (downloaded from GitHub). I have 60 GB of memory, and the program never allocates more than 3 GB.
> Just download the latest version from GitHub and the problem will be solved. Do not download GloVe from the Stanford GloVe homepage.

The problem still persists for me.
I spent a while debugging a segfault that I thought was this problem, but I had actually passed a directory as the -vocab-file argument (it was late...). So for future readers: double-check your arguments to make sure it's not a simple mistake. I'll make a PR to catch the error case I hit.
FWIW, I spent some time in glove.c trying to find where something might go wrong when the cooccurrence file is too large. My only guess was an integer overflow somewhere, though I doubt it.
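If you want to sanity-check the overflow theory yourself, here is a minimal standalone sketch (not part of glove.c; the vocabulary and dimension values are just illustrative, taken from the original report) that checks whether the largest index into the parameter array still fits in a 32-bit int:

#include <stdio.h>

int main(void) {
    /* Illustrative values: 1.7M words as in the original report,
       and a common embedding dimension of 300. */
    long long vocab_size = 1700000;
    long long vector_size = 300;
    /* glove.c keeps word vectors, context vectors, and a bias per word,
       so W holds 2 * vocab_size * (vector_size + 1) doubles. */
    long long entries = 2 * vocab_size * (vector_size + 1);
    printf("W entries: %lld (~%.1f GB as doubles)\n",
           entries, entries * 8.0 / (1LL << 30));
    printf("largest index fits in a 32-bit int? %s\n",
           entries - 1 <= 2147483647LL ? "yes" : "no");
    return 0;
}

For these sizes the largest index still fits in a 32-bit int, which is consistent with my doubt above.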
If you do hit a segfault, debugging should be quite easy. Set -threads 1 and start adding print statements to glove.c. I would suggest printing the indices into the W array on line 127, like this:
for (b = 0; b < vector_size; b++) {
    fprintf(stderr, "Accessing W. b+l1=%lld, b+l2=%lld\n", b + l1, b + l2);
    diff += W[b + l1] * W[b + l2]; // dot product of word and context word vector
}
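If those printed indices ever reach past the end of W, a guard like the following would turn the silent crash into a readable error. This is a sketch, not code that ships with glove.c; W_size here mirrors the size of the allocation made in main:

/* Sketch of a bounds check; in glove.c, W is allocated with
   2 * vocab_size * (vector_size + 1) entries. */
long long W_size = 2 * vocab_size * (vector_size + 1);
for (b = 0; b < vector_size; b++) {
    if (b + l1 < 0 || b + l1 >= W_size || b + l2 < 0 || b + l2 >= W_size) {
        fprintf(stderr, "Out-of-bounds W access: b+l1=%lld, b+l2=%lld, W size=%lld\n",
                b + l1, b + l2, W_size);
        exit(1);
    }
    diff += W[b + l1] * W[b + l2]; // dot product of word and context word vector
}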
It's a beautifully short program, so I'm sure that once the failing line is identified, it can be fixed easily.
I also had this problem, and I found an approach that solved it.
Analyzing my problem:
Solution:
Result: successfully built my own GloVe vectors.
Good luck!
I'm trying to train GloVe on a pretty big dataset, the newest Wikipedia dump (a 22 GB text file). The total vocabulary I'm training on is 1.7 million words. Every step before glove (vocab_count, cooccur, shuffle) runs smoothly without any memory error. (My RAM = 64 GB.)
However, when I run glove, I get "Segmentation fault (core dumped)".
I tried different numbers of threads as well: 1, 2, 4, 8, 16, 32, etc. Nothing runs. Can someone please point me to where to look? Thanks for this repository!
Update
I cut the vocabulary from 1.7 million to 1 million words and glove.c now runs without the segmentation fault, so it does look like a memory error. But I would love to learn how to resolve this and train a model on the larger dataset! Any comment will be highly valued. Thanks.
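For anyone wanting to sanity-check the memory side, here is a small sketch of my own (an estimate, not taken from the repository; it assumes glove.c's two main allocations are the parameter array W and its squared-gradient array gradsq, each holding 2 * vocab_size * (vector_size + 1) doubles) that estimates the RAM glove needs for a given vocabulary:

#include <stdio.h>

/* Rough estimate of glove.c's main allocations: the parameter array W
   and the adagrad accumulator gradsq, each holding
   2 * vocab_size * (vector_size + 1) doubles. */
static double glove_ram_gb(long long vocab_size, long long vector_size) {
    long long entries = 2 * vocab_size * (vector_size + 1);
    return 2.0 * entries * sizeof(double) / (1LL << 30);
}

int main(void) {
    printf("1.7M vocab, dim 300: ~%.1f GB\n", glove_ram_gb(1700000, 300));
    printf("1.0M vocab, dim 300: ~%.1f GB\n", glove_ram_gb(1000000, 300));
    return 0;
}

Assuming a dimension around 300, both configurations come out well under 64 GB (~15 GB and ~9 GB), which is consistent with the other commenters' suspicion that the crash may not be a plain out-of-memory failure.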