stanfordnlp / GloVe

Software in C and data files for the popular GloVe model for distributed word representations, a.k.a. word vectors or embeddings
Apache License 2.0

Negative Nan cost for every iteration #85

Closed queirozfcom closed 7 years ago

queirozfcom commented 7 years ago

Hi. After processing a large (30 GB) input file, I tried training GloVe on it, but I ran into a problem: every epoch reports -nan as the cost. I've pasted all the relevant output below:

(train-file.sh is just a modified version of demo.sh using my input file instead of text8)

ubuntu@ip-172-30-4-85:~/GloVe$ ./train-file.sh 
$ build/vocab_count -min-count 20 -verbose 2 < Full-File-Tokenized-Single-Line-No-unks.txt > vocab.txt
BUILDING VOCABULARY
Processed 6976519851 tokens.
Counted 36371606 unique words.
Truncating vocabulary at min count 20.
Using vocabulary of size 1817927.

$ build/cooccur -memory 6.0 -vocab-file vocab.txt -verbose 2 -window-size 15 < Full-File-Tokenized-Single-Line-No-unks.txt > cooccurrence.bin
COUNTING COOCCURRENCES
window size: 15
context: symmetric
max product: 20163704
overflow length: 57042534
Reading vocab from file "vocab.txt"...loaded 1817927 words.
Building lookup table...table contains 260465759 elements.
Processed 6976514733 tokens.
Writing cooccurrences to disk...........116 files in total.
Merging cooccurrence files: processed 1209229520 lines.

$ build/shuffle -memory 6.0 -verbose 2 < cooccurrence.bin > cooccurrence.shuf.bin
SHUFFLING COOCCURRENCES
array size: 382520524
Shuffling by chunks: processed 0 lines.
Wrote 1 temporary file(s).
Merging temp files: processed 0 lines.

$ build/glove -save-file vectors -threads 8 -input-file cooccurrence.shuf.bin -x-max 10 -iter 15 -vector-size 50 -binary 2 -vocab-file vocab.txt -verbose 2
TRAINING MODEL
Read 0 lines.
Initializing parameters...done.
vector size: 50
vocab size: 1817927
x_max: 10.000000
alpha: 0.750000
06/10/17 - 12:16.17AM, iter: 001, cost: -nan
06/10/17 - 12:16.17AM, iter: 002, cost: -nan
06/10/17 - 12:16.17AM, iter: 003, cost: -nan
06/10/17 - 12:16.17AM, iter: 004, cost: -nan
06/10/17 - 12:16.17AM, iter: 005, cost: -nan
06/10/17 - 12:16.17AM, iter: 006, cost: -nan
06/10/17 - 12:16.17AM, iter: 007, cost: -nan
06/10/17 - 12:16.17AM, iter: 008, cost: -nan
06/10/17 - 12:16.17AM, iter: 009, cost: -nan
06/10/17 - 12:16.17AM, iter: 010, cost: -nan
06/10/17 - 12:16.17AM, iter: 011, cost: -nan
06/10/17 - 12:16.17AM, iter: 012, cost: -nan
06/10/17 - 12:16.17AM, iter: 013, cost: -nan
06/10/17 - 12:16.17AM, iter: 014, cost: -nan
06/10/17 - 12:16.17AM, iter: 015, cost: -nan
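The suspicious lines in the logs above are `Shuffling by chunks: processed 0 lines.` and `Read 0 lines.` — the shuffled file the trainer consumed appears to contain no records at all. One way to check this directly is to parse the file. This is a sketch, assuming the on-disk layout written by `cooccur.c` (`struct { int word1; int word2; double val; }`, 16 bytes per record on typical 64-bit platforms); the filename simply mirrors the log above:

```python
# Sketch: inspect a GloVe cooccurrence/shuffle output file.
# Assumes cooccur.c's record layout: int word1, int word2, double val.
import os
import struct

CREC = struct.Struct("=iid")  # 4 + 4 + 8 = 16 bytes per record

def inspect_cooccurrences(path, max_records=5):
    """Return (total record count, first few records) for a cooccurrence file."""
    size = os.path.getsize(path)
    if size % CREC.size != 0:
        raise ValueError("file size %d is not a multiple of %d; truncated file?"
                         % (size, CREC.size))
    count = size // CREC.size
    records = []
    with open(path, "rb") as f:
        for _ in range(min(count, max_records)):
            records.append(CREC.unpack(f.read(CREC.size)))
    return count, records
```

A zero record count here (an empty `cooccurrence.shuf.bin`) would be consistent with the trainer's `Read 0 lines.` message.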

Things I've noticed:

Other info:

Things I've tried:

Any thoughts about this?
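A plausible explanation for the -nan (an assumption, not confirmed against the exact source revision): `glove.c` appears to print the average cost per training example, i.e. `total_cost / num_lines`, each iteration. With the shuffled file empty, `num_lines` is 0, and in C the double division `0.0 / 0.0` yields NaN under IEEE 754 rather than raising an error. Python raises `ZeroDivisionError` instead, so this sketch models the C behaviour explicitly:

```python
# Sketch of the suspected failure mode: averaging cost over zero examples.
# In C, 0.0 / 0.0 silently evaluates to NaN (IEEE 754); Python raises,
# so we reproduce the C semantics by hand.
import math

def average_cost(total_cost, num_lines):
    """Average cost per example, with C-style 0/0 -> NaN semantics."""
    if num_lines == 0:
        return float("nan")  # what C's double division would produce
    return total_cost / num_lines

print(average_cost(0.0, 0))  # prints: nan
```

Under this reading, the -nan cost is a symptom, not the root cause: the real question is why zero lines were read, which points back at the shuffle step.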

queirozfcom commented 7 years ago

Closing this, as these errors went away when I ran the program on a machine with more memory and more disk space (many temporary files are written to disk during the preprocessing steps).
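Given that resolution, a pre-flight disk check can catch this before a long run: if the disk fills while `shuffle` is writing its temporary chunk files, the merged output can end up empty without an explicit error. A minimal sketch, where the path and the 2x safety factor are illustrative assumptions rather than values taken from the GloVe tools:

```python
# Pre-flight sketch: verify there is enough free disk for the shuffle
# step's temporary files. The safety factor is a rough assumption.
import os
import shutil

def enough_disk_for_shuffle(cooccur_path, workdir=".", factor=2.0):
    """Require roughly `factor` times the cooccurrence file's size free,
    since shuffling writes temporary chunks before merging them."""
    needed = int(factor * os.path.getsize(cooccur_path))
    free = shutil.disk_usage(workdir).free
    return free >= needed, needed, free
```

Running a check like this against `cooccurrence.bin` before invoking `build/shuffle` would have flagged the low-disk condition up front.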