stanfordnlp / GloVe

Software in C and data files for the popular GloVe model for distributed word representations, a.k.a. word vectors or embeddings
Apache License 2.0

Overflow in "overflow_threshold" (src/cooccur.c) when using 256G of memory #215

Closed (florath closed this issue 1 year ago)

florath commented 1 year ago

There is an integer overflow in src/cooccur.c: 'overflow_threshold' (line 294) is an 'int', and with 256G of memory its value no longer fits. The value wraps around to a negative number, which leads to strange behavior (cooccur starts creating thousands of files).

./build/cooccur -memory 128.0 -vocab-file [...]
[....]
overflow_threshold [1216907370]

but with twice the memory the value turns negative:

./build/cooccur -memory 256.0 -vocab-file [...]
[....]
overflow_threshold [-1861152525]
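
The two logged values are consistent with a 32-bit wraparound. Below is a minimal sketch in C, assuming 'int' is 32 bits wide on the affected platform. The helper names (fits_in_int, wrap_to_int32, check_logged_values) are hypothetical and not taken from src/cooccur.c, and the unwrapped value 2433814771 is inferred by adding 2^32 to the logged negative number; it does not appear in the log itself.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper (not from src/cooccur.c):
   returns nonzero if v can be stored in an int without overflow. */
int fits_in_int(long long v) {
    return v >= INT_MIN && v <= INT_MAX;
}

/* Hypothetical helper: the value a 32-bit two's-complement int would hold
   after v wraps around. Computed entirely in 64-bit arithmetic so the
   sketch does not rely on implementation-defined narrowing conversions. */
long long wrap_to_int32(long long v) {
    const long long two32 = 4294967296LL;   /* 2^32 */
    long long r = v % two32;                /* r is in (-2^32, 2^32) */
    if (r > 2147483647LL) {
        r -= two32;
    } else if (r < -2147483648LL) {
        r += two32;
    }
    return r;
}

/* Checks the two values from the report against the wraparound model. */
void check_logged_values(void) {
    /* -memory 128.0: the logged threshold still fits in an int. */
    assert(fits_in_int(1216907370LL));
    assert(wrap_to_int32(1216907370LL) == 1216907370LL);

    /* -memory 256.0: the inferred true value (logged value + 2^32)
       exceeds INT_MAX and wraps to exactly the logged negative number. */
    assert(!fits_in_int(2433814771LL));
    assert(wrap_to_int32(2433814771LL) == -1861152525LL);
}
```

If this reading is right, storing the threshold in a 64-bit type such as 'long long' would keep it positive at 256G; whether other variables in the same computation need the same change is not something this sketch checks.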