stanfordnlp / GloVe

Software in C and data files for the popular GloVe model for distributed word representations, a.k.a. word vectors or embeddings
Apache License 2.0

Length check in write_chunk(). #30

Closed DavidNemeskey closed 8 years ago

DavidNemeskey commented 8 years ago

cooccur writes all-zero record(s) when one of the chunks it tries to write is empty, which can happen when the vocabulary is small (17 words in my case). This PR adds a length check to write_chunk() so that empty chunks are skipped.

Note that I haven't checked if a small vocabulary introduces other errors.
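To illustrate the kind of guard described above, here is a minimal sketch, not the actual patch. It assumes a chunk writer that merges adjacent duplicate records and always flushes one final record, so a zero-length chunk would otherwise emit whatever sits in the (possibly zeroed) buffer. The `Record` type, its field names, and `write_chunk_sketch` are hypothetical stand-ins and are not taken from the GloVe source.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for a co-occurrence record; field names are assumptions. */
typedef struct {
    int word1;
    int word2;
    double val;
} Record;

/*
 * Write a sorted chunk to fout, summing the values of adjacent records that
 * share the same (word1, word2) pair. Returns the number of records written.
 */
static long long write_chunk_sketch(const Record *cr, long long length, FILE *fout) {
    /* The length check: an empty chunk must not produce any output. */
    if (length <= 0) return 0;
    assert(cr != NULL);

    long long written = 0;
    Record old = cr[0];
    for (long long a = 1; a < length; a++) {
        if (cr[a].word1 == old.word1 && cr[a].word2 == old.word2) {
            old.val += cr[a].val;   /* duplicate pair: accumulate */
            continue;
        }
        written += (long long)fwrite(&old, sizeof(Record), 1, fout);
        old = cr[a];
    }
    /* Flush the last pending record; safe because length >= 1 here. */
    written += (long long)fwrite(&old, sizeof(Record), 1, fout);
    return written;
}
```

Without the early return, the final fwrite would run even for an empty chunk, which is one plausible way an all-zero record could end up in the output.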

ghost commented 8 years ago

Okay, looks great. Tested it and seems to work well. Thank you for your contribution!