hmc-whisk / jsLDA

A React based version of jsLDA with brand new features added on

Strings of length < 3 are automatically coded as Stopwords #181

Open theobayard opened 3 years ago

theobayard commented 3 years ago

With the implementation of custom regex, does this make sense?

Specifically, in LDAModel the following code appears: if (word.length <= 2) { this.stopwords[word] = 1; }

xandaschofield commented 3 years ago

Yikes -- no, we should delete that. I think our default regular expression should instead be written to match only tokens longer than 2 characters.

xandaschofield commented 3 years ago

src/core/LDAModel.ts, lines 406-408

The proper fix is probably to remove that check and ensure the default regular expression only matches tokens of 3+ characters.
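
A minimal sketch of what that could look like, assuming tokens are runs of Unicode letters. `DEFAULT_TOKEN_REGEX` and `tokenize` are hypothetical names for illustration, not the identifiers used in LDAModel, and the pattern is not necessarily the one jsLDA ships with. The idea is that the length floor lives in the default pattern, so a user-supplied regex is free to keep shorter tokens:

```typescript
// Hypothetical default token pattern: runs of 3 or more Unicode letters.
// Assumption for illustration only; not the pattern jsLDA currently uses.
// Built with the RegExp constructor so it does not depend on the compile target.
const DEFAULT_TOKEN_REGEX = new RegExp("\\p{L}{3,}", "gu");

// Hypothetical helper: lowercases the text and returns every match of the pattern.
// No separate length check is applied, so the regex alone decides what is a token.
function tokenize(text: string, pattern: RegExp = DEFAULT_TOKEN_REGEX): string[] {
  return text.toLowerCase().match(pattern) || [];
}

// With the default pattern, short words never become tokens.
console.log(tokenize("It is an LDA model of 42 topics"));
// → ["lda", "model", "topics"]

// A custom pattern can keep short words, since nothing else marks them as stopwords.
console.log(tokenize("It is", new RegExp("\\p{L}+", "gu")));
// → ["it", "is"]
```

With this shape, the `word.length <= 2` check is no longer needed: short strings are filtered only when the active regex excludes them.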