Open weka511 opened 1 year ago
I've found some junk in the vocabulary file (among the real words): < 682787 post 745958
730051 26 4060 .... 163880
I have confirmed that there really are 1,209,358 words in the vocabulary by reading it back
Saved vocabulary of 1209335 words to ./data\blogs.npz Elapsed Time 200 m 16.08 s
It claims to have stored 1,209,358 words from blogs.zip. It takes 206 minutes and occupies 38 MB of disk space.
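For what it's worth, here is a minimal sketch of how the junk entries could be separated from the real words once the vocabulary has been read back into a list of strings. The function name `split_vocabulary` and the pattern `WORD_RE` are hypothetical and not part of the project. The rule that a real word is made of letters with optional internal apostrophes or hyphens is only an assumption. Under that rule "post" counts as a word, even though here it may just be a leftover from markup.

```python
import re

# Assumption: a "real word" is a run of letters, optionally joined by
# internal apostrophes or hyphens (e.g. "don't", "well-known").
WORD_RE = re.compile(r"[A-Za-z]+(?:['-][A-Za-z]+)*")


def split_vocabulary(tokens):
    """Split tokens into (words, junk) using the assumed WORD_RE rule."""
    words, junk = [], []
    for token in tokens:
        if WORD_RE.fullmatch(token):
            words.append(token)
        else:
            junk.append(token)
    return words, junk


# Tokens copied from the junk reported above.
sample = ["<", "682787", "post", "745958", "730051", "26", "4060", "163880"]
words, junk = split_vocabulary(sample)
print(len(words), "words:", words)
print(len(junk), "junk entries:", junk)
```

Running the same check over the full vocabulary would show how many of the stored entries are numbers or stray punctuation rather than words.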