commonsense / conceptnet-numberbatch


Common word subset #47

Closed ncammarata closed 6 years ago

ncammarata commented 6 years ago

Is it possible to download a subset of Numberbatch sorted by common words? In my application, it is computationally infeasible to load all of the words into memory.

However, a 20% subset of the most common words would solve my problems and fit into memory as well.

Please let me know if this is possible! Nick

jlowryduda commented 6 years ago

Hi! This is not currently available, but it may be in the future. If you want to sort Numberbatch yourself, you could use the wordfreq library to rank the terms by frequency.
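A minimal sketch of that suggestion, not an official tool. It assumes the English-only text release of Numberbatch, where the first line is a `count dimensions` header and each following line is a term followed by its space-separated vector. The helper names (`wordfreq_score`, `most_common_terms`, `filter_rows`) and the file path in the comments are hypothetical. The only wordfreq call used is `zipf_frequency(word, lang)`. Two passes are made so that only the term strings, never the vectors, are held in memory while ranking.

```python
import heapq
from typing import Callable, Iterable, Iterator, Set


def wordfreq_score(term: str) -> float:
    """Zipf-scale frequency of an English term, via wordfreq.

    Imported lazily so the filtering helpers work without wordfreq installed.
    Assumes multi-word terms are joined with underscores.
    """
    from wordfreq import zipf_frequency  # pip install wordfreq
    return zipf_frequency(term.replace("_", " "), "en")


def is_header(line: str) -> bool:
    """True for a 'count dimensions' header line such as '516782 300'."""
    fields = line.split()
    return len(fields) == 2 and all(f.isdigit() for f in fields)


def most_common_terms(
    lines: Iterable[str],
    score: Callable[[str], float],
    fraction: float = 0.2,
) -> Set[str]:
    """Pass 1: read only the terms and return the top `fraction` by score."""
    terms = [
        line.split(" ", 1)[0]
        for line in lines
        if line.strip() and not is_header(line)
    ]
    keep = max(1, int(len(terms) * fraction))
    return set(heapq.nlargest(keep, terms, key=score))


def filter_rows(lines: Iterable[str], keep: Set[str]) -> Iterator[str]:
    """Pass 2: stream the file again, yielding only rows for kept terms."""
    for line in lines:
        if line.split(" ", 1)[0] in keep:
            yield line


# Usage with a real file (the path is a placeholder):
#
#   with open("numberbatch-en.txt", encoding="utf-8") as f:
#       keep = most_common_terms(f, wordfreq_score, fraction=0.2)
#   with open("numberbatch-en.txt", encoding="utf-8") as f:
#       rows = list(filter_rows(f, keep))
```

The filtered rows no longer include the header line, so a loader that expects one would need a new `count dimensions` line written in front of them. For the multilingual release, where terms look like `/c/en/word`, the language code and word would have to be split out of the term before scoring.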

ncammarata commented 6 years ago

That's a surprisingly simple answer that I hadn't thought of. Thanks!