Closed pzelasko closed 6 years ago
Hi! The problem with sorting Numberbatch is that it includes phrases, and it's difficult to find information on their frequencies. However, if you're just interested in words, you could sort it yourself using the wordfreq library.
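Something along these lines might work as a starting point. It's only a rough sketch: it assumes your terms are plain English single words (not `/c/en/...` URIs or underscore-joined phrases), and `sort_by_frequency` and its `freq_fn` parameter are names I made up for illustration. The only wordfreq call used is `word_frequency(word, lang)`, which returns 0 for words it doesn't know.

```python
from typing import Callable, Iterable, List, Optional


def sort_by_frequency(
    terms: Iterable[str],
    freq_fn: Optional[Callable[[str], float]] = None,
    lang: str = "en",
) -> List[str]:
    """Return terms ordered from most to least frequent.

    freq_fn is a hypothetical hook so a different frequency source
    can be plugged in; by default wordfreq is used.
    """
    if freq_fn is None:
        # Third-party dependency: pip install wordfreq
        from wordfreq import word_frequency

        def freq_fn(term: str) -> float:
            # Unknown words get a frequency of 0 and sink to the end.
            return word_frequency(term, lang)

    # sorted() is stable, so terms with equal frequency keep their input order.
    return sorted(terms, key=freq_fn, reverse=True)
```

You'd call it with the list of terms read from the embedding file, then rewrite the file rows in the returned order.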
That's a fair point. Thanks for your suggestion, I'll try it out.
Hi guys! Do you think you could provide the `conceptnet-numberbatch` embeddings sorted by some kind of word frequency, similar to what GloVe and FastText do? In my research I limit the vocabulary to the K most frequent words so that the embedding lookup for pretrained embeddings doesn't eat all the GPU memory in my models, and the sort order used by the other embeddings makes this much easier.
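To illustrate what I mean, here is roughly how I truncate a frequency-sorted embedding file to the first K rows. This is just a sketch under my own assumptions: the file is plain text with one `word v1 v2 ...` row per line separated by single spaces, optionally preceded by a `count dim` header line, and `load_top_k` is a helper name I made up.

```python
from itertools import islice
from typing import Dict, Iterable, List


def load_top_k(
    lines: Iterable[str], k: int, has_header: bool = False
) -> Dict[str, List[float]]:
    """Read only the first k embedding rows from a frequency-sorted file."""
    it = iter(lines)
    if has_header:
        # Skip the "count dim" line that some text formats start with.
        next(it, None)

    vectors: Dict[str, List[float]] = {}
    for line in islice(it, k):
        word, *values = line.rstrip().split(" ")
        vectors[word] = [float(v) for v in values]
    return vectors
```

With a sorted file this stops reading after K rows, so the rest of the vocabulary never has to be loaded at all.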