commonsense / conceptnet-numberbatch


Using the pretrained term vectors #46

Closed dineshbvadhia closed 7 years ago

dineshbvadhia commented 7 years ago

First time using the pretrained term vectors, and I noticed they are distributed as a text file. The word2vec and Google News pretrained vectors can be loaded as a numpy array, which in turn can optionally be read from disk with `mmap_mode`. Given a term, you look up a dictionary or hash table to get the term's index, then extract the term vector from the numpy array using that index. I've used this approach successfully.

Can numberbatch be used in a similar way and if so how?
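The scheme described in the question (a term-to-index dict plus a memory-mapped numpy matrix) can be applied to any word2vec-format text file. A minimal sketch, in which the `convert` helper and all file names are hypothetical, not part of any release:

```python
import numpy as np

def convert(text_path, npy_path):
    """Parse a word2vec-format text file into a {term: row index} dict
    and a dense float32 matrix saved to disk for later memory-mapping.
    Hypothetical helper; both paths are illustrative."""
    index = {}
    rows = []
    with open(text_path, encoding="utf-8") as f:
        f.readline()  # skip the "<num_rows> <num_dims>" header line
        for i, line in enumerate(f):
            parts = line.rstrip().split(" ")
            index[parts[0]] = i
            rows.append(np.asarray(parts[1:], dtype=np.float32))
    np.save(npy_path, np.vstack(rows))
    return index

# On later runs, mmap_mode="r" reads rows from disk on demand
# instead of loading the whole matrix into RAM:
# vectors = np.load("numberbatch.npy", mmap_mode="r")
# vec = vectors[index["cat"]]
```

The one-time conversion cost is the text parse; after that, lookups touch only the rows you index.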

rspeer commented 7 years ago

Certainly, you can reformat the data however you want.

One thing I've found is that it's impractical to maintain downloadable releases of every format someone might need. It's also expensive: each separate download has to remain stored on a server for a long time so that links don't break. So when people want just the vectors, I provide them as the lowest common denominator, the word2vec/fastText format.

If you use the vectors via the conceptnet5 repository, you'll be working with them in the efficient HDF5 format. (You'll also get the benefit of using the ConceptNet graph to extend the vocabulary, which you can't get from the vectors alone.) But there isn't yet a good tutorial on how to work with the data in this form.
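For the HDF5 route, a minimal sketch of reading such a file with pandas, assuming the build produced a single DataFrame whose index holds ConceptNet URIs like `/c/en/cat`; the file name and this exact layout are assumptions, not a documented interface:

```python
import pandas as pd

def load_vectors(h5_path):
    """Load an HDF5 term-vector matrix as a pandas DataFrame.
    Assumed layout: the DataFrame index holds ConceptNet URIs and
    each row is that term's vector. Path is illustrative."""
    return pd.read_hdf(h5_path)

# Usage sketch:
# frame = load_vectors("numberbatch.h5")
# vec = frame.loc["/c/en/cat"]  # one term's vector as a Series
```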

The best place for questions that are not bug reports, by the way, is the Gitter chat: https://gitter.im/commonsense/conceptnet5