facebookresearch / StarSpace

Learning embeddings for classification, retrieval and ranking.
MIT License
3.94k stars 531 forks

Slow query_predict predictions #172

Closed dsalfran closed 6 years ago

dsalfran commented 6 years ago

Currently, after training a model with StarSpace we obtain two files: one with the model and one .tsv file with a dictionary of embedding vectors. My model was trained with 300 dimensions, and the vocabulary in the dictionary is about 300k words.

When calling the query_predict binary I must provide a basedocs file which contains approximately 300k sentences, one per line. My issue is that obtaining predictions takes too long.

  1. What exactly does the query_predict binary do?
  2. Which factors influence the speed of the predictions?
  3. Would it be possible to use just the dictionary of vectors with another software library, like fastText or gensim, to obtain faster predictions?
ledw commented 6 years ago

@dsalfran Hi, thanks for the question. query_predict takes each input, considers each sentence in basedocs, and makes a prediction. In other words, the number of sentences in basedocs affects the prediction speed. You can lower that number if you want faster predictions (in our paper we usually use 10k sentences as basedocs). For 3: yes, the .tsv format of the model is meant to be easy to use in other software that accepts standard tsv-format embedding files.
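
Since the .tsv file is just one word per line followed by its vector components, a minimal retrieval pass can be done outside StarSpace. The sketch below is an illustrative assumption, not part of StarSpace itself: it embeds a sentence as the mean of its word vectors and ranks candidate basedocs by cosine similarity in one vectorized NumPy pass (the mock TSV content and helper names are hypothetical).

```python
# Sketch: using StarSpace-style .tsv embeddings directly for retrieval,
# bypassing query_predict. Assumes lines of the form "word\tv1\tv2\t...".
import io
import numpy as np

def load_tsv(fileobj):
    """Load a word -> vector dictionary from a tsv-format embedding file."""
    vecs = {}
    for line in fileobj:
        parts = line.rstrip("\n").split("\t")
        vecs[parts[0]] = np.array(parts[1:], dtype=np.float32)
    return vecs

def embed(text, vecs):
    """Embed a sentence as the mean of its known word vectors."""
    words = [vecs[w] for w in text.split() if w in vecs]
    dim = len(next(iter(vecs.values())))
    return np.mean(words, axis=0) if words else np.zeros(dim, dtype=np.float32)

def rank(query, docs, vecs):
    """Return doc indices sorted by cosine similarity to the query."""
    q = embed(query, vecs)
    d = np.stack([embed(doc, vecs) for doc in docs])
    sims = d @ q / (np.linalg.norm(d, axis=1) * np.linalg.norm(q) + 1e-8)
    return np.argsort(-sims)

# Tiny mock of a model .tsv so the sketch is self-contained.
mock_tsv = io.StringIO("cat\t1.0\t0.0\ndog\t0.9\t0.1\ncar\t0.0\t1.0\n")
vecs = load_tsv(mock_tsv)
docs = ["car car", "cat dog"]
order = rank("cat", docs, vecs)
print(docs[order[0]])  # prints "cat dog", the closest basedoc to "cat"
```

Because the doc embeddings can be precomputed and stacked once, each query is a single matrix-vector product, which is typically much faster than rescoring 300k sentences per query inside query_predict.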