avidale / compress-fasttext

Tools for shrinking fastText models (in gensim format)
MIT License

Saving a compressed model in regular gensim format #27

Open DavidRamosSal opened 2 weeks ago

DavidRamosSal commented 2 weeks ago

Hi, is there a way to save a compressed model in regular gensim format? I can't install compress-fasttext where my application will run, so being able to call model.most_similar("word") with gensim alone would be great.

Thanks in advance!

avidale commented 2 days ago

Hi @DavidRamosSal! No, gensim doesn't support sparse models, and sparsity is the main source of compression in compress-fasttext, so compressed models can't be converted back to pure gensim format.

If you want a pure gensim model which is also small, the recommended approach is to train a small model from scratch, using either gensim or the original Fasttext package.
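For reference, a minimal sketch of that route with gensim 4.x is shown below. The corpus path and the specific hyperparameter values are illustrative assumptions, not from this thread; the general idea is that a smaller `vector_size`, a smaller `bucket` count, and a higher `min_count` are the main levers on model size, and the result loads with gensim alone.

```python
# Sketch: train a deliberately small FastText model directly in gensim 4.x,
# so it can be saved and loaded without compress-fasttext.
from gensim.models import FastText
from gensim.models.word2vec import LineSentence

# Hypothetical corpus: one tokenized sentence per line.
sentences = LineSentence("corpus.txt")

model = FastText(
    sentences,
    vector_size=50,    # fewer dimensions -> smaller embedding matrices
    bucket=100_000,    # fewer ngram hash buckets -> smaller ngram matrix
    min_count=10,      # drop rare words to shrink the vocabulary
    min_n=3, max_n=5,  # character ngram range
    epochs=5,
)

model.save("small_fasttext.model")            # plain gensim format
print(model.wv.most_similar("word", topn=5))  # works with gensim only
```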

In terms of the original Fasttext options (https://fasttext.cc/docs/en/options.html), those that affect model size most are: