DavidRamosSal opened this issue 2 weeks ago
Hi @DavidRamosSal!
No, gensim doesn't support sparse models, and sparsity is the main compressive force in compress-fasttext. Thus, compress-fasttext models aren't convertible back to pure gensim format.

If you want a pure gensim model that is also small, the recommended approach is to train a small model from scratch, using either gensim or the original fastText package.
In terms of the original fastText options (https://fasttext.cc/docs/en/options.html), those that affect model size most are:

- `bucket`: the number of trainable vectors for character n-grams. The default value is 2 million, but something as small as a few thousand is already workable.
- `minCount`: the minimal frequency for a word to be included in the vocabulary. The default value is 1, which means every single word in your dataset is included in the vocabulary; increase this value to a large integer to keep only the most frequent words instead. A good threshold depends on your training dataset.
- `dim`: dimensionality of the embeddings. The default value (100) is generally fine; you can experiment with reducing it further, but this may severely degrade downstream quality.
Hi, is there a way to save a compressed model in regular gensim format? I can't install compress-fasttext where my application will run, so being able to run `model.most_similar("word")` with gensim alone would be great. Thanks in advance!