sridhardev07 closed this issue 2 years ago
The documentation says:
This Python 3 package allows to compress fastText word embedding models (from the gensim package)
Therefore, the Facebook format is not supported; only the gensim format is.
@sridhardev07 I have looked at the vectors you suggest, and they are JUST WORD VECTORS. The whole idea of FastText compression is that we reuse subword vectors more efficiently, but in the file you link, all subword vectors have been discarded.
I suggest that you use https://dl.fbaipublicfiles.com/fasttext/vectors-crawl/cc.en.300.bin.gz instead: this model has valid subword vectors and therefore can be compressed.
For an example, please see this notebook. Or just use a tiny model for English that I have compressed: https://github.com/avidale/compress-fasttext/releases/download/v0.0.4/cc.en.300.compressed.bin.
I am trying to compress the fastText wiki model: https://dl.fbaipublicfiles.com/fasttext/vectors-english/wiki-news-300d-1M.vec.zip
I tried the first approach, load_facebook_model(), and got the error: NotImplementedError: Supervised fastText models are not supported
When I tried the second approach with gensim, I got: return _pickle.load(f, encoding='latin1') _pickle.UnpicklingError: invalid load key, '9'.