avidale / compress-fasttext

Tools for shrinking fastText models (in gensim format)
MIT License

gensim 4.0.0b0 #5

Closed (poke1024 closed this issue 2 years ago)

poke1024 commented 3 years ago

At the moment, this does not seem to work with gensim 4.0.0. Are there any plans to fix this?

import compress_fasttext
  File "/opt/miniconda3/envs/vectorian2021/lib/python3.8/site-packages/compress_fasttext/__init__.py", line 1, in <module>
    from compress_fasttext import compress, decomposition, evaluation, navec_like, prune, quantization, utils
  File "/opt/miniconda3/envs/vectorian2021/lib/python3.8/site-packages/compress_fasttext/compress.py", line 7, in <module>
    from .prune import prune_ngrams, prune_vocab, count_buckets, RowSparseMatrix
  File "/opt/miniconda3/envs/vectorian2021/lib/python3.8/site-packages/compress_fasttext/prune.py", line 7, in <module>
    from gensim.models.utils_any2vec import ft_ngram_hashes
ModuleNotFoundError: No module named 'gensim.models.utils_any2vec'

avidale commented 3 years ago

Yes, thank you for the report! I'll try to update it.

desilinguist commented 2 years ago

Any updates on this? We were hoping to use this awesome library in our stack, but we are using v4+ of gensim.

avidale commented 2 years ago

Actually, the migration to gensim 4.0.0+ has turned out to be harder than expected. Gensim has completely rewritten the internal API of its models, and it will take a lot of duct tape to adapt the library to these changes while still supporting the models that have already been compressed and published.
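
One small piece of that duct tape could be an import shim for the helper named in the traceback above. The sketch below is only an illustration, not the code from the pull request: import_first is a hypothetical helper, and gensim.models.fasttext as the gensim 4 home of ft_ngram_hashes is an assumption that should be checked against the installed version. The function's signature may also differ between gensim versions.

```python
import importlib


def import_first(name, module_paths):
    """Return attribute `name` from the first module in `module_paths` that provides it."""
    for path in module_paths:
        try:
            module = importlib.import_module(path)
        except ImportError:
            # This module does not exist in the current installation; try the next one.
            continue
        if hasattr(module, name):
            return getattr(module, name)
    raise ImportError(f"{name!r} not found in any of: {', '.join(module_paths)}")


# Old location first (taken from the traceback), then the ASSUMED gensim 4 location.
NGRAM_HASH_MODULES = ["gensim.models.utils_any2vec", "gensim.models.fasttext"]

try:
    ft_ngram_hashes = import_first("ft_ngram_hashes", NGRAM_HASH_MODULES)
except ImportError:
    # gensim is not installed, or it exposes the helper somewhere else.
    ft_ngram_hashes = None
```

A shim like this only covers the import itself. The rewritten model internals mentioned above would still need separate handling.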

@poke1024 @desilinguist, I would appreciate it if you could help write, or at least review, a pull request with this migration: https://github.com/avidale/compress-fasttext/pull/7

avidale commented 2 years ago

You can test it with the new Gensim by running pip install git+https://github.com/avidale/compress-fasttext@gensim-4, but please make sure that you are using models that are supported by the new Gensim.

desilinguist commented 2 years ago

I'll be happy to contribute and/or review.

avidale commented 2 years ago

@desilinguist @poke1024 I have updated the library: version 0.1.0 is based on gensim>=4.0.0. However, backward compatibility is lost for some previously compressed models. You can still use them with compress-fasttext==0.0.7.
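
To make the version split concrete, here is a rough sketch of how one might choose between the two releases based on the installed gensim. The helper names recommended_pin and installed_gensim_version are made up for this example, and treating every release from 0.1.0 onward as gensim 4 compatible is an assumption rather than something stated in this thread.

```python
from importlib import metadata


def recommended_pin(gensim_version):
    """Map a gensim version string to a compress-fasttext requirement (hypothetical helper)."""
    major = int(gensim_version.split(".")[0])
    if major >= 4:
        # 0.1.0 is based on gensim>=4.0.0; later releases are ASSUMED to stay compatible.
        return "compress-fasttext>=0.1.0"
    # Older gensim, e.g. for previously compressed models that lost compatibility.
    return "compress-fasttext==0.0.7"


def installed_gensim_version():
    """Return the installed gensim version string, or None if gensim is not installed."""
    try:
        return metadata.version("gensim")
    except metadata.PackageNotFoundError:
        return None


version = installed_gensim_version()
if version is not None:
    print(recommended_pin(version))
```

Only the major version number is inspected, so a pre-release string such as 4.0.0b0 is treated the same as 4.0.0.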

Please give the new version a try and open a new issue if anything goes wrong.