MilaNLProc / contextualized-topic-models

A Python package to run contextualized topic modeling. CTMs combine contextualized embeddings (e.g., BERT) with topic models to get coherent topics. Published at EACL and ACL 2021 (Bianchi et al.).
MIT License

vocab attribute used for CoherenceWordEmbeddings removed in gensim 4.x #114

Closed: rbroc closed this issue 2 years ago

rbroc commented 2 years ago

Description

https://github.com/MilaNLProc/contextualized-topic-models/blob/f3225055440b2ebf3bedb7143868954f1e1478d7/contextualized_topic_models/evaluation/measures.py#L166

This line throws an error with gensim>=4.0.0:

AttributeError: The vocab attribute was removed from KeyedVector in Gensim 4.0.0.
Use KeyedVector's .key_to_index dict, .index_to_key list, and methods .get_vecattr(key, attr) and .set_vecattr(key, attr, new_val) instead.
See https://github.com/RaRe-Technologies/gensim/wiki/Migrating-from-Gensim-3.x-to-4

There does not seem to be any attribute that works across both 3.x and 4.x, so this may mean either pinning gensim to 3.x, or bumping to 4.x and replacing self.wv.vocab with self.wv.index_to_key.
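If supporting both major versions turns out to be desirable, one possible workaround is a small shim that looks for the 4.x attribute first and falls back to the 3.x one. This is only a sketch, not the fix the maintainers adopted: `vocab_keys`, `FakeV3`, and `FakeV4` are hypothetical names, and the two fake classes are stand-ins for a gensim KeyedVectors object so the snippet runs without gensim installed. It uses `key_to_index` (a dict, named in the error message above) rather than `index_to_key` (a list), on the assumption that the failing line only needs membership tests.

```python
def vocab_keys(wv):
    """Return a container that supports `word in ...` for a KeyedVectors-like object.

    Hypothetical helper: prefers `key_to_index` (gensim >= 4.0) and falls back
    to `vocab` (gensim 3.x).
    """
    key_to_index = getattr(wv, "key_to_index", None)
    if key_to_index is not None:
        return key_to_index
    return wv.vocab


class FakeV3:
    """Stand-in mimicking the gensim 3.x attribute layout (assumption)."""

    def __init__(self, words):
        self.vocab = {w: object() for w in words}


class FakeV4:
    """Stand-in mimicking the gensim 4.x attribute layout (assumption)."""

    def __init__(self, words):
        self.index_to_key = list(words)
        self.key_to_index = {w: i for i, w in enumerate(words)}

    @property
    def vocab(self):
        # Mirrors the behaviour reported in this issue.
        raise AttributeError(
            "The vocab attribute was removed from KeyedVector in Gensim 4.0.0."
        )


words = ["topic", "model"]
for wv in (FakeV3(words), FakeV4(words)):
    keys = vocab_keys(wv)
    print(type(wv).__name__, "topic" in keys, "missing" in keys)
```

With a helper like this, a check written as `word in self.wv.vocab` could become `word in vocab_keys(self.wv)`. Whether that is preferable to simply pinning or bumping gensim depends on how many other 3.x-only calls the package relies on.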

vinid commented 2 years ago

Thanks!

This probably goes with #113. We need to update all our dependencies.

Thanks a lot!