piskvorky / gensim

Topic Modelling for Humans
https://radimrehurek.com/gensim
GNU Lesser General Public License v2.1
15.7k stars 4.38k forks source link

Reduce gensim surface area #2852

Closed mpenkov closed 3 years ago

mpenkov commented 4 years ago

Are there subpackages/submodules that we're not maintaining anymore, and could remove?

The goal is to reduce the maintenance burden of the project.

piskvorky commented 4 years ago

@mpenkov any other modules to be removed?

I'd like to finalize this even before the pre-release, so the documentation is up-to-date.

Plus document what was removed + why, either here or in the migration notes. Thanks.

piskvorky commented 4 years ago

PoincareModel is another such candidate. Not touched by the recent refactoring and upgrades, it's probably out of date and useless.

mpenkov commented 3 years ago

@piskvorky I had a look at Poincare model. Yes, it's old and hasn't changed much recently, but it's fairly well tested (I eyeballed the tests and they make sense to me, and they all pass). Do you really want to remove it?

piskvorky commented 3 years ago

No, if it's good, we can keep it.

I don't remember many user complaints / bug reports about the Poincare model, although that may be simply because nobody uses it.

mpenkov commented 3 years ago

@piskvorky Perhaps we can remove just the visualization subpackage? That doesn't look like it belongs in a library like gensim

https://github.com/RaRe-Technologies/gensim/tree/develop/gensim/viz

piskvorky commented 3 years ago

viz seems to contain a single module, with two functions related to Poincare embeddings. So if we keep Poincare, maybe move those two functions to its module? (without introducing new dependencies, so import-on-demand for external libs)

Or move those functions to the particular documentation / tutorial that needs viz? Or remove them altogether? I really don't know the context, sorry. But it's a silly subpackage for sure.

mpenkov commented 3 years ago

The viz stuff is untested and requires plotly as a dependency (which is in setup.py, meaning all installs include it). My suggestion is to remove viz altogether, and remove the dependency. Visualization is too use case-specific, everyone will want to do it differently anyway.

piskvorky commented 3 years ago

Yes, but why is it there? Used by some tutorial / notebook / guide?

mpenkov commented 3 years ago

As far as I can tell, it isn't used or mentioned anywhere outside of the automatically generated documentation.

piskvorky commented 3 years ago

Hm, double weird. I guess we can just remove it then.

requires plotly as a dependency (which is in setup.py, meaning all installs include it)

You mean all Gensim installs require plotly, because of viz? I hope not.

mpenkov commented 3 years ago

No, on second examination, only docs builds require it, but nevertheless, it isn't something we need to keep.

mpenkov commented 3 years ago

OK, I think I've addressed all the candidates we discussed the last time we talked about this. Is there anything else we need to remove?

mpenkov commented 3 years ago

Misha TODO

mpenkov commented 3 years ago

Related: #3077