MaartenGr / BERTopic

Leveraging BERT and c-TF-IDF to create easily interpretable topics.
https://maartengr.github.io/BERTopic/
MIT License

Switch from setup.py to pyproject.toml #1969

Closed afuetterer closed 3 months ago

afuetterer commented 4 months ago

The pyproject.toml file is described in PEP 621. It is the recommended way to package Python projects these days.

Please have a look at my fork's version of this file: https://github.com/afuetterer/BERTopic/blob/1969-toml/pyproject.toml

You are still able to do a local pip install . or pip install bertopic[gensim] from PyPI.

Would you accept a PR to convert BERTopic to use pyproject.toml? Please let me know what you think.

Refs:

MaartenGr commented 4 months ago

Thanks for the suggestion! What would you think is the main added benefit of doing so in the context of this package? Also, I remember it did not support editable installs a while back. Any idea if that's still the case?

afuetterer commented 4 months ago

First of all: Editable installs are possible, I use them all the time.

I think the added benefits would be:

e.g.

[project.optional-dependencies]
dev = [
    "bertopic[docs,test]", # <- no repeating of mkdocs, pytest etc
]
docs = [
    "mkdocs==1.5.3",
    "mkdocs-material==9.5.18",
    "mkdocstrings-python==1.10.0",
    "mkdocstrings==0.24.3",
]
test = [
    "pytest>=5.4.3",
    "pytest-cov>=2.6.1",
]
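
To make the "no repeating" point concrete, here is a minimal Python sketch of how the self-referencing dev extra ends up covering everything in docs and test. It mirrors the table above as a plain dict so it runs without a TOML parser (on Python 3.11+ the same data could be loaded with the standard-library tomllib). The helper name expand_extra is hypothetical, and this is only an illustration of the effect, not how pip actually resolves extras.

```python
import re

# Mirrors the optional-dependencies table quoted above, written as a
# plain dict so the sketch runs without parsing TOML.
PROJECT_NAME = "bertopic"
OPTIONAL_DEPENDENCIES = {
    "dev": ["bertopic[docs,test]"],
    "docs": [
        "mkdocs==1.5.3",
        "mkdocs-material==9.5.18",
        "mkdocstrings-python==1.10.0",
        "mkdocstrings==0.24.3",
    ],
    "test": ["pytest>=5.4.3", "pytest-cov>=2.6.1"],
}

# Matches a self-reference such as "bertopic[docs,test]".
SELF_REF = re.compile(rf"^{re.escape(PROJECT_NAME)}\[([^\]]+)\]$")


def expand_extra(name, table=OPTIONAL_DEPENDENCIES, _seen=None):
    """Hypothetical helper: flatten one extra into concrete requirements."""
    seen = set() if _seen is None else _seen
    if name in seen:
        # Guard against extras that reference each other in a cycle.
        return []
    seen.add(name)
    resolved = []
    for requirement in table[name]:
        match = SELF_REF.match(requirement.strip())
        if match:
            # Self-reference: recurse into each extra named in the brackets.
            for extra in match.group(1).split(","):
                resolved.extend(expand_extra(extra.strip(), table, seen))
        else:
            resolved.append(requirement)
    return resolved


if __name__ == "__main__":
    for requirement in expand_extra("dev"):
        print(requirement)
```

With this layout, bumping a pin in docs or test automatically carries over to dev, since dev never lists the packages itself.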

From llama3's point of view:

What do you think? Should I submit a PR?

MaartenGr commented 4 months ago

> First of all: Editable installs are possible, I use them all the time.

Ah, you're right! It seems that the last time I tried was quite a while ago...

> What do you think? Should I submit a PR?

Yeah, sounds great! Would be a nice experience working with it in this project. It's not related to pyproject.toml specifically, but at some point I would love to have some sort of pip install bertopic[minimal] that only contains the bare minimum of dependencies (even removing HDBSCAN and UMAP). But that is a bit out of scope and would require major changes...

> From llama3's point of view:

I generally tend to steer away from LLM-based opinions unless they are backed by expert opinions. But thanks for sharing it!

afuetterer commented 4 months ago

Alright, let me submit a PR then. I will just "translate" what is given in the setup.py; modifications of dependency groups should be a separate endeavor, I think.

I found some of the points from llama3 valid, and just wanted to add them here for "completeness". For example, I think the "readability" point is fair.

afuetterer commented 3 months ago

Done via #1978