Closed afuetterer closed 4 months ago
I love the rustification of all these packages! Haven't used uv
myself but this much of a speed up is more than welcome. The only thing I need to be confirmed is that the process of installing BERTopic is technically not the same as through pip
. Users are advised using pip
directly to install BERTopic which might lead to a different set of dependencies although I doubt this will be the case in practice.
Having said that, when we test using uv
but users install using pip
are we then sure the use case of users (pip
) is properly tested?
If you look at a car or a bike, rust is usually a bad thing. But in this case, it is really great with the massive performance improvements.
Actually I am not sure about your question. Your package requires pandas>=1.1.5
, which is a really old lower bound. If pip and uv use different resolver strategies, it might indeed lead to different versions of e.g. pandas. But is that a big problem?
I don't know, its your call.
If users use uv pip install bertopic
, they will have faster install times, even if you don't use it in CI.
Actually I am not sure about your question. Your package requires pandas>=1.1.5, which is a really old lower bound. If pip and uv use different resolver strategies, it might indeed lead to different versions of e.g. pandas. But is that a big problem?
It might be a problem if one of those versions results in bugs that we haven't tested through the unit tests. If uv
gives a different version of pandas than pip
, then it is technically possible that BERTopic installed with pip
might be broken somewhere that we haven't detected yet. Over the last years, BERTopic has been broken a couple of times with newer versions of dependencies whether it was due to API changes (even when it weren't major new versions) or due to wheels that were not correctly built.
If users use uv pip install bertopic, they will have faster install times, even if you don't use it in CI.
That's true, the speed-up is definitely there. I am just wondering how many users will actually be doing this. In my experience, >90% of users will just run pip install bertopic
also because many users are not that familiar with python.
I'm gonna be honest with you here, I'm not sure what the chance actually is that the testing pipeline using uv
is different from a bare pip
and the extend to which. That makes it also difficult for me to understand the actual impact.
Yes, sure, things can break, if the API of a dependency changes, and there is no upper bound in place.
It's your decision. Happy to close this, if not desired. Just wanted to show the difference in installation time.
Let's close it for now and re-open when we run into issues with the speed up of the pipeline or if uv
becomes more of a staple amongst most users.
Sure, no problem.
What does this PR do?
This is a proposal to use the uv for installing python packages in CI. It concurrently downloads the packages from pypi and improves the time of the install step significantly. From 3+ minutes to 30+ seconds.
What do you think?
Before submitting