astral-sh / uv

An extremely fast Python package and project manager, written in Rust.
https://docs.astral.sh/uv
Apache License 2.0
25.4k stars 739 forks source link

Make metadata caching of local projects opt-in when it is dynamic? #7282

Open sbidoul opened 1 month ago

sbidoul commented 1 month ago

I regularly hit caching gotchas with projects using dynamic dependencies.

The last one, I think, was uv lock not detecting changes even though tool.uv.resinstall-package was set. I had to use uv lock --refresh-package, and I could not set tool.uv.refresh-package.

These surprises drive my ask for e.g. https://github.com/astral-sh/uv/pull/7268 But even with that we'll still need to think about setting cache-keys in the first place.

I was wondering if it would make sense to cache less aggressively if any metadata is dynamic (either because the project table is absent, or project.dynamic is set)? Or not cache at all in ~such cases~ presence of dynamic metadata, unless cache-keys is set?

charliermarsh commented 1 month ago

The breaking point for me on this was uv run, which becomes very hard to use with dynamic projects if we're rebuilding and reinstalling them (typically needlessly) on every command. I don't want to have different caching semantics for uv run vs. other commands, so I decided to shift the burden towards requiring users to explicitly mark packages that they want to "always rebuild, no matter what".

An alternative is that we could consider adding a dedicated setting to the pyproject.toml like tool.uv.dynamic to opt a given package into this behavior (rather than having to add reinstall-package somewhere else).

sbidoul commented 1 month ago

Or could uv run warn that 'the project is being rebuilt because there is dynamic metadata, consider setting cache-keys`?

So it would be correct by default, with an opt-in mechanism for caching.

Most people, who don't use dynamic metadata, would never see the warning. And for people using dynamic metadata that would be a friendly hint to configure their project for correct caching.

sbidoul commented 1 month ago

tool.uv.dynamic

UX-wise is it not a bit strange, since there is already project.dynamic ?

sbidoul commented 1 month ago

And by the way, sorry to appear insistent on this. The thing is that I'm a heavy user of dynamic metadata, but not a packaging newbie, and since I regularly fall in the traps, I suspect there is some UX issue that might be important.

I may not have said it yet, but I hugely appreciate the fantastic work you are doing!

charliermarsh commented 1 month ago

Yeah, the answer here may indeed be to have different semantics for uv run vs. other commands. I'm worried that it would be confusing for users, but perhaps it's more confusing as-is. (Thank you for the kind words and no worries at all. I'm not happy with the state of dynamic metadata in uv either :))

mmerickel commented 3 weeks ago

As the author of #2844 which is pulled into here - I'm just frankly surprised that uv run has so many side effects such that it would be a blocker for decisions here. My focus is on uv sync and it's "obvious" to me that uv sync should reinstall packages that are installed in editable mode every single time uv sync is executed. That's what editable means, and how pip install -e has worked forever. I'm completely fine with uv run having different semantics here because it's simply not a command that should guarantee that everything is perfectly up to date in my mind. It's cool if it can provide some sanity checks or basic things, but uv sync is the "one true thing that should get the env in the right state" and that means that it should reinstall editable packages.

mmerickel commented 3 weeks ago

To extend on that previous point a little bit with some anecdotes... the current recommended solution is for me to define tool.uv.cache-keys on each project, and in my case that means basically a cache-key of **/* which is going to run into performance problems eventually for uv run as well so we're kind of back where we started where uv run is just dictating too much of the contract here imo.

charliermarsh commented 3 weeks ago

Is it that you want all editables to be reinstalled every time? Or editables with dynamic metadata?

charliermarsh commented 3 weeks ago

But in that case, why not just use reinstall-package for each of those packages?

mmerickel commented 3 weeks ago

All editable packages imo. Python doesn't offer a separation for this aspect of the problem to claim it's just metadata. The issue here is with MANIFEST.in that is compiled down at build-time into the list of assets to copy into the project, and that only is matched AFTER the build happens and all the files are generated such that MANIFEST.in can match on them.

In my concrete example we have a webassets folder living externally to the python package. We have a custom setuptools hook that will compile the assets and put them into the python package in the right location. Then we have a MANIFEST.in that says to pull those assets in as package-data. And our devs need that to happen every time they run uv sync after a git pull or changing the files themselves locally.

mmerickel commented 3 weeks ago

I can do it with cache-keys to be clear but it's globbing a lot of files that 1) I have to tediously maintain and 2) could cause performance problems for things like uv run as those globs get too large. I'd rather a default where I didn't need cache-keys, and uv run did what it does today (where it can be slightly out of sync because I didn't define cache-keys) but uv sync reinstalled editable packages and always left me in a good spot.

mmerickel commented 3 weeks ago

I confirmed that tool.uv.cache-keys does work as we thought it would. It does have the downsides I mentioned above and does require each project to define how it should work with uv or the wrong thing happens. My desired behavior would be a different default. Basically if you don't define tool.uv.cache-keys on a project and it's editable then uv sync should reinstall it. If you do define cache-keys then that is a fast-path for uv to use.