Open carderne opened 2 weeks ago
To expand on the Docker image, this is what I would want to do:
FROM python:3.12.5-slim-bookworm AS python-builder
COPY --from=ghcr.io/astral-sh/uv:latest /uv /bin/uv
# Create a venv at a well-known location so it can be COPY'd later
RUN uv venv /opt/python
# Tell uv to use that venv
ENV UV_PYTHON=/opt/python
WORKDIR /app
COPY uv.lock pyproject.toml /app/
# No need to COPY pyproject.toml of libs - they're all well-specified in uv.lock anyway
# Install the app without all workspace members - ie all 3rd party dependencies
RUN uv sync --locked --no-install-workspace --package=server
COPY packages /app/packages
# Install 1st party dependencies, but only those that are needed
# Also pass the fictional `--no-editable` flag to actually bundle them into the venv
RUN uv sync --locked --no-editable --package=server
FROM python:3.12.5-slim-bookworm AS runtime
# Copy the venv that has all 3rd party and 1st party dependencies, ready for use
COPY --from=python-builder /opt/python /opt/python
ENV PATH="/opt/python/bin:$PATH"
I can't do that because:
1. uv sync --locked --no-install-workspace --package=server complains because server isn't there (nor are its dependencies anyway). uv.lock already has all the information needed to resolve this: it contains workspace members, so uv can know of server, and of its dependencies, without all pyproject.toml files needing to be there
2. --no-editable - uv will install workspace members as editable packages, so COPYing the venv in the final stage won't work because the packages pointed at won't be there
3. uv sync doesn't support targeting a venv (although that's under discussion from what I've gathered)
(1) is easy to resolve, would that help?
(1) Yes, that would be great! (I'll start working on a patch but I suspect I'll still be noodling by the time you merge yours.)
For (2), I suspect the only generally useful solution would be to encode the package-specific dependency tree in uv.lock (like pnpm-lock.yaml) rather than calculating it on the fly. That might make it harder to dovetail with PEP 751, but from what I understand you're planning to support pylock as an output format that uv won't use internally, so maybe not important.
For (2), we're thinking of perhaps a dedicated command like uv bundle that would handle a lot of the defaults that you want for this kind of workflow. But otherwise a --no-editable or similar seems reasonable to me.
Let's track (2) in https://github.com/astral-sh/uv/issues/5792.
I think adding a tool.uv.virtual: bool flag (like Rye has) would be a great step. In that case the root is not a package and can't be built.
How is this different than tool.uv.package = false?
I think that does what you're describing?
--frozen --package.
Sorry you're moving too quickly for me!
You're right that package=false does what is needed. It allows a very minimal root pyproject.toml that looks like the one below. The only downside is that in order for uv sync to sync the entire workspace, you need to add each package to project.dependencies and to tool.uv.sources and in tool.uv.workspace.members. I should have been more explicit in my first message that what I think is needed here is uv sync --the-entire-workspace. (This is the default behaviour in Rye and was the default in uv<0.4.0.)
Alternatively a more explicit flag in the config like tool.uv.workspace.this-project-is-virtual-so-sync-all-members-by-default: bool.
[project]
name = "monorepo-root"
version = "0"
requires-python = "==3.12"
dependencies = ["mylib", "myserver"]
[tool.uv]
dev-dependencies = []
package = false
[tool.uv.sources]
mylib = { workspace = true }
myserver = { workspace = true }
[tool.uv.workspace]
members = ["packages/mylib", "packages/myserver"]
I don't really understand how #6943 helps but seems sensible anyway. I see three obvious ways (not uv specific) of getting stuff into a Docker image:
1. requirements.txt, install those, then COPY in all needed packages.
2. requirements.txt. Then create a site-packages and COPY that in. I assume this is what the --non-editable is about in #5792.
3. requirements.txt. Then create sdists/wheels from the packages (the plugin I mentioned).
All of these require a little pre-Docker script to generate the requirements.txt which isn't ideal but fine. Assuming I've understood correctly on (2) above then I'll move any more comments I have to that Issue.
For (2), I thought you wanted to do this:
FROM python:3.12.5-slim-bookworm
COPY --from=ghcr.io/astral-sh/uv:latest /uv /bin/uv
WORKDIR /app
COPY uv.lock pyproject.toml /app/
# NB: doesn't work as the server package isn't there!
RUN uv sync --locked --no-install-project --package=server
COPY packages /app/packages
RUN uv sync --locked --package=server
ENV PATH="/app/.venv/bin:$PATH"
This now works as expected if you use frozen rather than locked.
This is also causing some issues for me with 0.4.0+. Locally sync works fine
> uv sync
Resolved 341 packages in 76ms
Audited 307 packages in 3ms
But when adding --frozen, which we use in CI, uv ignores the workspace members
> uv sync --frozen
Uninstalled 97 packages in 7.57s
...
Audited 210 packages in 0.25ms
The different dependency resolution behavior depending on whether I pass --frozen is unexpected.
Does your root pyproject.toml have a [project] section?
No, just a "virtual" workspace, effectively this.
[tool.uv]
dev-dependencies = [
"...",
]
[tool.uv.workspace]
members = ['libs/*', 'sandbox']
I can look into why you're seeing differences (it sounds like a bug!). I'd suggest migrating to a virtual project though, i.e., adding a [project] table (but not a build-system) to your root pyproject.toml. We redesigned those in v0.4.0 and the version above is now considered legacy.
Adding the [project] section as suggested now shows consistent behavior with or without --frozen. I was able to get back to the desired sync behavior by adding the workspace members to the project dependencies and a [tool.uv.sources] section enumerating the workspace members. More verbose, but more consistent. Thanks for the help!
Great! Still gonna see if I can track down and fix that bug :)
What @b-phi is talking about is exactly what I mentioned in (1) of my comment up above. Basically you have to add each workspace member in three places. Would be great if that could be made unnecessary (in one of the ways I suggested or some other way).
On (2) the Dockerfiles, the command you added helps, but it still doesn't work if there are dependencies between packages and you haven't yet copied in the files. There's an MRE here. It fails when trying to run the --no-install-project sync because packages/server wants packages/greeter but it's not there. Currently the only way around this (afaict) is to pre-export a requirements.txt and use that.
I'm confused on (2). We have --no-install-workspace that does exactly this, right?
Oh of course, sorry. So (2) I think is resolved. The remaining stuff about getting the right files into the Dockerfile is not really uv's problem. (Although it could be helped by stuff like --non-editable.)
The main point of this issue is (1) but I'm very happy to wait for you to figure out an approach that you're happy with. But I think it would be great to resolve.
👍 Part of what I'm hearing here too is that we need more + better documentation for this stuff.
Yeah I don’t blame you, it’s moving really fast.
EDIT: adding this here to make it clear to any future travellers why this issue is still open.
The question is whether the sync command could have an --all-packages flag added (or some similar name).
👍 Part of what I'm hearing here too is that we need more + better documentation for this stuff.
I'm probably biased, but it seems to me that a monorepo with possibly interdependent libs, and independently buildable (most of the time into Docker images) apps is a common pattern - at least it's what workspaces promote. With that in mind, it would indeed be great to have documentation about how Astral intends us to use uv to manage such a repo and such builds. Until now, it feels like I'm hacking my way to a satisfying set-up, although uv maintainers obviously have a "right way" in mind.
That said, I must say I'm having an amazing experience with uv (and ruff, and Astral in general), and that I'll advocate to use it in all the projects I maintain!
@Afoucaul Is there anything else you think is missing apart from a sync --all-packages (if you agree that is needed) and improved monorepo/workspace docs?
Is it possible for a package, virtual project or workspace to depend on another workspace, or on a package in a workspace?
I'm thinking of the case common in data science where we have a set of packages developed in a workspace (let's say numpy and scipy are the packages developed in WRKSPC) and we don't really publish them to a repository or anywhere.
At some point I want to start a data science project, so I will create a virtual package with some scripts that require scipy, which in turn depends on the workspace version of numpy.
How can I express this dependency?
@Afoucaul Is there anything else you think is missing apart from a sync --all-packages (if you agree that is needed) and improved monorepo/workspace docs?
Jumping in here, managing multiple environments would be very helpful. In our repo, some sub-packages have heavy ML dependencies, others have linux-only dependencies. Ideally I would be able to manage multiple environments for different use cases, e.g. lightweight venv on OSX host, a linux venv that I use via docker, a heavier ML env etc.
@Afoucaul Is there anything else you think is missing apart from a sync --all-packages (if you agree that is needed) and improved monorepo/workspace docs?
Jumping in here, managing multiple environments would be very helpful. In our repo, some sub-packages have heavy ML dependencies, others have linux-only dependencies. Ideally I would be able to manage multiple environments for different use cases, e.g. lightweight venv on OSX host, a linux venv that I use via docker, a heavier ML env etc.
I've managed to do that by defining apps as packages (that you target with --package), and extras.
For instance, I've created an ai package that needs tensorflow, which I added with uv add --package ai --optional ml tensorflow. That way, a package that needs ai but never actually reaches the part where tensorflow is imported can depend on it via uv add --package consumer ai, whereas a package that actually needs tensorflow would declare it via uv add --package consumer ai[ml] (note ai vs ai[ml]).
That's actually very useful to install a venv on an ARM macbook for a project that needs tensorflow somewhere - you run uv sync without --extra ml, so you don't end up with tensorflow, but everything else - good enough for developing.
Then in your actual runtime, you do uv sync --all-extras (assuming all extras are prod, all dev deps are declared as such) to get everything you need.
If you need very specific environments that are orthogonal to apps, you could create one with uv init environments/my-env, add deps via uv add --package my-env ai, and then uv sync --package my-env.
@Afoucaul Is there anything else you think is missing apart from a sync --all-packages (if you agree that is needed)
I've resolved that point by adding all local packages to the root package (uv add foo where foo is a workspace member), but I do agree it's error-prone and requires an extra command each time you create a new package.
Is it possible for a package, virtual project or workspace to depend on another workspace, or on a package in a workspace?
I'm thinking of the case common in data science where we have a set of packages developed in a workspace (let's say numpy and scipy are the packages developed in WRKSPC) and we don't really publish them to a repository or anywhere.
At some point I want to start a data science project, so I will create a virtual package with some scripts that require scipy, which in turn depends on the workspace version of numpy. How can I express this dependency?
There's only one lockfile, so if at the root of your monorepo you run uv init projects/testing-around-some-stuff then uv add --package testing-around-some-stuff scipy you'll end up with the workspace's scipy. There are some caveats though: if you try to use in testing-around-some-stuff a different version of some package that's already specified in the uv.lock, either you'd be unable to do so because of the set of constraints, or you could and that would update that package's version for the whole workspace - not ideal either.
I'm not sure how one would create a project in a workspace and specify that it should always respect the workspace's requirements and never change them.
One thing preventing us from switching over our monorepo to uv is that it's really hard to tell in CI which projects in a workspace actually changed when uv.lock changes.
We have many apps deployed from a single monorepo and don't want to have to build docker images for all of them every time uv.lock changes (e.g. someone adding a new project or library to the workspace)
@rokos-angus one way around that would be to have a git-hook/CI step/something that runs uv export ... for each package and you diff those files to see what needs to be built.
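The diffing step suggested above could look roughly like this. It is a sketch under assumptions: a CI job is assumed to have already written one uv export output per package for both the base commit and the current commit and to have read them into dictionaries keyed by package name. The function names fingerprint and packages_to_rebuild are hypothetical.

```python
import hashlib


def fingerprint(requirements_text: str) -> str:
    """Hash an exported requirements file, ignoring comments and blank lines."""
    kept = []
    for line in requirements_text.splitlines():
        stripped = line.strip()
        if stripped and not stripped.startswith("#"):
            kept.append(stripped)
    return hashlib.sha256("\n".join(kept).encode()).hexdigest()


def packages_to_rebuild(old: dict[str, str], new: dict[str, str]) -> set[str]:
    """Return package names whose exported requirements are new or changed."""
    changed = set()
    for name, text in new.items():
        if name not in old or fingerprint(old[name]) != fingerprint(text):
            changed.add(name)
    return changed
```

Comment lines are skipped so that a changed header alone does not trigger a rebuild. This only detects changes in third-party pins; changes to first-party source files would still need a separate path-based check.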
where i work we use http://github.com/josh-project/josh to figure out what changed (disclaimer: i'm a contributor in that project)
with that said I think CI is a separate problem. when it comes to python, here's what i've been able to narrow down the requirements to:
1) it should be possible to make packages depend on packages with relative paths
2) at the same time, it should be possible to build packages for a scenario where they are eventually uploaded to a package registry, so relative paths won't work there
3) every package has multiple sets of dependencies ("environments"). this can be either due to feature switches or for example because CI needs more deps to run tests
how we solved it for us is a custom script connected via https://github.com/recogni/setuptools-monorepo that resolves those dependencies in a desired way depending on context (for example either to file:// path or to a name in the registry). this way we can have monorepo but we can also publish wheels from this monorepo and ship them. but i would really like to see a more "native" solution
agree to the point of having a shared lockfile, this is often a pain point
@vlad-ivanov-name I'm slowly working on something similar at https://github.com/carderne/una albeit uv-specific and Hatch not setuptools. It figures out where to find files using uv's { workspace = true } config rather than the URL.
I haven't really thought about your point (2). Nor much for (3), but my assumption is that for testing you'd use uv sync and for deployment use the plugin.
I've compiled an example that works for my purposes that might help some folks looking for a monorepo setup using uv.
@JasperHG90 that link is 404 for me.
https://github.com/DavidVujic/python-polylith-example-uv is another example which I think supports this or similar use cases. @DavidVujic
https://github.com/DavidVujic/python-polylith-example-uv is another example which I think supports this or similar use cases.
Thanks for the mention!
Yes, if I have understood the things talked about in this issue correctly I think that Polylith in combination with uv might be helpful. It's an architecture for monorepos originating from the Clojure community. There's tooling support, and I'm the maintainer of the Python tooling. It works well with uv and here's the docs if you want to know more.
I made https://github.com/JuanoD/uv-mono as an example repo. Feel free to correct me if something is wrong
@JasperHG90 that link is 404 for me.
Sorry was ill these past days 🦠. Is fixed now! Thanks for the heads up.
I've put a decent amount of effort trying to figure out a workable "monorepo" solution with pip-tools/Rye/etc and now uv. What I mean by a monorepo:
I'm packaging a few thoughts into this issue as I think they're all related, but happy to split things out if any portions of this are more likely to be worked on than others.
Should uv support this?
I think yes. Pants/Bazel/etc are a big step up in complexity and lose a lot of nice UX. uv is shaping up as the de facto Python tool and I think this is a common pattern for medium-sized teams that are trying to move past multirepo but don't want more sophisticated tooling. If you (uv maintainers) are unconvinced (but convince-able), I'm happy to spend more time doing so!
Issues
1. Multiple packages with single lockfile
Unfortunately, uv v0.4.0 seems to be a step back for this. It's no longer possible to uv sync for the whole workspace (related #6874), and the root project being "virtual" is not really supported. The docs make it clear that uv workspaces aren't (currently) meant for this, but I think that's a mistake. Having separate uv packages isn't a great solution, as you lose the global version locks (which makes housekeeping 10x easier), so you have multiple venvs, multiple pyright/pytest installs/configs etc.
For clarity, I'm talking about the structure below. I think adding a tool.uv.virtual: bool flag (like Rye has) would be a great step. In that case the root is not a package and can't be built.
2. Distributing in Dockerfiles etc
This is I think orthogonal to the issue above. (And much less important, as it's possible to work around it with plugins.) Currently, there's no good way to get an efficient (cacheable) Docker build in a uv workspace. You'd like to do something like the Dockerfile below, but you can't (related #6867).
If that gets resolved, there's another issue, but this is very likely to be outside the scope of uv. Just sharing it for context.
packages/ directory into every Dockerfile (regardless of what they actually need), forcing tons of unnecessary rebuilds.
My own solution has been to build wheels that include any dependencies so you can just do this:
Then in Dockerfile:
I've written a tiny Hatch plugin here that injects all the required workspace code into the wheel. This won't work for many use-cases (local dev hot reload) but is one way around the problem of COPYing the entire workspace into the Dockerfile. I don't think there's any solution that solves both together, and at least this way permits efficient Docker builds and simple Dockerfiles. (Note: since uv v0.4.0 the plugin seems to break uv's editable builds, haven't yet looked into why.)