python-wheel-build / fromager

Build your own wheels
https://fromager.readthedocs.io/en/latest/
Apache License 2.0
7 stars 11 forks source link

per-package constraints for build requirements #340

Closed dhellmann closed 1 month ago

dhellmann commented 2 months ago

A stable release branch may want to minimize the changes that go into the environment. We can already constrain globally, but some packages need different versions of the same build dependency. We could address that with per-package constraints files only for build requirements.

shubhbapna commented 2 months ago

Summarizing my discussion with @dhellmann and the next steps:

Next steps:

  1. Take advantage of the dependency graph and produce a global build requirement constraint file for whatever packages we can. if there are packages that require 2 different versions then produce a per package build requirement constraint file
  2. the graph produced in a previous bootstrap is essentially a lock file and already contains the data we need to constraint everything. maybe we just need to figure out a nice way for users to edit that file when they need to bump a version

Will attempt point 2 first. It is looking more appealing

dhellmann commented 2 months ago

Some other requirements to consider:

  1. We want the user to be able to change the constraints.
  2. We want to minimize the burden of managing those constraints. It may be simple to think about a per-package constraint file, but for a large set of packages that would mean managing a lot of files.
shubhbapna commented 2 months ago

To address making edits to the graph, we can introduce a new fromager command bump. It will follow these rules while updating the graph:

Examples: fromager bump foo -> bumps foo to the latest version for all install and build dependencies fromager bump foo <version> -> bumps foo to the for all install and build dependencies fromager bump foo --no-installation -> only bumps the build dependencies fromager bump foo --no-build -> only bumps the installation dependencies

dhellmann commented 2 months ago

To address making edits to the graph, we can introduce a new fromager command bump. It will follow these rules while updating the graph:

  • the new version is only valid is it satisfies the req specifier
  • for all req_type = install the new version must satisfy the req specifier for all the nodes of the package we are updating

The rule is a little more complex than that. If the package appears in the "installation dependency tree" of the top level dependencies (nodes connected to the top-level dependency "", with every edge in the path having type "install"), we have to keep it co-installable so all of those requirements specifiers would have to support the new version. But there are some "install" edges in the graph that are only for build-time dependencies, and those do not need to be co-installable.

The existing code that generates the constraints.txt shows how to identify the installation dependency tree.

  • for all req_type != install it is fine if there are nodes with req specifier that don't satisfy this new version. it won't update those nodes and only update the ones that satisfy it.

Examples: fromager bump foo -> bumps foo to the latest version for all install and build dependencies fromager bump foo <version> -> bumps foo to the for all install and build dependencies fromager bump foo --no-installation -> only bumps the build dependencies fromager bump foo --no-build -> only bumps the installation dependencies

The automatic bump is going to require resolving the dependency against the upstream package index, but there may be multiple requirements rules and we may not want all of them updated. Expressing that is going to be complex. Let's focus on the variations of the command where we explicitly give the new version, regardless of the edge type, for now. We can come back and add the automatic bump and edge type filtering later, if we need them.

dhellmann commented 2 months ago

Thinking more about the UX for this feature, I'm not sure what we need is a command to edit the build order from a previous run.

What if we said that during bootstrap, the build-order file would help with resolution except for the requirements specifications in the input from the CLI? The top-level requirements.

That would avoid a new step for the user, and keep the workflow for expressing versions consistent. The build-order file would be "hints" rather than constraints.

shubhbapna commented 2 months ago

So an older build order file will be passed to bootstrap right? Yes that works as well

dhellmann commented 2 months ago

So an older build order file will be passed to bootstrap right? Yes that works as well

I said build-order, but meant the graph.

The main thing I was trying to convey was instead of editing the graph before passing it, we let the top level requirements for the bootstrap override the graph.

shubhbapna commented 2 months ago

The main thing I was trying to convey was instead of editing the graph before passing it, we let the top level requirements for the bootstrap override the graph.

We will need a way to specify whether a particular top level requirement is only for build requirements right then?

shubhbapna commented 2 months ago

Also how about we actually do use an older build order file instead of the graph and apply the resolved versions from that directly? We can use the type and dist field for this. If the package is specified in the top level requirements then we override the old build order file and use are regular resolution method?

dhellmann commented 2 months ago

Also how about we actually do use an older build order file instead of the graph and apply the resolved versions from that directly? We can use the type and dist field for this. If the package is specified in the top level requirements then we override the old build order file and use are regular resolution method?

We just want the graph file not the build-order file. That was a typo.

The main thing I was trying to convey was instead of editing the graph before passing it, we let the top level requirements for the bootstrap override the graph.

We will need a way to specify whether a particular top level requirement is only for build requirements right then?

I'm thinking we'd do something like this:

  1. Add support for a flag like --previous-build graph.json.
  2. When given that graph file, we resolve each dependency the same way it was previously for that part of the build (regardless of the dependency type). To do that, as we resolve dependencies in bootstrap we have to look at the parent, find it in the graph, and short-cut the resolution process to use the same results instead of using the resolver.
  3. UNLESS the package was also specified as a top-level item, in which case we resolve that dependency normally and ignore the previous build. So, we will have to track top-level items in some special way so we can look them up again later. Maybe that just means adding a way to look them up in the graph we're building during bootstrap.
  4. And then, any time we go to resolve that same package again, regardless of the reason, if the new version built in step 3 meets the requirements and constraints for the requirement at that point in the build, then we use that version instead of the old version. Step 3 is already requiring us to short-circuit the resolution process. This step is a further short-circuit of that to not re-resolve the dependency.

That lets us start with a requirements list

main-program==4.5.6

and pick up some-dependency version 1.1.1 as a requirement, and record that in the graph.json.

Then later we can update that requirements.txt file like

some-dependency==2.2.2
main-program==4.5.6

to cause some-dependency to be updated to 2.2.2 even though we previously built 1.1.1 when we found it as a dependency of main-program.

It's not important what sort of dependency it is, as long as we can accommodate any type of dependency. That's why I keep bringing up the build dependencies as a special case.

shubhbapna commented 2 months ago

@dhellmann regarding using the top level requirements file as a way to override things:

we assume that all toplevel requirements are of type install when creating the graph: https://github.com/python-wheel-build/fromager/blob/main/src/fromager/context.py#L171

this will cause issues when we are bumping up a build requirement only but it shows up as an install requirement in the new graph.json and will also suddenly show up in the install constraint file

dhellmann commented 2 months ago

@dhellmann regarding using the top level requirements file as a way to override things:

we assume that all toplevel requirements are of type install when creating the graph: https://github.com/python-wheel-build/fromager/blob/main/src/fromager/context.py#L171

this will cause issues when we are bumping up a build requirement only but it shows up as an install requirement in the new graph.json and will also suddenly show up in the install constraint file

That's a good point. I think for this case, it'll be OK. I expect for the vast majority of cases we're going to be updating the list for something we want to install anyway. If we end up needing to update a package that is only used for builds, we can exclude it when we do the installation step.

shubhbapna commented 2 months ago

tracking the breakdown of this feature here: #365

shubhbapna commented 1 month ago

365 is done, we can close this