python-wheel-build / fromager

Build your own wheels
https://pypi.org/project/fromager/
Apache License 2.0

How to use Fromager to solve the bootstrapping problem? #199

Open jaraco opened 1 month ago

jaraco commented 1 month ago

In https://github.com/pypa/packaging-problems/issues/342#issuecomment-2220790312, it's been suggested that Fromager can solve the bootstrapping problem because it allows for prepare_build to pull built wheel artifacts. It's unclear to me, however, how Fromager breaks the cyclic dependency problem. Stated simply, the bootstrapping problem is that if a build tool depends on itself or any package that requires that build tool to build, there's no way in general to build entirely from source. There's no order in which all packages can be built purely from source.
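
As a minimal sketch of why no build order exists, the build requirements can be modeled as a directed graph and handed to Python's standard `graphlib.TopologicalSorter`. The package names and edges below are hypothetical, chosen only to illustrate a backend that requires itself; they do not describe any real project's metadata.

```python
from graphlib import CycleError, TopologicalSorter


def build_order(build_requires):
    """Return an order in which every package can be built from source.

    `build_requires` maps each package to the set of packages that must
    already be installed before it can be built.
    """
    # static_order() yields dependencies before their dependents and raises
    # CycleError when the graph contains a cycle.
    return list(TopologicalSorter(build_requires).static_order())


# Hypothetical acyclic graph: the backend needs only a dependency-free helper.
acyclic = {"app": {"backend"}, "backend": {"helper"}, "helper": set()}

# Hypothetical cyclic graph: the backend lists itself as a build requirement.
cyclic = {"app": {"backend"}, "backend": {"backend"}}

print(build_order(acyclic))

try:
    build_order(cyclic)
except CycleError as exc:
    # exc.args[1] holds the nodes that form the detected cycle.
    print("no source-only build order; cycle:", exc.args[1])
```

In the acyclic case the helper is built first and everything else follows; in the cyclic case the sorter has nothing it can emit first, which is the bootstrapping problem in miniature.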

A few examples:

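Currently, none of these scenarios break the build-from-source world because:

  • Setuptools doesn't declare its dependencies and is forced to vendor them.
  • coherent.build isn't popular enough that any system integrators yet care.
  • hatchling hasn't adopted any dependencies that have adopted hatchling.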

These assumptions are untenable because they impose undue constraints on build tools and their dependencies.

In https://hackmd.io/@jaraco/SJSQ40tv0, I'm proposing a methodology that system integrators would need to adopt to break the cycle and allow build tools to declare dependencies. The tl;dr is that the integrator needs to provide pre-built artifacts for all supported backends (including their dependencies) and it can't expect to build those from source.

Reading the bootstrapping mode, it seems that fromager only has support for building when there are no cyclic dependencies.

> all dependencies are being built in the correct order.

What does Fromager do when a package depends on itself, directly or recursively, at build time? Does Fromager allow the private wheel index to be pre-seeded with "trusted" pre-built artifacts that can break the cycle?

I'm going to try some experiments to verify, but I'm interested in this project's maintainers' insights.

dhellmann commented 1 month ago

> Currently, none of these scenarios break the build-from-source world because:
>
>   • Setuptools doesn't declare its dependencies and is forced to vendor them.
>   • coherent.build isn't popular enough that any system integrators yet care.
>   • hatchling hasn't adopted any dependencies that have adopted hatchling.
>
> These assumptions are untenable because they impose undue constraints on build tools and their dependencies.
>
> In https://hackmd.io/@jaraco/SJSQ40tv0, I'm proposing a methodology that system integrators would need to adopt to break the cycle and allow build tools to declare dependencies. The tl;dr is that the integrator needs to provide pre-built artifacts for all supported backends (including their dependencies) and it can't expect to build those from source.

I find it hard to imagine any system integrator accepting that constraint as part of building a modern secure software delivery pipeline.

> Reading the bootstrapping mode, it seems that fromager only has support for building when there are no cyclic dependencies.

Yes, that's correct. PEP 517 (I think? maybe another standard) implied that build backends were expected to avoid introducing cyclic dependencies. If that's not actually the case, I'd have to see the actual dependency cycle to know how I would address it.

> > all dependencies are being built in the correct order.
>
> What does Fromager do when a package depends on itself, directly or recursively, at build time? Does Fromager allow the private wheel index to be pre-seeded with "trusted" pre-built artifacts that can break the cycle?

Pre-seeding the wheel server is one option we've considered, but we haven't needed to do that, so far.

> I'm going to try some experiments to verify, but I'm interested in this project's maintainers' insights.

This isn't really a problem we've been trying to solve directly. That said, we've hit on a few techniques.

The flit_core test (https://github.com/python-wheel-build/fromager/blob/main/e2e/flit_core_override/build/lib/package_plugins/flit_core.py) is one example of an approach to solving that problem. There we invoke the build using flit_core's manual instructions, even though I think flit_core does work with PEP 517, so the override wouldn't strictly be needed.

It's an example, though, of how the person using fromager can provide a plugin to do whatever is needed for most points in the process.

In a couple of cases we add pyproject.toml files to source directories to make the PEP 517 approach work because the project's setup.py imports something that is expected to be manually installed before installing the package in question. https://github.com/Dao-AILab/flash-attention/pull/958 is one example of that.
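
The pyproject.toml approach described above might look like the following sketch, written as plain Python rather than as a Fromager plugin. The helper name `ensure_pyproject` and the requirement list are illustrative assumptions; the real file contents depend on what the project's setup.py imports.

```python
import tempfile
from pathlib import Path

# Minimal PEP 518 build-system table; setuptools.build_meta is the standard
# setuptools backend for projects driven by setup.py.
PYPROJECT_TEMPLATE = """\
[build-system]
requires = [{requires}]
build-backend = "setuptools.build_meta"
"""


def ensure_pyproject(source_dir, build_requires):
    """Write a minimal pyproject.toml unless the project already ships one.

    Returns True when a file was created, False when one already existed.
    """
    target = Path(source_dir) / "pyproject.toml"
    if target.exists():
        return False
    quoted = ", ".join(f'"{req}"' for req in build_requires)
    target.write_text(PYPROJECT_TEMPLATE.format(requires=quoted), encoding="utf-8")
    return True


with tempfile.TemporaryDirectory() as tmp:
    # "some-build-time-import" is a placeholder for whatever setup.py imports.
    created = ensure_pyproject(tmp, ["setuptools", "some-build-time-import"])
    print(created)
    print((Path(tmp) / "pyproject.toml").read_text(encoding="utf-8"))
```

Declaring the import as a build requirement lets a PEP 517 frontend install it into the build environment before setup.py runs, instead of relying on a manual install step.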

We're also building several things that rely on cmake, and for those we remove the dependency and rely on the version of cmake provided with the OS.

In each case it requires some work to set up the plugin or patch or whatever. But we're currently building several hundred packages, and only on the order of 10 have these customizations. Most of those are things for which there are no sdists on pypi.org at all.

jaraco commented 1 month ago

I tried using fromager to build coherent.build, since it provides the most straightforward cyclic dependency (on itself and its dependencies). The build failed thus:

```
(.venv) draft 🐚 py -m fromager bootstrap coherent.build
bootstrapping 'cpu' variant of [('toplevel', 'coherent.build')]
coherent.build: * handling toplevel requirement coherent.build []
saved /Users/jaraco/draft/sdists-repo/downloads/coherent_build-0.19.1.tar.gz
coherent.build: new toplevel dependency coherent.build resolves to 0.19.1
coherent.build: preparing source for coherent.build from /Users/jaraco/draft/sdists-repo/downloads/coherent_build-0.19.1.tar.gz
coherent.build: prepared source for coherent.build at /Users/jaraco/draft/work-dir/coherent_build-0.19.1/coherent_build-0.19.1
coherent.build: getting build system dependencies for coherent.build in /Users/jaraco/draft/work-dir/coherent_build-0.19.1/coherent_build-0.19.1
coherent.build: ** handling build-system requirement coherent.build [('toplevel', , )]
['/Users/jaraco/draft/.venv/bin/python', '-m', 'pip', '-vvv', 'install', '--disable-pip-version-check', '--upgrade', '--only-binary', ':all:', '--index-url', 'http://localhost:58474/simple/', '--trusted-host', 'localhost', 'coherent.build'] failed with Using pip 24.1.2 from /Users/jaraco/.local/pip-run/pip (python 3.12)
Non-user install because user site-packages disabled
Created temporary directory: /private/var/folders/f2/2plv6q2n7l932m2x004jlw340000gn/T/pip-build-tracker-ea8_g6bl
Initialized build tracking at /private/var/folders/f2/2plv6q2n7l932m2x004jlw340000gn/T/pip-build-tracker-ea8_g6bl
Created build tracker: /private/var/folders/f2/2plv6q2n7l932m2x004jlw340000gn/T/pip-build-tracker-ea8_g6bl
Entered build tracker: /private/var/folders/f2/2plv6q2n7l932m2x004jlw340000gn/T/pip-build-tracker-ea8_g6bl
Created temporary directory: /private/var/folders/f2/2plv6q2n7l932m2x004jlw340000gn/T/pip-install-d7pihs1c
Created temporary directory: /private/var/folders/f2/2plv6q2n7l932m2x004jlw340000gn/T/pip-ephem-wheel-cache-gonzxfer
Looking in indexes: http://localhost:58474/simple/
1 location(s) to search for versions of coherent-build:
* http://localhost:58474/simple/coherent-build/
Fetching project page and analyzing links: http://localhost:58474/simple/coherent-build/
Getting page http://localhost:58474/simple/coherent-build/
Found index url http://localhost:58474/simple/
Looking up "http://localhost:58474/simple/coherent-build/" in the cache
Request header has "max_age" as 0, cache bypassed
No cache entry available
Starting new HTTP connection (1): localhost:58474
http://localhost:58474 "GET /simple/coherent-build/ HTTP/1.1" 404 335
Status code 404 not in (200, 203, 300, 301, 308)
Could not fetch URL http://localhost:58474/simple/coherent-build/: 404 Client Error: File not found for url: http://localhost:58474/simple/coherent-build/ - skipping
Skipping link: not a file: http://localhost:58474/simple/coherent-build/
Given no hashes to check 0 links for project 'coherent-build': discarding no candidates
ERROR: Could not find a version that satisfies the requirement coherent.build (from versions: none)
ERROR: No matching distribution found for coherent.build
Exception information:
Traceback (most recent call last):
  File "/Users/jaraco/.local/pip-run/pip/_vendor/resolvelib/resolvers.py", line 397, in resolve
    self._add_to_criteria(self.state.criteria, r, parent=None)
  File "/Users/jaraco/.local/pip-run/pip/_vendor/resolvelib/resolvers.py", line 174, in _add_to_criteria
    raise RequirementsConflicted(criterion)
pip._vendor.resolvelib.resolvers.RequirementsConflicted: Requirements conflict: SpecifierRequirement('coherent.build')

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/jaraco/.local/pip-run/pip/_internal/resolution/resolvelib/resolver.py", line 95, in resolve
    result = self._result = resolver.resolve(
                            ^^^^^^^^^^^^^^^^^
  File "/Users/jaraco/.local/pip-run/pip/_vendor/resolvelib/resolvers.py", line 546, in resolve
    state = resolution.resolve(requirements, max_rounds=max_rounds)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/jaraco/.local/pip-run/pip/_vendor/resolvelib/resolvers.py", line 399, in resolve
    raise ResolutionImpossible(e.criterion.information)
pip._vendor.resolvelib.resolvers.ResolutionImpossible: [RequirementInformation(requirement=SpecifierRequirement('coherent.build'), parent=None)]

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/jaraco/.local/pip-run/pip/_internal/cli/base_command.py", line 179, in exc_logging_wrapper
    status = run_func(*args)
             ^^^^^^^^^^^^^^^
  File "/Users/jaraco/.local/pip-run/pip/_internal/cli/req_command.py", line 67, in wrapper
    return func(self, options, args)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/jaraco/.local/pip-run/pip/_internal/commands/install.py", line 377, in run
    requirement_set = resolver.resolve(
                      ^^^^^^^^^^^^^^^^^
  File "/Users/jaraco/.local/pip-run/pip/_internal/resolution/resolvelib/resolver.py", line 104, in resolve
    raise error from e
pip._internal.exceptions.DistributionNotFound: No matching distribution found for coherent.build
Removed build tracker: '/private/var/folders/f2/2plv6q2n7l932m2x004jlw340000gn/T/pip-build-tracker-ea8_g6bl'
Command '['/Users/jaraco/draft/.venv/bin/python', '-m', 'pip', '-vvv', 'install', '--disable-pip-version-check', '--upgrade', '--only-binary', ':all:', '--index-url', 'http://localhost:58474/simple/', '--trusted-host', 'localhost', 'coherent.build']' returned non-zero exit status 1.
Traceback (most recent call last):
  File "/Users/jaraco/draft/.venv/lib/python3.12/site-packages/fromager/__main__.py", line 156, in invoke_main
    main(auto_envvar_prefix="FROMAGER")
  File "/Users/jaraco/draft/.venv/lib/python3.12/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/jaraco/draft/.venv/lib/python3.12/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
         ^^^^^^^^^^^^^^^^
  File "/Users/jaraco/draft/.venv/lib/python3.12/site-packages/click/core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/jaraco/draft/.venv/lib/python3.12/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/jaraco/draft/.venv/lib/python3.12/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/jaraco/draft/.venv/lib/python3.12/site-packages/click/decorators.py", line 45, in new_func
    return f(get_current_context().obj, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/jaraco/draft/.venv/lib/python3.12/site-packages/fromager/commands/bootstrap.py", line 63, in bootstrap
    sdist.handle_requirement(wkctx, Requirement(toplevel), req_type=origin)
  File "/Users/jaraco/draft/.venv/lib/python3.12/site-packages/fromager/sdist.py", line 157, in handle_requirement
    build_system_dependencies = _handle_build_system_requirements(
                                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/jaraco/draft/.venv/lib/python3.12/site-packages/fromager/sdist.py", line 340, in _handle_build_system_requirements
    _maybe_install(ctx, dep, "build-system", resolved)
  File "/Users/jaraco/draft/.venv/lib/python3.12/site-packages/fromager/sdist.py", line 506, in _maybe_install
    safe_install(ctx, req, req_type)
  File "/Users/jaraco/draft/.venv/lib/python3.12/site-packages/fromager/sdist.py", line 515, in safe_install
    external_commands.run(
  File "/Users/jaraco/draft/.venv/lib/python3.12/site-packages/fromager/external_commands.py", line 50, in run
    raise subprocess.CalledProcessError(completed.returncode, cmd, output)
subprocess.CalledProcessError: Command '['/Users/jaraco/draft/.venv/bin/python', '-m', 'pip', '-vvv', 'install', '--disable-pip-version-check', '--upgrade', '--only-binary', ':all:', '--index-url', 'http://localhost:58474/simple/', '--trusted-host', 'localhost', 'coherent.build']' returned non-zero exit status 1.
Traceback (most recent call last):
  File "", line 198, in _run_module_as_main
  File "", line 88, in _run_code
  File "/Users/jaraco/draft/.venv/lib/python3.12/site-packages/fromager/__main__.py", line 163, in
    invoke_main()
  File "/Users/jaraco/draft/.venv/lib/python3.12/site-packages/fromager/__main__.py", line 156, in invoke_main
    main(auto_envvar_prefix="FROMAGER")
  File "/Users/jaraco/draft/.venv/lib/python3.12/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/jaraco/draft/.venv/lib/python3.12/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
         ^^^^^^^^^^^^^^^^
  File "/Users/jaraco/draft/.venv/lib/python3.12/site-packages/click/core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/jaraco/draft/.venv/lib/python3.12/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/jaraco/draft/.venv/lib/python3.12/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/jaraco/draft/.venv/lib/python3.12/site-packages/click/decorators.py", line 45, in new_func
    return f(get_current_context().obj, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/jaraco/draft/.venv/lib/python3.12/site-packages/fromager/commands/bootstrap.py", line 63, in bootstrap
    sdist.handle_requirement(wkctx, Requirement(toplevel), req_type=origin)
  File "/Users/jaraco/draft/.venv/lib/python3.12/site-packages/fromager/sdist.py", line 157, in handle_requirement
    build_system_dependencies = _handle_build_system_requirements(
                                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/jaraco/draft/.venv/lib/python3.12/site-packages/fromager/sdist.py", line 340, in _handle_build_system_requirements
    _maybe_install(ctx, dep, "build-system", resolved)
  File "/Users/jaraco/draft/.venv/lib/python3.12/site-packages/fromager/sdist.py", line 506, in _maybe_install
    safe_install(ctx, req, req_type)
  File "/Users/jaraco/draft/.venv/lib/python3.12/site-packages/fromager/sdist.py", line 515, in safe_install
    external_commands.run(
  File "/Users/jaraco/draft/.venv/lib/python3.12/site-packages/fromager/external_commands.py", line 50, in run
    raise subprocess.CalledProcessError(completed.returncode, cmd, output)
subprocess.CalledProcessError: Command '['/Users/jaraco/draft/.venv/bin/python', '-m', 'pip', '-vvv', 'install', '--disable-pip-version-check', '--upgrade', '--only-binary', ':all:', '--index-url', 'http://localhost:58474/simple/', '--trusted-host', 'localhost', 'coherent.build']' returned non-zero exit status 1.
```

I'm pretty sure the crux of the problem is illustrated in:

Could not fetch URL http://localhost:58474/simple/coherent-build/: 404 Client Error: File not found for url: http://localhost:58474/simple/coherent-build/ - skipping
Skipping link: not a file: http://localhost:58474/simple/coherent-build/
Given no hashes to check 0 links for project 'coherent-build': discarding no candidates
ERROR: Could not find a version that satisfies the requirement coherent.build (from versions: none)
ERROR: No matching distribution found for coherent.build

As no coherent.build exists to build coherent.build.

> If that's not actually the case, I'd have to see the actual dependency cycle to know how I would address it.

I believe coherent.build is a pretty basic example. Other projects currently avoid the issue.

> PEP 517 (I think? maybe another standard) implied that build backends were expected to avoid introducing cyclic dependencies.

If you can find this reference, it would be crucial to the discussion. Regardless, it's this assumption that we need to break. It essentially leads to "build backends can't have dependencies."

> The flit_core test is one example of an approach to solving that problem

The only reason flit_core can do that is because it has no dependencies (build time or run time).

> I find it hard to imagine any system integrator accepting that constraint as part of building a modern secure software delivery pipeline.

I really appreciate the engagement here. I haven't gotten much from other system integrators. Would you be interested in having a high bandwidth conversation about the issues involved?

jaraco commented 1 month ago

> > PEP 517 (I think? maybe another standard) implied that build backends were expected to avoid introducing cyclic dependencies.
>
> If you can find this reference, it would be crucial to the discussion. Regardless, it's this assumption that we need to break. It essentially leads to "build backends can't have dependencies."

You're right. It's in PEP 517 here.

> Project build requirements will define a directed graph of requirements (project A needs B to build, B needs C and D, etc.) This graph MUST NOT contain cycles. If (due to lack of co-ordination between projects, for example) a cycle is present, front ends MAY refuse to build the project.

jaraco commented 1 month ago

> I find it hard to imagine any system integrator accepting that constraint as part of building a modern secure software delivery pipeline.

This is an important point, but let me dig a little deeper. Is it more secure for the system integrator to bundle the artifacts for bootstrapping or for each build backend to bundle their own artifacts for bootstrapping (as "sources")?

dhellmann commented 1 month ago

> I tried using fromager to build coherent.build, since it provides the most straightforward cyclic dependency (on itself and its dependencies). The build failed thus:
>
> I'm pretty sure the crux of the problem is illustrated in:
>
> Could not fetch URL http://localhost:58474/simple/coherent-build/: 404 Client Error: File not found for url: http://localhost:58474/simple/coherent-build/ - skipping
> Skipping link: not a file: http://localhost:58474/simple/coherent-build/
> Given no hashes to check 0 links for project 'coherent-build': discarding no candidates
> ERROR: Could not find a version that satisfies the requirement coherent.build (from versions: none)
> ERROR: No matching distribution found for coherent.build
>
> As no coherent.build exists to build coherent.build.

I think here you want to set the backend-path instead of requires to use an in-tree backend, right?
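
To make the difference concrete, here is a small sketch that compares the two `[build-system]` shapes, represented as the dictionaries a TOML parser would produce. The `requires = ["coherent.build"]` entry follows from the log above; the `build-backend` values and the module name `local_backend` are assumptions for illustration and are not taken from the project's actual pyproject.toml. `backend-path` is the key PEP 517 defines for in-tree backends.

```python
import re


def canonical(name):
    """Normalize a project name the way package indexes do (PEP 503)."""
    return re.sub(r"[-_.]+", "-", name).lower()


def requirement_name(requirement):
    """Extract the normalized project name from a simple requirement string."""
    match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", requirement.strip())
    if match is None:
        raise ValueError(f"cannot parse requirement: {requirement!r}")
    return canonical(match.group(0))


def requires_itself(project, build_system):
    """True when [build-system].requires names the project being built."""
    names = {requirement_name(req) for req in build_system.get("requires", [])}
    return canonical(project) in names


# Self-referential form: the frontend must install the project from an index
# before it can build the project, which is the lookup that returned 404.
self_referential = {
    "requires": ["coherent.build"],
    "build-backend": "coherent.build",  # assumed value
}

# In-tree form: the backend is imported from the source tree via backend-path,
# so the project no longer needs to appear in requires.
in_tree = {
    "requires": [],
    "build-backend": "local_backend",  # hypothetical module name
    "backend-path": ["."],
}

print(requires_itself("coherent.build", self_referential))
print(requires_itself("coherent.build", in_tree))
```

Note that `backend-path` removes only the self-reference; any other packages the in-tree backend imports would still have to be listed in `requires` and be buildable without the cycle.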

> > PEP 517 (I think? maybe another standard) implied that build backends were expected to avoid introducing cyclic dependencies.
>
> If you can find this reference, it would be crucial to the discussion. Regardless, it's this assumption that we need to break. It essentially leads to "build backends can't have dependencies."

From PEP 517:

  • Project build requirements will define a directed graph of requirements (project A needs B to build, B needs C and D, etc.) This graph MUST NOT contain cycles. If (due to lack of co-ordination between projects, for example) a cycle is present, front ends MAY refuse to build the project.

> > The flit_core test is one example of an approach to solving that problem
>
> The only reason flit_core can do that is because it has no dependencies (build time or run time).

Based on the 200-300 packages I've been working to build, plenty of other build-time requirements end up depending on flit-core, though. They just have to be careful to adopt things that don't introduce a cycle.

> > I find it hard to imagine any system integrator accepting that constraint as part of building a modern secure software delivery pipeline.
>
> I really appreciate the engagement here. I haven't gotten much from other system integrators. Would you be interested in having a high bandwidth conversation about the issues involved?

Sure. The timing isn't great this week, because of work project deadlines, but maybe in a few weeks?

dhellmann commented 1 month ago

> > I find it hard to imagine any system integrator accepting that constraint as part of building a modern secure software delivery pipeline.
>
> This is an important point, but let me dig a little deeper. Is it more secure for the system integrator to bundle the artifacts for bootstrapping or for each build backend to bundle their own artifacts for bootstrapping (as "sources")?

I can read source code to tell if it has a backdoor. I can't read pre-compiled binaries.

jaraco commented 1 month ago

> Sure. The timing isn't great this week, because of work project deadlines, but maybe in a few weeks?

Sounds great. I sent an invite. We can coordinate over email. Looking forward to it.

jaraco commented 1 month ago

> I can read source code to tell if it has a backdoor. I can't read pre-compiled binaries.

I get that, but what if the pre-compiled binaries are just pure-Python wheels? Ultimately, I'd like for build backends to be able to depend on any Python library, even one that might have compiled extensions, but in the current configuration, that's not needed and I don't want that to be a blocker preventing backends from having any dependencies.

The pre-built artifacts I'm speaking of are essentially the wheels that have been unpacked into the source code. If instead the system integrator were to keep a (small) set of trusted artifacts available for bootstrapping, it could circumvent the cycles. Those trusted artifacts themselves would have been built from source on a previous iteration of the system, so everything ultimately is derived from source; it's just not all built from source from scratch. These artifacts could potentially include compiled binaries as well, though that complicates matters (requires environment-specific variants, reduces inspectability).
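
The seeding idea can be sketched as a graph operation: build requirements that a trusted artifact already satisfies stop constraining the build order, so the cycle disappears. This is only an illustration under assumed names (`seeded_build_order`, the `graph` contents); it is not how Fromager or any other tool implements bootstrapping.

```python
from graphlib import CycleError, TopologicalSorter


def seeded_build_order(build_requires, seeds):
    """Order source builds, treating `seeds` as wheels that already exist.

    Build requirements satisfied by a seed impose no ordering constraint,
    so edges pointing at seeded packages are dropped before sorting.
    """
    pruned = {
        package: {dep for dep in deps if dep not in seeds}
        for package, deps in build_requires.items()
    }
    return list(TopologicalSorter(pruned).static_order())


# Hypothetical graph with a cycle: the backend needs itself and a helper,
# and the helper is in turn built with the backend.
graph = {
    "backend": {"backend", "helper"},
    "helper": {"backend"},
    "app": {"backend"},
}

try:
    seeded_build_order(graph, seeds=set())
except CycleError:
    print("without seeds: no build order")

print(seeded_build_order(graph, seeds={"backend"}))
```

In this sketch the seeded backend is still rebuilt from source; the seed only satisfies build-time requirements. Because every edge to a seed is dropped, other packages may end up being built with the seed rather than with the freshly rebuilt backend, which a real integrator would likely handle with a second pass.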

I see this process as akin to how GCC is bootstrapped. Since GCC requires a C compiler, it can't be built purely from source, and requires some binary artifact to break the cycle. I'm guessing most other ecosystems don't run into the challenges that the Python ecosystem does because those ecosystems have a standard packaging tooling that's distributed with the language. Are there examples of other packages that suffer the bootstrapping problem (due to interdependencies of build tools) that may have solved it somehow?

dhellmann commented 1 month ago

> > I can read source code to tell if it has a backdoor. I can't read pre-compiled binaries.
>
> I get that, but what if the pre-compiled binaries are just pure-Python wheels? Ultimately, I'd like for build backends to be able to depend on any Python library, even one that might have compiled extensions, but in the current configuration, that's not needed and I don't want that to be a blocker preventing backends from having any dependencies.

If they're pure python, why not install them from source?

> The pre-built artifacts I'm speaking of are essentially the wheels that have been unpacked into the source code. If instead the system integrator were to keep a (small) set of trusted artifacts available for bootstrapping, it could circumvent the cycles. Those trusted artifacts themselves would have been built from source on a previous iteration of the system, so everything ultimately is derived from source; it's just not all built from source from scratch. These artifacts could potentially include compiled binaries as well, though that complicates matters (requires environment-specific variants, reduces inspectability).

The problem is coming up with a way to trust those artifacts in the first place. I can't really read bytecode any better than compiled C. If I don't start with something I can read, then I can't be sure the packaging tool isn't injecting bad things into the packages it's building.

> I see this process as akin to how GCC is bootstrapped. Since GCC requires a C compiler, it can't be built purely from source, and requires some binary artifact to break the cycle. I'm guessing most other ecosystems don't run into the challenges that the Python ecosystem does because those ecosystems have a standard packaging tooling that's distributed with the language. Are there examples of other packages that suffer the bootstrapping problem (due to interdependencies of build tools) that may have solved it somehow?

I would expect all packaging tools to have some aspect of this bootstrapping case, with the difference being that their dependencies might come in forms other than packages built by the tool itself. I don't know the history of rpm/yum/dnf or apt, but they're robust systems that must have dealt with this.

I see 2 ways to let build tools have dependencies:

  1. Do not allow cycles in the build tool dependency chain. The PEP accounts for this.
  2. Ensure at least 1 tool at the end of the build tool chain can be built with itself, or with some other non-PEP-517 approach, so that you minimize the special cases. flit-core appears to be the de facto answer for the self-build case. We're using pip from RPM for the other case.