python-wheel-build / fromager

Build your own wheels
https://fromager.readthedocs.io/en/latest/
Apache License 2.0

Support external dependencies (experimental PEP 725 support?) #463

Open pnasrat opened 1 month ago

pnasrat commented 1 month ago

First off, thanks to everyone working on fromager. I'm happy to provide a PR but wanted to file an issue for discussion first.

It would be helpful for my use case, and I hope for others, if there were a standard mechanism in fromager for external dependencies. Currently fromager offers a hooks mechanism for customization.

Problem

As the pydantic-core example shows, many fromager failures result from missing external dependencies. For example, when bootstrapping fromager with fromager, the dependency tree needs Rust and libffi-dev in the build environment.

PEP 725 currently proposes a standard that uses [external] metadata to declare Python build and runtime dependencies on non-Python packages (e.g. libffi-dev), identified by PURLs (package URLs).

Examples of external metadata can be found in https://github.com/rgommers/external-deps-build, e.g. for cryptography:

[external]
build-requires = [
  "virtual:compiler/c",
  "virtual:compiler/rust",
  "pkg:generic/pkg-config",
]
host-requires = [
  "pkg:generic/openssl",
  "pkg:generic/libffi",
]

Proposal

dhellmann commented 1 month ago

Hi, @pnasrat, thanks for starting this discussion!

Proposal

  • Add hooks for get_external_build_dependencies and get_external_host_dependencies (and handle optional-dependencies) with no-op defaults (given PEP 725 is just proposed). This would allow fromager users and distros to add a hook to install dependencies using the system package manager when running fromager in a container.

When PEP 725 is approved, it definitely makes sense to support it in fromager.

Today we use pyproject_hooks for probing most of the dependencies, so I think we'd want to see support added there, too, rather than building our own version of that for this new type of dependency. Unless maybe there's no need because we could just read them from the pyproject.toml, like we do for the build backend?

And then, yes, overrides in fromager would make sense for projects where the standard hooks don't work and someone providing build overrides could add their own customizations.
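
To make the idea concrete, here is a rough sketch of a hook with a no-op default and a per-package override. The hook name comes from the proposal above; everything else (the signature taking a package name, the `register_override` helper, the registry dict) is an assumption for illustration and does not reflect fromager's actual override mechanism.

```python
from typing import Callable

# Assumed hook signature: package name in, external requirement strings out.
ExternalDepsHook = Callable[[str], list[str]]

# Hypothetical registry of per-package overrides.
_build_overrides: dict[str, ExternalDepsHook] = {}


def default_external_build_dependencies(package: str) -> list[str]:
    """No-op default: report no external build dependencies."""
    return []


def register_override(package: str, hook: ExternalDepsHook) -> None:
    """Register a package-specific hook that replaces the default."""
    _build_overrides[package] = hook


def get_external_build_dependencies(package: str) -> list[str]:
    """Use the package's override if one is registered, else the default."""
    hook = _build_overrides.get(package, default_external_build_dependencies)
    return hook(package)


def cryptography_build_deps(package: str) -> list[str]:
    # Values copied from the [external] example earlier in this thread.
    return ["virtual:compiler/c", "virtual:compiler/rust", "pkg:generic/pkg-config"]


register_override("cryptography", cryptography_build_deps)

print(get_external_build_dependencies("cryptography"))
print(get_external_build_dependencies("requests"))  # no override: empty list
```

With the no-op default in place, packages without an override behave exactly as they do today, which is why the default is safe while the PEP is still a proposal.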

  • Add support for writing the PEP 725 [external] table via the project_override in packagesettings.py, which already modifies the pyproject.toml. This could also be done via a new config hook slot that can customize the project override if you don't wish for fromager to be PEP 725 aware at this time.

This part is a little less clear. I can see the utility of having the external dependency list, but I would expect that to mostly be done in the original source package build instructions, even if aspects of the build change the values that need to appear there. Maybe I'm missing something, though. Why does fromager need to update that list (or provide a hook for a build plugin to)?

pnasrat commented 1 month ago

That makes sense. I'll keep an eye on the progress of the PEP. I initially suggested the hooks so that I could try to get builds working by pulling in external deps in the interim. I'll also take a closer look at pyproject_hooks (it's been a while since I've worked deeply in Python packaging).

The second part is more because I assume it will take some time for upstreams to add their external deps, whereas a package/distribution already has that info (or can take it from the external deps research metadata).

I also see it being useful for from-source builds where, e.g., cmake might be removed from the standard requires in favor of an external (distro/OS) cmake that could be installed in the build env.

For now I can handle the external dependencies outside of fromager when building from source, as they aren't declared in any upstream projects currently.

dhellmann commented 1 month ago

The second part is more because I assume it will take some time for upstreams to add their external deps, whereas a package/distribution already has that info (or can take it from the external deps research metadata).

That makes sense.

I also see it being useful for from-source builds where, e.g., cmake might be removed from the standard requires in favor of an external (distro/OS) cmake that could be installed in the build env.

For now I can handle the external dependencies outside of fromager when building from source, as they aren't declared in any upstream projects currently.

We're doing exactly that in our downstream builds. So far we've just been tracking dependencies like that using a Containerfile. Pre-installing the build dependencies in that image saves time in the build environment. I can see how it would be useful to detect the runtime system library dependencies, though, when building a deployment image. @tiran has done some thinking about that, too.

tiran commented 1 month ago

I have created elfdeps and integrated it into Fromager. The tool extracts shared library dependencies from ELF files and emits RPM-compatible requirements and provides. The output can be used on Fedora/CentOS/RHEL-based systems, because the RPM build system injects the same metadata into RPM packages. I don't know if Alpine, Debian, Gentoo, and other distros have the same metadata in their packages.
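
For readers unfamiliar with the RPM-compatible form, the sketch below formats a shared-library soname the way RPM's automatic dependency generator conventionally does, as far as I understand that convention: 64-bit libraries carry a `(64bit)` suffix, and an optional symbol version goes in the first pair of parentheses. The function name and signature are hypothetical; this is not elfdeps' API and it does no ELF parsing.

```python
def rpm_soname_requirement(soname: str, version: str = "", is_64bit: bool = True) -> str:
    """Hypothetical helper: format a soname as an RPM-style requirement string."""
    if is_64bit:
        # 64-bit entries always carry the version parentheses, even when empty.
        return f"{soname}({version})(64bit)"
    # 32-bit entries have no bitness marker; parentheses only with a version.
    return f"{soname}({version})" if version else soname


print(rpm_soname_requirement("libffi.so.8"))
print(rpm_soname_requirement("libc.so.6", version="GLIBC_2.34"))
print(rpm_soname_requirement("libc.so.6", is_64bit=False))
```

Strings in this form can be matched against the "provides" of installed RPM packages, which is what makes the output usable for resolving runtime system library dependencies on RPM-based distros.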

The same approach could be used for macOS Mach-O and Windows PE files. Contributions welcome! :)

I have a PoC container based on CentOS Stream 9 that contains Fromager, build tools, and dependencies for most Python packages. I'll push the container next week.