pypa / build

A simple, correct Python build frontend
https://build.pypa.io
MIT License

Print artifact digest after build process is completed #789

Open elad-pticha opened 2 weeks ago

elad-pticha commented 2 weeks ago

I think it would be beneficial to add artifact digests to the Successfully built message. For instance:

Successfully built test-0.0.0.tar.gz@sha256:960b9adda66023aed657c0da9626a6b8de71e433843181a5397431465adb57a7 and test-0.0.0-py3-none-any.whl@sha256:c856716babc6d603769f6b4f1a7122a61b870e7f627b0e2b4aa8f48b712c7770


Including the digest information strengthens the link between the build system and the package repository.

Let's take, for example, a GitHub Actions pipeline that builds and pushes packages to PyPI. Users can check the package's hash in the package repository but have no idea if that artifact is really built from the source repository it claims to be related to. If a digest is printed at the end of the build process, it could indicate that the package in the package registry has been created in a specific build pipeline.

This also relates to the world of SLSA and artifact integrity, allowing users to link an artifact to its source.
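To make the proposed message concrete, here is a minimal sketch of how the digests could be computed and formatted. It is not how build works today: `sha256_digest` and `format_built_message` are hypothetical helper names, and the only assumption is that the built artifacts are ordinary files on disk.

```python
import hashlib
from pathlib import Path


def sha256_digest(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    hasher = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def format_built_message(artifacts: list[Path]) -> str:
    """Format a 'Successfully built' line with name@sha256:digest entries.

    Hypothetical helper mirroring the message format proposed in this issue.
    """
    parts = [f"{p.name}@sha256:{sha256_digest(p)}" for p in artifacts]
    return "Successfully built " + " and ".join(parts)
```

For example, `print(format_built_message(sorted(Path("dist").iterdir())))` would print one line covering every file in a `dist` directory.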

elad-pticha commented 2 weeks ago

If you think this is a good idea, I can submit a PR to implement it.

henryiii commented 2 weeks ago

I’d like to have machine-readable output (probably JSON), but it can’t (only) be printed, since picking it out of stdout is problematic. Probably an argument that allows writing the built package info to a stream or file. Having the name(s) and maybe other info is useful too when building pipelines.

henryiii commented 2 weeks ago

If you had such a machine readable output, then you could process it and produce the digest in your preferred format. No need to make a list of formats.
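As a rough illustration of that idea, the sketch below writes a JSON report for a set of built files to any text stream. Nothing here is an existing build option: `describe_artifact`, `write_build_report`, and the JSON field names are all assumptions made for the example.

```python
import hashlib
import json
from pathlib import Path
from typing import IO, Iterable


def describe_artifact(path: Path) -> dict:
    """Collect basic facts about one built file (hypothetical schema)."""
    data = path.read_bytes()
    return {
        "filename": path.name,
        "size": len(data),
        "sha256": hashlib.sha256(data).hexdigest(),
    }


def write_build_report(artifacts: Iterable[Path], stream: IO[str]) -> None:
    """Write a JSON report describing the artifacts to a text stream."""
    report = {"artifacts": [describe_artifact(p) for p in artifacts]}
    json.dump(report, stream, indent=2)
    stream.write("\n")
```

A later pipeline step could load this report with `json.load` and render each entry in whatever digest format it prefers, for instance `f"{a['filename']}@sha256:{a['sha256']}"`.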

elad-pticha commented 2 weeks ago

I agree, but this still does not address my main concern.

When downloading packages from PyPI or other package managers, I can see the digests of the artifacts, but there is no way to verify that they are trustworthy and haven't been tampered with.

I believe the community would greatly benefit from being able to inspect the build workflow, specifically at the python -m build command, to ensure that the package in the registry was created by the claimed repository. Putting this behind a separate flag would depend on project owners incorporating it into their pipelines.

Printing the digest by default would enable users to establish a connection from the source to the build.

henryiii commented 2 weeks ago

This is being worked on in https://peps.python.org/pep-0740/. You can already sign artifacts and have GitHub host the attestations, see https://github.com/pypa/build/pull/782 and the examples linked from there (we have not made a release yet with our own attestations). Build logs are not a good place to put security information. They aren’t always visible and are deleted over time.

elad-pticha commented 1 week ago

I agree that build logs are not the place to put security information and that this does not come anywhere close to creating a proper SLSA attestation, but it may still be better than having no information at all for projects that don't produce attestations.

Another option is to move this issue to the twine repository and print the digest after the package has been pushed.

WDYT?