python-wheel-build / fromager

Build your own wheels
https://fromager.readthedocs.io/en/latest/
Apache License 2.0

Add per package digest for sdist, plugin and env files #303

Closed: shubhbapna closed this issue 2 months ago

shubhbapna commented 2 months ago

We need a way to detect whether the sdist, plugin, or env files for a package have changed. That way, when the --skip-existing option is enabled, we take into account not only the version of the package but also whether the sdist, plugin, or env files for that package have changed.
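The skip decision described above could look roughly like the following. This is only a sketch: `BuildRecord` and `can_skip` are hypothetical names, and where fromager would store the digest of a previous build is not settled in this thread.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class BuildRecord:
    """Hypothetical record of a previously built wheel."""

    version: str
    digest: str  # digest of the sdist, plugin, and env inputs used for that build


def can_skip(existing: Optional[BuildRecord], version: str, digest: str) -> bool:
    """Skip only if the same version was built from the same inputs."""
    if existing is None:
        # Nothing was built before, so there is nothing to reuse.
        return False
    return existing.version == version and existing.digest == digest
```

With this shape, a version match alone is no longer enough: a changed input digest forces a rebuild even when the wheel version already exists.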

dhellmann commented 2 months ago

Who is responsible for tracking the digests?

Are the digests meant to be used to test for the need to rebuild during bootstrapping, too?

shubhbapna commented 2 months ago

> Who is responsible for tracking the digests?

In terms of user experience, it would be smoothest if fromager handled it. Getting the digest for the sdist tarball shouldn't be difficult, and the same goes for the env files. For plugins, since we use a library to handle loading, we don't explicitly read the file, so that's something we will need to think about.
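For the sdist tarball and the env files, a digest can be computed directly from the file contents. A minimal sketch, assuming SHA-256 as the hash (the thread does not name an algorithm) and a hypothetical helper name `file_digest`:

```python
import hashlib
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    hasher = hashlib.sha256()
    with path.open("rb") as f:
        # Read in chunks so large sdist tarballs are not loaded into memory at once.
        while chunk := f.read(chunk_size):
            hasher.update(chunk)
    return hasher.hexdigest()
```

The same function works for an sdist tarball or an env file, since it only looks at raw bytes.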

> Are the digests meant to be used to test for the need to rebuild during bootstrapping, too?

I think for bootstrapping we build everything anyway. The skip option is currently only for build-sequence.

dhellmann commented 2 months ago

> > Who is responsible for tracking the digests?

> In terms of user experience, it would be smoothest if fromager handled it. Getting the digest for the sdist tarball shouldn't be difficult, and the same goes for the env files. For plugins, since we use a library to handle loading, we don't explicitly read the file, so that's something we will need to think about.

What other inputs does the digest calculation need? Maybe the override plugin code?

> > Are the digests meant to be used to test for the need to rebuild during bootstrapping, too?

> I think for bootstrapping we build everything anyway. The skip option is currently only for build-sequence.

Our bootstrapping test job downstream pre-populates the environment with wheel files for things we have on the wheel server to avoid rebuilding some of the more expensive packages. So this new behavior could override that and force a rebuild if the digest doesn't match.

shubhbapna commented 2 months ago

> What other inputs does the digest calculation need? Maybe the override plugin code?

Yep, the override plugin code will be needed.

shubhbapna commented 2 months ago

I think we can get the plugin source code by using inspect.getsource.
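A rough sketch of that idea, assuming the plugin is available as an imported module or function object. `source_digest` is a hypothetical helper, not an existing fromager function. Note that `inspect.getsource` raises `OSError` when the source cannot be found and `TypeError` for built-in objects, so the sketch returns `None` in those cases.

```python
import hashlib
import inspect
from typing import Any, Optional


def source_digest(obj: Any) -> Optional[str]:
    """Return a SHA-256 hex digest of an object's source, or None if unavailable."""
    try:
        # Works for modules, classes, functions, and methods with source on disk.
        source = inspect.getsource(obj)
    except (OSError, TypeError):
        return None
    return hashlib.sha256(source.encode("utf-8")).hexdigest()
```

One limitation: passing a plugin module covers only that module's own source. Helper modules it imports would not be part of the digest unless they are hashed separately.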

dhellmann commented 2 months ago

Are there any settings that come from the containerfiles we use to build downstream? I'm thinking things like the CUDA version are probably relevant, and those are set there.

Maybe this is another thing that needs an override plugin. If we had a "collect_data_for_signature" hook (global, not per-package), it could take as input the requirement and version and stuff like build_wheel does and it could return data to go into a signature calculation (letting the caller do the calculation).

shubhbapna commented 2 months ago

> Are there any settings that come from the containerfiles we use to build downstream? I'm thinking things like the CUDA version are probably relevant, and those are set there.

Tracking containerfile changes is the biggest problem, because that is essentially asking fromager to know which environment it is running in.

> Maybe this is another thing that needs an override plugin. If we had a "collect_data_for_signature" hook (global, not per-package), it could take as input the requirement and version and stuff like build_wheel does and it could return data to go into a signature calculation (letting the caller do the calculation).

I think fromager can reliably calculate the digest for sdists, overrides, and env files, so we wouldn't need a plugin from the user for those. Regarding the containerfiles, we could define a hook, or even do something simpler like using an env variable defined in the containerfile, such as "FROMAGER_BUILD_ENV_SIGN"?
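The env variable option could be sketched as below. The variable name `FROMAGER_BUILD_ENV_SIGN` comes from the comment above; everything else (the `combined_digest` helper, the way the parts are joined, SHA-256) is an assumption for illustration only.

```python
import hashlib
import os
from typing import Dict


def combined_digest(
    parts: Dict[str, str], env_var: str = "FROMAGER_BUILD_ENV_SIGN"
) -> str:
    """Combine per-input digests with an opaque value from the build environment."""
    hasher = hashlib.sha256()
    # Sort by name so the result does not depend on dict insertion order.
    for name in sorted(parts):
        hasher.update(f"{name}={parts[name]}\n".encode("utf-8"))
    # An unset variable is treated as an empty string.
    build_env = os.environ.get(env_var, "")
    hasher.update(f"build-env={build_env}\n".encode("utf-8"))
    return hasher.hexdigest()
```

Here `parts` would hold the digests fromager computes itself (for example keys like "sdist", "overrides", "env"), and whoever writes the containerfile decides what the variable contains.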

dhellmann commented 2 months ago

> I think fromager can reliably calculate the digest for sdists, overrides, and env files, so we wouldn't need a plugin from the user for those. Regarding the containerfiles, we could define a hook, or even do something simpler like using an env variable defined in the containerfile, such as "FROMAGER_BUILD_ENV_SIGN"?

What would go into the variable?

I was thinking the hook could define what is relevant to the builder environment, and just return bytes to be signed. That might be the image SHA, the containerfile contents, a bunch of environment variables, whatever. Fromager would feed it all into the signature calculation, without having to understand what it is.
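A sketch of how such a hook might plug in, under several assumptions: the hook signature, the `compute_signature` helper, and the placeholder values in `example_hook` are all hypothetical, and the real hook would likely receive more context than a name and version.

```python
import hashlib
from typing import Callable, Iterable, Optional

# Hypothetical hook type: given a requirement name and version,
# return opaque bytes describing the builder environment.
SignatureHook = Callable[[str, str], bytes]


def compute_signature(
    req_name: str,
    version: str,
    builtin_inputs: Iterable[bytes],
    hook: Optional[SignatureHook] = None,
) -> str:
    """Hash fromager's own inputs plus whatever bytes the hook returns."""
    chunks = list(builtin_inputs)
    if hook is not None:
        # The caller does not interpret the hook's data, it only hashes it.
        chunks.append(hook(req_name, version))
    hasher = hashlib.sha256()
    for chunk in chunks:
        # Length-prefix each chunk so [b"ab", b"c"] and [b"a", b"bc"] differ.
        hasher.update(len(chunk).to_bytes(8, "big"))
        hasher.update(chunk)
    return hasher.hexdigest()


def example_hook(req_name: str, version: str) -> bytes:
    """Placeholder hook; a real one might return an image SHA or containerfile contents."""
    return b"image=placeholder;cuda=placeholder"
```

Returning raw bytes keeps the hook's contents opaque to the caller, which matches the idea that fromager should not need to understand what the builder environment data means.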

shubhbapna commented 2 months ago

Closing in favor of #316.