osl-incubator / scicookie

Cookiecutter template for a Python package.
https://osl-incubator.github.io/scicookie

Feature request: generate SLSA provenance #275

Open agriyakhetarpal opened 6 months ago

agriyakhetarpal commented 6 months ago

Summary

Hi there! I wonder if scicookie as a cookiecutter template could generate SLSA3 provenance for Python-based build artifacts (the source distribution and wheels) in the template files by default (or allow opt-in for users if they want it at the time of the creation of the project using the template, while recommending users to do this by default).

This serves two purposes:

  1. Reproducibility of build artifacts with easily available metadata (provided with the release or on request by package authors, maintainers, or release managers)
  2. Enhanced security measures across multiple software distribution levels

More details are at the https://github.com/slsa-framework/slsa-github-generator/ repository and this blog: https://security.googleblog.com/2022/04/improving-software-supply-chain.html

https://github.com/sigstore/sigstore-python would also be a very viable method to sign artifacts.

Additional Information

Some instructions for creating a workflow that does this are available at: https://github.com/slsa-framework/slsa-github-generator/blob/main/internal/builders/generic/README.md#provenance-for-python
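As a rough illustration of what the linked instructions ask for: the generic generator takes the artifact digests as a base64-encoded, `sha256sum`-style listing. The sketch below builds such a listing in Python. The function names and the `dist` directory are assumptions made for illustration; this is not part of scicookie or of slsa-github-generator.

```python
import base64
import hashlib
from pathlib import Path


def sha256_subjects(dist_dir: str) -> str:
    """Return a sha256sum-style listing for every file in dist_dir.

    Each line has the form "<hex digest>  <file name>" (two spaces),
    mirroring the text output of the sha256sum command.
    """
    lines = []
    for path in sorted(Path(dist_dir).iterdir()):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            lines.append(f"{digest}  {path.name}\n")
    return "".join(lines)


def base64_subjects(dist_dir: str) -> str:
    """Base64-encode the listing so it can be passed around as one string."""
    return base64.b64encode(sha256_subjects(dist_dir).encode("utf-8")).decode("ascii")


if __name__ == "__main__":
    # Assumes the sdist and wheels were already built into ./dist
    if Path("dist").is_dir():
        print(base64_subjects("dist"))
```

In a workflow, the printed value would be written to a job output and handed to the provenance job; the exact wiring is described in the README linked above.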

scicookie can make use of the makim release.ci command to do this – not sure how @semantic-release will fit in with this, though.

For sigstore-python, there is a workflow in the GitHub Actions marketplace: https://github.com/sigstore/gh-action-sigstore-python

xmnlab commented 6 months ago

thanks @agriyakhetarpal for opening this issue!

TBH I am not familiar with SLSA3, so I will take some time to read all the references before I am able to answer it.

PS: btw, in addition to semantic-release we could have other release tools as well.

agriyakhetarpal commented 6 months ago

> PS: btw, in addition to semantic-release we could have other release tools as well.

Yes – a CalVer bot integration, or even something like https://github.com/callowayproject/bump-my-version would be quite helpful as well (which would allow running through pipx). Thanks for looking into this!

xmnlab commented 6 months ago

@agriyakhetarpal, I think that it looks really nice.

I don't think I can work on that right now .. or maybe it would take too much time until I would be able to work on that. All these things are new to me, although I had read (only) an article in the past about sigstore ... but I have no experience with that or anything about SLSA.

is that something you could work on? or maybe do you know someone that could work on that?

I will be more than happy to guide anyone in terms of the scicookie structure.

in the scicookie structure I think it should be disabled by default, so any maintainer could decide whether they want to opt in to it or not. I think that it would be very nice to add it by default to the osl profile, just not sure how it would work with semantic release (https://github.com/semantic-release/semantic-release/issues/2913)

and of course, if you are a member of a community, we will be more than happy to have a profile for your community as well :) that is basically a yaml file with the configuration for the profile of your community, for example: https://github.com/osl-incubator/scicookie/pull/273/files

I hope that soon we will also have a flag for custom profiles locally as well.

let me know your thoughts about that, and thanks for raising this topic here.

xmnlab commented 6 months ago

@agriyakhetarpal, about the support for SLSA level 3 in semantic release, it seems it has not been implemented yet.

agriyakhetarpal commented 6 months ago

> @agriyakhetarpal, I think that it looks really nice.

> I don't think I can work on that right now .. or maybe it would take too much time until I would be able to work on that. All these things are new to me, although I had read (only) an article in the past about sigstore ... but I have no experience with that or anything about SLSA.

> about the support for SLSA level 3 in semantic release, it seems it was not implemented yet.

Thank you so much for your research, @xmnlab!

> is that something you could work on? or maybe do you know someone that could work on that?

I am not sure if I will have enough bandwidth at this time to write an end-to-end solution that covers everything described here. SLSA3 provenance is quite tricky, and while it is possible to implement a custom builder, there is a lot of effort involved in doing so. Something like Sigstore should be doable, however: I can either write a PR myself, or step back and review a PR containing the implementation if someone else wishes to do it. As for whether this should be disabled by default with package authors opting in, I agree: not having it enabled by default will simplify things.

> and of course, if you are a member of a community, we will be more than happy to have a profile for your community as well :) that is basically a yaml file with the configuration for the profile of your community, for example: #273 (files)

Ah, I'm not sure I follow this fully – I have used SciCookie on a few personal/individual projects by now, but I am not using it as a part of a community so far. I am assuming this is meant for frequent users to have their configuration placed in-tree. However, this looks like a really handy feature!

> I hope that soon we will also have a flag for custom profiles locally as well.

If you could standardise the schema for the YAML file, code editors and IDEs should be able to use that for syntax highlighting and error checking. It would also be great if a custom URL could be used (i.e., allowing the use of a profile from a raw GitHub URL hosted elsewhere, with something like scicookie --profile https://www.path.to/hosted/url/profile.yaml/).
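To make the custom-URL idea concrete, here is a minimal sketch of how a profile source could be resolved, whether it is a local path or a hosted file. Everything in it is hypothetical: scicookie does not expose these functions, and the `--profile` flag above is only a suggestion at this point. Parsing and schema validation of the YAML are left out on purpose.

```python
from pathlib import Path
from urllib.parse import urlparse
from urllib.request import urlopen


def is_remote_profile(source: str) -> bool:
    """Treat only http(s) sources as remote; anything else is a local path."""
    return urlparse(source).scheme in ("http", "https")


def read_profile_text(source: str, timeout: float = 10.0) -> str:
    """Return the raw YAML text of a profile from a URL or a local file.

    Hypothetical helper: the caller would still need to parse the text
    and validate it against whatever schema the project settles on.
    """
    if is_remote_profile(source):
        with urlopen(source, timeout=timeout) as response:
            return response.read().decode("utf-8")
    return Path(source).read_text(encoding="utf-8")
```

Restricting remote sources to `http` and `https` keeps Windows drive letters such as `C:\profiles\osl.yaml` from being mistaken for a URL scheme.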

agriyakhetarpal commented 6 months ago

See also, this is a new feature from GitHub: https://github.com/actions/attest-build-provenance

Note that this provides Level 2 (SLSA2) provenance, not Level 3 (Level 3 is possible, but it requires all workflows and action-level dependencies to meet that level as well). Even the lower level would be a fair addition for adoption by users. I would be happy to contribute towards adding Sigstore plus GitHub Actions attestations if I can receive a go-ahead for this. I am not sure whether similar features exist for other CI providers; however, based on #264, #265, and #266, those providers are yet to be added to the template, so this can be explored at a later time.

xmnlab commented 6 months ago

@agriyakhetarpal that would be great! thanks!

I am just not sure yet how it would work ... so if you could share your plans here, maybe I could help you to define how it could be included into scicookie

agriyakhetarpal commented 1 month ago

Hi @xmnlab, my apologies for such a late response! Your comment slipped through my notifications and I never realised you had asked me for further plans. The GHA Attestations feature is now generally available at the time of writing, and its integration should be smooth. Still, it works best with an action that publishes wheels to PyPI through OIDC-based trusted publishing, e.g., https://github.com/pypa/gh-action-pypi-publish/. That way, attestations for the wheels are generated in a step before the upload (within the job that uploads the wheels) and stored at https://github.com/myproject/myrepo/attestations/ for anyone to view and verify via the gh attestation verify command. However, I don't see this action being used in release.yaml or elsewhere in the scicookie codebase right now, so we might have to either implement this for users who would like an alternative to semantic-release, or wait until semantic-release implements https://github.com/semantic-release/semantic-release/issues/2913, which you linked previously. See also https://learn.scientific-python.org/development/guides/gha-pure/, which has a concrete example of what such a workflow could look like (minus the Jinja, of course).
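For the verification side, a small sketch of how a consumer might check downloaded wheels with the gh attestation verify command mentioned above. It assumes the GitHub CLI is installed and authenticated; the helper names and the `myproject/myrepo` placeholder are illustrative only and are not part of scicookie.

```python
import subprocess
from pathlib import Path


def build_verify_command(artifact: str, repo: str) -> list[str]:
    """Build the gh CLI invocation that verifies one artifact against a repo."""
    return ["gh", "attestation", "verify", artifact, "--repo", repo]


def verify_wheels(dist_dir: str, repo: str) -> dict[str, bool]:
    """Run the verification for every wheel in dist_dir.

    Returns a mapping of wheel file name to True when gh exits with
    status 0 (attestation verified) and False otherwise.
    """
    results = {}
    for wheel in sorted(Path(dist_dir).glob("*.whl")):
        completed = subprocess.run(
            build_verify_command(str(wheel), repo),
            capture_output=True,
            text=True,
            check=False,  # a failed verification is reported, not raised
        )
        results[wheel.name] = completed.returncode == 0
    return results


if __name__ == "__main__":
    # Placeholder repository name, taken from the example URL above
    for name, ok in verify_wheels("dist", "myproject/myrepo").items():
        print(f"{name}: {'verified' if ok else 'NOT verified'}")
```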