Open asraa opened 1 year ago
Also relevant here is the context.
We currently record these fields https://github.com/slsa-framework/slsa-github-generator/blob/76f03fa7e30209f32ac76ce417ddc43ff98af42a/.github/actions/verify-token/src/predicate.ts#L158-L184 in SLSA v1.0 provenance (which will be reused for BYOB as well generically).
Currently, the human-readable `ACTOR` and other fields that may contain PII are not recorded. If we put in the full context, we would have to ensure or allow an opt-out model for the included parameters.
My suggestion here is to add, from the github context and event context, all "allowlisted" safe and relevant fields (e.g. ensure that `base_ref` is added from `event_payload`). I would rather have this than an opt-out model. For example:
- `owner` subcontext (we do have `REPOSITORY_OWNER_ID` already)

The context also includes whether the ref was protected. IMO that is important, but I remember we have had discussions on whether this information is "public".
cc @laurentsimon @kommendorkapten
Good point. There's a lot of value in recording those fields (actor_id, owner_id) as we explained in https://slsa.dev/blog/2022/06/slsa-github-workflows: it allows monitoring for changes for account / repo re-creation.
What if we could record `opii = H(Nonce, _pii_)`, where Nonce is a 128-bit nonce / secret that only the builder knows? It would provide privacy but allow for linkability between two attestations, which would allow monitoring for changes.
We could further scope the obfuscated version by repo, with `opii = H(Nonce, repo_name, _pii_)`.
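To make the keyed-hash idea concrete, here is a minimal sketch. The comment above does not say which `H` to use, so HMAC-SHA256 is an assumption on my part, as are the function name `obfuscate_pii` and the length-prefixed encoding of the inputs.

```python
import hashlib
import hmac
import secrets
from typing import Optional


def obfuscate_pii(nonce: bytes, pii: str, repo_name: Optional[str] = None) -> str:
    """Hypothetical helper: hex HMAC-SHA256 of the PII value, optionally scoped by repo.

    HMAC-SHA256 stands in for the unspecified H; the nonce is the HMAC key.
    """
    parts = [pii] if repo_name is None else [repo_name, pii]
    # Length-prefix each part so ("ab", "c") and ("a", "bc") encode differently.
    msg = b"".join(
        len(p.encode("utf-8")).to_bytes(4, "big") + p.encode("utf-8") for p in parts
    )
    return hmac.new(nonce, msg, hashlib.sha256).hexdigest()


# 128-bit secret known only to the builder (generated here for illustration).
nonce = secrets.token_bytes(16)

unscoped_1 = obfuscate_pii(nonce, "octocat")
unscoped_2 = obfuscate_pii(nonce, "octocat")
scoped = obfuscate_pii(nonce, "octocat", repo_name="org/repo")
```

With the same nonce, `unscoped_1` and `unscoped_2` are equal, which is what gives linkability across attestations. The repo-scoped value differs, so values cannot be correlated across repositories.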
Sync'ed with @asraa today. The team agreed that recording a non-human-readable version, `field_id`, is acceptable, so let's ignore my proposal!
I've added the GH event payload here: https://github.com/slsa-framework/slsa-github-generator/pull/1611
I took a look at the reasoning in slsa-verifier for the necessary fields here.
- `inputs` (workflow dispatch)
- `base_ref` on push events
- `target_committish` for release events

We will never need the `event.repository` context (the necessary content is elsewhere, and this leaks owner info), so I think we can scrub that.
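The allowlist approach could look roughly like the sketch below. It is illustrative only: `ALLOWED_PATHS` and `scrub_event` are hypothetical names, and the release field is assumed to live at `release.target_commitish`, which is how GitHub's release payload spells it as far as I know. Anything not listed, including `repository`, is dropped.

```python
from typing import Any, Dict, List

# Hypothetical allowlist: event name -> dotted paths to keep from the payload.
ALLOWED_PATHS: Dict[str, List[str]] = {
    "workflow_dispatch": ["inputs"],
    "push": ["base_ref"],
    "release": ["release.target_commitish"],  # assumed location in the payload
}


def scrub_event(event_name: str, payload: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the payload containing only allowlisted paths."""
    kept: Dict[str, Any] = {}
    for path in ALLOWED_PATHS.get(event_name, []):
        keys = path.split(".")
        src: Any = payload
        for key in keys:
            if not isinstance(src, dict) or key not in src:
                break  # path absent in this payload; skip it
            src = src[key]
        else:
            # Path fully resolved: rebuild the same nesting in the output.
            dst = kept
            for key in keys[:-1]:
                dst = dst.setdefault(key, {})
            dst[keys[-1]] = src
    return kept


release_payload = {
    "action": "published",
    "release": {"target_commitish": "main", "author": {"login": "octocat"}},
    "repository": {"owner": {"login": "octocat"}},
}
scrubbed_release = scrub_event("release", release_payload)
```

An allowlist fails closed: a new field that GitHub adds to the payload later stays out of the provenance until someone opts it in.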
https://github.com/slsa-framework/slsa-github-generator/issues/1575#issuecomment-1409547881
For the rest of the work of masking sensitive info, see here
Describe the bug
Currently the SLSA provenance only allows for string-string ParameterValues, while the GitHub event payload is a JSON object. We may be able to flatten it, but that's inconvenient.
Add the payload if SLSA v1.0 is updated to allow objects or more complex JSON types.
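For reference, flattening a payload into string-string pairs might look like the sketch below. The key scheme (dots for object keys, `[i]` for list indices) and the `flatten` helper are assumptions for illustration, not something the generator does today.

```python
import json
from typing import Any, Dict


def flatten(value: Any, prefix: str = "") -> Dict[str, str]:
    """Flatten nested JSON into a string-to-string map (hypothetical key scheme)."""
    out: Dict[str, str] = {}
    if isinstance(value, dict):
        for key, child in value.items():
            out.update(flatten(child, f"{prefix}.{key}" if prefix else str(key)))
    elif isinstance(value, list):
        for index, child in enumerate(value):
            out.update(flatten(child, f"{prefix}[{index}]"))
    else:
        # Strings pass through; other scalars are JSON-encoded ("true", "null", "3").
        out[prefix] = value if isinstance(value, str) else json.dumps(value)
    return out


event = {
    "inputs": {"release_tag": "v1.0.0", "dry_run": True},
    "commits": [{"id": "abc"}],
    "base_ref": None,
}
flat = flatten(event)
```

This also shows why flattening is inconvenient. Type information is lost (`True` becomes the string `"true"`), empty objects and lists disappear, and a key that itself contains a dot becomes ambiguous on the way back.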