slsa-framework / slsa-github-generator

Language-agnostic SLSA provenance generation for Github Actions
Apache License 2.0
429 stars 129 forks source link

[bug] Standard discrepancy: `buildInvocationId` versus `buildInvocationID` #3876

Open woodruffw opened 1 month ago

woodruffw commented 1 month ago

My colleague @facutuesca observed this bug with the generator_generic_slsa3.yml action.

Describe the bug

In SLSA 0.1 and 0.2, buildInvocationId is spelled with a lowercase "d":

Screenshot 2024-09-10 at 4 55 22 PM

Similarly, it's spelled with a lowercase "d" in 1.0, where it's renamed to invocationId:

Screenshot 2024-09-10 at 4 56 24 PM

However, generator_generic_slsa3.yml@2.0.0 appears to generate 0.2 provenance objects with buildInvocationID (capital 'D') instead.

An example of this can be seen in sigstore-python's release artifacts, e.g. our intoto provenance for v3.2.0:

https://github.com/sigstore/sigstore-python/releases/download/v3.2.0/provenance-sigstore-v3.2.0.intoto.jsonl

when the payload is decoded, we can see that it's a v0.2 Provenance with the mis-spelled metadata.buildInvocationID. Excerpted below:

"metadata": {
    "buildInvocationID": "10457864437-1",
    "completeness": {
        "parameters": true,
        "environment": false,
        "materials": false
    },
    "reproducible": false
}

I've also attached the full SLSA provenance as a file to this report: slsa.json

To Reproduce

To reproduce, use the latest version of generator_generic_slsa3.yml (2.0.0) in a workflow, like so:

  generate-provenance:
    needs: [build]
    name: Generate build provenance
    permissions:
      actions: read # To read the workflow path.
      id-token: write # To sign the provenance.
      contents: write # To add assets to a release.
    # Currently this action needs to be referred by tag. More details at:
    # https://github.com/slsa-framework/slsa-github-generator#verification-of-provenance
    uses: slsa-framework/slsa-github-generator/.github/workflows/generator_generic_slsa3.yml@v2.0.0
    with:
      provenance-name: provenance-sigstore-${{ github.event.release.tag_name }}.intoto.jsonl
      base64-subjects: "${{ needs.build.outputs.hashes }}"
      upload-assets: true

(Not all of these options may be necessary; that's exactly how they appear in sigstore-python's CI, which observed this behavior.)

Expected behavior

I expected buildInvocationID to be spelled as buildInvocationId, for consistency with the SLSA provenance spec.

Additional context

None!

ramonpetgrave64 commented 1 month ago

Thanks for reporting this. It seems like a go-style initialism that we've mistakenly enforced. I'm a bit surprised we're not seeing compatibility errors between the generators and slsa-verifier.

I'd like to address this soon, but is it currently causing difficulties for you?

woodruffw commented 1 month ago

I'd like to address this soon, but is it currently causing difficulties for you?

Not difficulties per se, but we noticed it while adding SLSA predicate handling to sigstore-python: https://github.com/sigstore/sigstore-python/pull/1115 🙂

In particular, we currently have a manual alias for the non-compliant field name here:

https://github.com/sigstore/sigstore-python/pull/1115/files#diff-f1ebf27728ce90626c6d7d75f863a09ca8095663033d3146912b814f0716c593R128

woodruffw commented 1 month ago

(We'd ideally ship SLSA predicate support in sigstore-python without that hack, but I'm not sure if that's a good idea or not -- I suppose it would depend on how widely used this action is, and whether others are similarly assuming that this field is spelled as ID instead of Id).

ramonpetgrave64 commented 1 month ago

Lots of folks are using older versions of the generic generator, so it could be wiser for you to maintain backwards compatibility for preexisting provenances.

But first, is the new functionality you're adding to sigstore-python to be able to both generate and verify signed Sigstore bundles that have SLSA provenance statements (what I think you call DSSE statements)?

Currently the generic generator does not produce Sigstore bundles, only signed DSSE envelopes that wrap the provenances statements, which still use fulcio certificates for signing. slsa-verifier would do online lookups for more verification procedures.

We have a pending change to make the generic generator produce sigstore bundles, so I perhaps I should also fix the this initialism issue before we release that change.

If sigstore-python's verification is only meant to work with sigstore-bundles, and the generic generator does not yet produce sigstore bundles, do you still need the alias?

woodruffw commented 1 month ago

We have a pending change to make the generic generator produce sigstore bundles, so I perhaps I should also fix the this initialism issue before we release that change.

That would be great!

If sigstore-python's verification is only meant to work with sigstore-bundles, and the generic generator does not yet produce sigstore bundles, do you still need the alias?

Ah, true -- I think we observed this while looking for samples to check our the inner SLSA payload validation against, but we indeed won't ever directly consume the outputs of this action. So yeah, unless I'm missing something, we don't actually need the alias.

(CCing @facutuesca to confirm)

facutuesca commented 1 month ago

So yeah, unless I'm missing something, we don't actually need the alias.

Is there any workflow where a user would want to generate a sigstore bundle using the predicate inside the DSSE envelope generated by this action?

The verification where we found this issue is not the usual bundle verification, but rather a validation we run when generating sigstore bundles with DSSE envelopes. For that, the user needs to specify an artifact and a predicate:

sigstore attest --predicate-type https://slsa.dev/provenance/v0.2 --predicate ./path/to/predicate.json  FILE

Then sigstore validates that predicate.json has the expected structure of a v0.2 provenance, before actually creating the bundle.

So if a user trying to generate a bundle with this action's output's predicate is something we don't expect to happen, then yeah, we don't need the alias in sigstore-python

woodruffw commented 1 month ago

Is there any workflow where a user would want to generate a sigstore bundle using the predicate inside the DSSE envelope generated by this action?

I suspect not -- people using this action are probably passing the output around as-is, not reconstituting it into another bundle format 🙂

So yeah, it sounds like we should drop our alias.