actions / upload-artifact

MIT License
3.03k stars 686 forks source link

[feat req] Add sha256 of uploaded files as an output #454

Open pnacht opened 8 months ago

pnacht commented 8 months ago

What would you like to be added?

I noticed from the image in https://github.com/actions/upload-artifact/pull/448 that the Action in v4-beta logs the artifact hash.

It'd be great if the Action actually had that as an output that can be used in other steps.

Why is this needed?

This would be useful to confirm that the artifact hasn't been tampered with "in flight" between the moment it is uploaded and later downloaded with actions/download-artifact.

The risk scenario being:

jobs:
  upload:
    # ...
    steps:
      - # ...
      - run: ./create_artifact.sh
      - uses: actions/upload-artifact
      - uses: hacked-action/now-overwrites-the-artifact-with-something-malicious
  download:
    # ...
    steps:
      - # ...
      - uses: actions/download-artifact  # which is now malicious!
      - run: ./publish_package.sh

An example of this would be a release workflow that builds the release artifact and then runs tests on it before publishing. A malicious testing dependency could modify the artifact during testing.

The current mitigation to this risk is to manually calculate the hashes and then verify them later This is repetitive boilerplate that could be simplified if embedded into the Action itself:

If the action could output the sha256 of all files, this could be used to verify that the artifact hasn't been tampered with in the meantime:

jobs:
  upload:
    outputs:
      artifact-digest: steps.upload-artifact.artifact-digest
    # ...
    steps:
      - # ...
      - uses: actions/upload-artifact
        id: upload-artifact
      - # ...
  download:
    needs: [upload]
    # ...
    steps:
      - # ...
      - uses: actions/download-artifact  # fails because the malicious artifact's hash doesn't match!
        with:
          artifact-digest: ${{needs.upload.artifact-digest}}  # optional validation
      - # ...

In this example, I have actions/download-artifact as the entity responsible for validating the artifact hash. This would be the cleanest solution, in my opinion, since it means users don't need to understand what's going on under the hood. It also has the benefit of being guaranteed to have a single file (the artifact) to verify.

robherley commented 7 months ago

We're planning to expose the digest/hash on the public Artifact APIs as well, but having as an output is a great idea for the interim 👍

pnacht commented 7 months ago

Admittedly, my concern might have already been solved with the release of v4 and immutable artifacts!

If this means that GitHub can guarantee that the artifact I uploaded is the same artifact I'm now downloading, then I no longer need to compare the artifacts' hashes myself.

But having the artifact hash can be useful for other use-cases too, for sure.

robherley commented 7 months ago

@pnacht Keep in mind that artifacts can be deleted and recreated by name within a run (they will different IDs). Eventually having the digest exposed would be the best for integrity checks.

pnacht commented 7 months ago

Ah, gotcha. Might be a good idea to clarify that in the README... specifically "The contents of an Artifact are uploaded together into an immutable archive. They cannot be altered by subsequent jobs".

I took "immutable" and "cannot be altered" to include "cannot be deleted and replaced".