Closed stmlange closed 1 month ago
@stmlange Can you kindly explain your scenario and why the verification is needed?
@qweeah Thanks for reaching out. The main reason I created this ticket is that there doesn't seem to be an (easy) option to verify whether the locally pulled artifacts are "equal" to the remote artifact.
In other words: is what I have locally really what was published to the remote?
Consider for example Maven/Gradle, which publish dedicated sha1 and md5 checksums so one can download the checksum and verify that the published thing is "correct"/"equal" to the local variant.
With oras one can have multiple digest variants encoded (https://github.com/opencontainers/image-spec/blob/main/descriptor.md#digests).
A digest can be sha256:6c3c624b58dbbcd3c0dd82b4c53f04194d1247c6eebdaab7c610cf7d66709b3b
or sha512:401b09eab3c013d4ca54922bb802bec8fd5318192b0a75f201d8b372742
...or whatever ORAS supports. Because multiple hash algorithms are supported, it is not trivial to manually check whether the downloaded artifact is actually what was published.
Running oras pull multiple times actually seems to re-download the artifact. So it may be that oras pull silently checks those digests (and potentially fails if the download was not successful), but re-downloading is a waste of network resources.
Thanks @stmlange for the detailed explanation.
You can utilize oras manifest fetch to generate a checksum file and then use shasum -c $FILE to check it.
/cc @FeynmanZhou To validate if it's something we should add to ORAS CLI.
@stmlange Also worth mentioning: if you want to copy an artifact in a trusted way, why not use an OCI image layout?
# 1. copy an artifact to a local folder mcr.microsoft.com/oss/kubernetes/kubectl
> oras cp mcr.microsoft.com/oss/kubernetes/kubectl:v1.28.1 -r --to-oci-layout mcr.microsoft.com/oss/kubernetes/kubectl
✓ Copied application/vnd.docker.container.image.v1+json 1.93/1.93 kB 100.00% 571µs
└─ sha256:919d96c9446db8f5c6cf76d98abd4c79ccfe9af241f977d87188ef3e9f6f09de
...
Copied [registry] mcr.microsoft.com/oss/kubernetes/kubectl:v1.28.1 => [oci-layout] mcr.microsoft.com/oss/kubernetes/kubectl
Digest: sha256:a01b2873f41c65aa9157baf5ec0e0beaf80e9e84bb7dfa94b081cd230b534418
# 2. cp the OCI image layout folder to some air-gap environment
# 3. pull provenance file from the copied OCI image layout folder, checksum will be verified during the pull
> oras pull --oci-layout mcr.microsoft.com/oss/kubernetes/kubectl@sha256:30019e253ab74eb3e38abae7b8997e8e60c420169044ca9bfaf9665f54ad18bc -o in-toto
✓ Pulled provenance.json 14.9/14.9 kB 100.00% 717µs
└─ sha256:f4740e5a3adde42224679263c7b4e76985411cb7a9504615cf1421d8afb078b5
✓ Pulled application/vnd.oci.image.manifest.v1+json 682/682 B 100.00% 608µs
└─ sha256:30019e253ab74eb3e38abae7b8997e8e60c420169044ca9bfaf9665f54ad18bc
Pulled [oci-layout] mcr.microsoft.com/oss/kubernetes/kubectl@sha256:30019e253ab74eb3e38abae7b8997e8e60c420169044ca9bfaf9665f54ad18bc
Digest: sha256:30019e253ab74eb3e38abae7b8997e8e60c420169044ca9bfaf9665f54ad18bc
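As a side note on why an OCI image layout lends itself to offline checks: per the OCI image layout specification, each blob is stored as blobs/<algorithm>/<hex digest>, so a blob's file name is also its expected checksum. The following is a minimal sketch under the assumption that sha256sum (GNU coreutils) is installed and that only sha256 blobs are present; verify_layout_blobs is a hypothetical helper name, not an ORAS command.

```shell
#!/bin/sh
# verify_layout_blobs LAYOUT_DIR
# Hypothetical helper: re-hash every blob under LAYOUT_DIR/blobs/sha256 and
# compare the result with the blob's file name.
# Assumption: sha256sum is available and the layout uses sha256 only.
verify_layout_blobs() {
    layout_dir=$1
    (
        cd "$layout_dir/blobs/sha256" || exit 1
        # Emit "HEX  FILE" lines where both fields are the blob's file name,
        # then let sha256sum -c compare each name with the file's real hash.
        for blob in *; do
            [ -f "$blob" ] && printf '%s  %s\n' "$blob" "$blob"
        done | sha256sum -c -
    )
}

# Example (layout folder taken from the transcript above):
# verify_layout_blobs mcr.microsoft.com/oss/kubernetes/kubectl
```

The function returns a non-zero status if any blob fails to match its name, or if the blobs/sha256 folder is missing or empty.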
Indeed, with ORAS the manual way would be to download the manifest (e.g. oras manifest fetch).
The reason I filed this issue is that I believe you cannot assume that you can check via shasum -a 256 -c $FILE or sha256sum, as this would assume a digest algorithm of sha256.
I believe that, as per https://github.com/opencontainers/image-spec/blob/main/descriptor.md#digests, oras could also have a sha512:401b09eab3c013d4ca54922bb802bec8fd5318192b0a75f201d8b372742, i.e. a digest algorithm of sha512.
I am not an expert on checksum files, but shouldn't the length of the checksum string already imply the algorithm?
Yes, I tested on my Linux VM, and different checksums can coexist in the same checksum file:
> cat a
123
> shasum -a 512 a >> sum
> shasum -a 256 a >> sum
> cat sum
ea2fe56bb8c1fb5ada84963b42ed71b764a74b092d75755173ade06f2f4aada9c00d6c302e185035cbe85fdff31698bca93e8661f0cbcef52cf2ff65864fd742 a
181210f8f9c779c26da1d9b2075bde0127302ee0e3fca38c9a83f5b1dd8e5d3b a
> shasum -c sum
a: OK
a: OK
What I mean is that, although ORAS doesn't support sha512, you can still split the digest at : and keep only the latter part as the checksum; the shasum utility can auto-detect the algorithm based on the length of the checksum string.
Yes, in general the length of the hash string could be used to determine the algorithm.
As per https://github.com/opencontainers/image-spec/blob/main/descriptor.md#digests, oras makes it even a bit easier, as it encodes the algorithm used as a prefix (sha256:..., sha512:...).
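To make the prefix idea concrete, here is a minimal sketch of checking one local file against one digest string by splitting at the first colon and choosing the tool from the prefix. It assumes sha256sum and sha512sum (GNU coreutils) are installed; verify_digest is a hypothetical helper name and not part of the ORAS CLI.

```shell
#!/bin/sh
# verify_digest FILE DIGEST
# Hypothetical helper: DIGEST has the form "<algorithm>:<hex>".
# Returns 0 on match, 1 on mismatch, 2 for an algorithm it cannot handle.
verify_digest() {
    file=$1
    digest=$2
    algo=${digest%%:*}      # text before the first ':'
    expected=${digest#*:}   # text after the first ':'
    case $algo in
        sha256) actual=$(sha256sum "$file" | cut -d ' ' -f 1) ;;
        sha512) actual=$(sha512sum "$file" | cut -d ' ' -f 1) ;;
        *) echo "unsupported digest algorithm: $algo" >&2; return 2 ;;
    esac
    [ "$actual" = "$expected" ]
}

# Example:
# verify_digest provenance.json sha256:f4740e5a3adde42224679263c7b4e76985411cb7a9504615cf1421d8afb078b5
```

Choosing the tool from the prefix rather than from the checksum length keeps the check explicit about which algorithm the manifest claims.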
The ORAS digest (https://github.com/opencontainers/image-spec/blob/main/descriptor.md#digests) can be more than just sha256:... or sha512:.... How should I tell checksum to verify a multihash+base58:... or sha256+b64u:..., or whatever other algos are supported by ORAS?
The demo I gave writes both a sha256 and a sha512 checksum into one sum file, and shasum is able to detect the algorithm automatically.
@stmlange You don't need to use shasum -a 256; just shasum -c is enough, so the checking script won't involve the algorithm. (I have amended my earlier post and removed -a 256 from it.)
The problem remains that ORAS can encode the hash as sha256+b64u: in the manifest. There is no guarantee that everything encoded in the manifest is supported as a hash by shasum.
Consider multihash+base58:... or sha256+b64u:..., which can't be verified easily with shasum.
Hence, if we really need to go down the manual validation route, it would be very tedious, as one needs to do different things based on the digest used in the manifest.
Consider:
$ echo "123" > a
$ shasum -a 256 a >> sum
$ sha256sum a | cut -d ' ' -f 1 | xxd -r -p | base64 >> sum
$ sha256sum -c sum
a: OK
sha256sum: WARNING: 1 line is improperly formatted
multihash+base58:... and sha256+b64u:... are not registered in the OCI spec and are not supported (see a related test case of the OCI digest library).
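For illustration, the registered forms can be screened with a single pattern before any checksum tool is invoked. This sketch assumes the two registered algorithms in the OCI descriptor specification, sha256 with 64 lowercase hex characters and sha512 with 128; is_registered_digest is a hypothetical helper name.

```shell
#!/bin/sh
# is_registered_digest DIGEST
# Hypothetical helper: succeeds only for "sha256:<64 hex>" or "sha512:<128 hex>".
# Assumption: DIGEST is a single line of text.
is_registered_digest() {
    printf '%s\n' "$1" |
        LC_ALL=C grep -Eq '^(sha256:[a-f0-9]{64}|sha512:[a-f0-9]{128})$'
}

# Example:
# is_registered_digest "sha256+b64u:..." || echo "cannot be checked with shasum"
```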
OK, I see that only sha256:... and sha512:... are actually registered and supported algorithms: https://github.com/opencontainers/image-spec/blob/main/descriptor.md#registered-algorithms.
However, I still think it is not that easy (I guess sometimes even impossible) to run shasum with just the manifest.
Assume the example manifest from https://github.com/opencontainers/image-spec/blob/main/manifest.md#example-image-manifest.
It just tells us the "digest", but we don't know the filename. E.g. try
oras manifest fetch --pretty ..... | grep -o '"digest": "[^"]*' | grep -o '[^:]*$' | shasum -c --
For a shasum to work we need both filename and digest:
$ cat a
123
$ shasum -a 512 a >> sum
$ shasum -a 256 a >> sum
$ cat sum
ea2fe56bb8c1fb5ada84963b42ed71b764a74b092d75755173ade06f2f4aada9c00d6c302e185035cbe85fdff31698bca93e8661f0cbcef52cf2ff65864fd742 a
181210f8f9c779c26da1d9b2075bde0127302ee0e3fca38c9a83f5b1dd8e5d3b a
In theory one could work around the issue by attaching the filenames to the manifest as annotations, like:
{
  "mediaType": "application/vnd.oci.image.layer.v1.tar",
  "size": 14189,
  "digest": "sha256:181210f8f9c779c26da1d9b2075bde0127302ee0e3fca38c9a83f5b1dd8e5d3b",
  "annotations": {
    "org.opencontainers.image.title": "blah.blah"
  }
}
I still think and feel that manual validation is not the way to go :-)
Generating the checksum file is not easy with v1.1.0 but will be improved in v1.2.0. You can try the main build container, e.g. to generate a checksum file for mcr.microsoft.com/oss/kubernetes/kubectl@sha256:30019e253ab74eb3e38abae7b8997e8e60c420169044ca9bfaf9665f54ad18bc:
> docker run ghcr.io/oras-project/oras:main manifest fetch mcr.microsoft.com/oss/kubernetes/kubectl@sha256:30019e253ab74eb3e38abae7b8997e8e60c420169044ca9bfaf9665f54ad18bc --format '{{range .content.layers}}{{if index .annotations "org.opencontainers.image.title"}}{{.digest}} {{index .annotations "org.opencontainers.image.title"}}{{println}}{{end}}{{end}}'
sha256:f4740e5a3adde42224679263c7b4e76985411cb7a9504615cf1421d8afb078b5 provenance.json
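The output line above is not yet in the format that sha256sum -c reads, which is the hex checksum, two spaces, then the file name. A minimal sketch of the conversion, assuming sha256 digests only and a sed that supports -E (GNU and BSD sed both do); digest_lines_to_sums is a hypothetical helper name.

```shell
#!/bin/sh
# digest_lines_to_sums
# Hypothetical helper: reads "sha256:HEX NAME" lines on stdin and writes
# "HEX  NAME" lines on stdout. Lines that use any other algorithm, and blank
# lines, are dropped (assumption: only sha256 is of interest here).
digest_lines_to_sums() {
    sed -n -E 's/^sha256:([0-9a-f]{64}) (.+)$/\1  \2/p'
}

# Example, run in the folder that holds the pulled files:
# printf '%s\n' 'sha256:f4740e5a3adde42224679263c7b4e76985411cb7a9504615cf1421d8afb078b5 provenance.json' |
#     digest_lines_to_sums > sha256sums.txt
# sha256sum -c sha256sums.txt
```

In practice the input would come from piping the manifest fetch command above into the helper instead of the printf shown in the example.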
This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 30 days.
This issue was closed because it has been stalled for 30 days with no activity.
What is the version of your ORAS CLI
1.1.0
What would you like to be added?
Maybe I'm missing it, but assume I have a local copy of some pulled data. Is there an option to run a sha256 check or similar to verify that the local copy matches the artifacts listed in the manifest?
Why is this needed for ORAS?
The manifest can have multiple digest encodings, which can make it very tricky to manually verify whether the local copy is equal to the remote artifact.
Are you willing to submit PRs to contribute to this feature?