theupdateframework / specification

The Update Framework specification
https://theupdateframework.github.io/specification/

Clarify fetching targets with multiple hashes if one is missing #199

Open erickt opened 2 years ago

erickt commented 2 years ago

In https://theupdateframework.github.io/specification/v1.0.26/#fetch-target, when downloading targets with consistent snapshots enabled, it states:

... Otherwise, the filename is of the form HASH.FILENAME.EXT (e.g., c14aeb4ac9f4a8fc0d83d12482b9197452f6adf3eb710e3b1e2b79e8d14cb681.foobar.tar.gz), where HASH is one of the hashes of the targets file listed in the targets metadata file found earlier in step § 5.6 Update the targets role. In either case, the client MUST write the file to non-volatile storage as FILENAME.EXT.

Consider the case where we have a consistent snapshot repository, and a target foo with both sha-256 and sha-512 hashes listed. As written, it sounds like we pick one of the entries, say the sha-256, and fetch $SHA256.foo. If that file doesn't exist, the text suggests we should give up, even though $SHA512.foo might exist. Should we try to download it as well? go-tuf, for example, will try to download every hash-prefixed version.
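
To make the question concrete, here is a minimal Python sketch of the "try every hash prefix" behavior described above. It is not taken from go-tuf or python-tuf; `candidate_names`, `fetch_target`, and the `fetch` callable (assumed to return the file bytes, or `None` when the name is missing on the repository) are hypothetical names used only for illustration.

```python
import hashlib
import posixpath
from typing import Callable, Dict, List, Optional


def candidate_names(target_path: str, hashes: Dict[str, str]) -> List[str]:
    """Return one HASH.FILENAME.EXT path per listed hash, in metadata order."""
    dirname, basename = posixpath.split(target_path)
    return [posixpath.join(dirname, f"{digest}.{basename}")
            for digest in hashes.values()]


def fetch_target(target_path: str,
                 hashes: Dict[str, str],
                 fetch: Callable[[str], Optional[bytes]]) -> Optional[bytes]:
    """Try each hash-prefixed name in turn instead of giving up on the first miss.

    `fetch` is a hypothetical transport hook: bytes on success, None if missing.
    """
    for name in candidate_names(target_path, hashes):
        data = fetch(name)
        if data is None:
            continue  # this prefix is not on the repository; try the next one
        # Whichever name served the file, check it against every listed hash.
        if all(hashlib.new(alg, data).hexdigest() == digest
               for alg, digest in hashes.items()):
            return data
    return None


# Usage: a fake repository that only stores the sha512-prefixed copy.
payload = b"example target"
listed = {"sha256": hashlib.sha256(payload).hexdigest(),
          "sha512": hashlib.sha512(payload).hexdigest()}
repo = {listed["sha512"] + ".foo": payload}
result = fetch_target("foo", listed, repo.get)
```

With a client that only tries the first listed hash, the same repository would yield a failed download, which is the ambiguity this issue is about.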

mnm678 commented 2 years ago

I think it makes sense to try all of the hashes before giving up, similar to the behavior of go-tuf.

It looks like the python-tuf ng client only uses the first hash, and the legacy python-tuf client uses only the last hash.

Regardless, we should update this text to make the expected behavior clearer. What do others think the expected behavior should be?

trishankatdatadog commented 2 years ago

> I think it makes sense to try all of the hashes before giving up, similar to the behavior of go-tuf.

Great question, and I agree with this answer. That also means clarifying how the server should write consistent targets. (Should it use one or all of the hashes? To me, one is enough.)

lukpueh commented 2 years ago

Cross-referencing related general issue #198

lukpueh commented 2 years ago

Good question indeed! I somehow lean towards putting the onus on the server/repo to "duplicate" the files for all supported hashes (or rather just provide the corresponding redirects), so that the client can access the file using any of the hashes as prefix. This makes the client code simpler and potentially reduces client requests.
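
As an illustration of the server-side option, the sketch below writes a target under every supported hash prefix so that any prefix resolves. It uses hard links (falling back to copies) rather than the redirects mentioned above, and `publish_target`, `repo_dir`, and the default algorithm list are assumptions made for this example, not part of any TUF implementation.

```python
import hashlib
import os
import shutil
from pathlib import Path
from typing import Dict, Iterable


def publish_target(repo_dir: Path, target_path: str, data: bytes,
                   algorithms: Iterable[str] = ("sha256", "sha512")) -> Dict[str, str]:
    """Store `data` as HASH.FILENAME.EXT once per algorithm; return the hashes."""
    rel = Path(target_path)
    out_dir = repo_dir / rel.parent
    out_dir.mkdir(parents=True, exist_ok=True)
    hashes = {alg: hashlib.new(alg, data).hexdigest() for alg in algorithms}

    first = None  # first file actually written, reused as the link source
    for digest in hashes.values():
        dest = out_dir / f"{digest}.{rel.name}"
        if dest.exists():
            continue  # already published under this prefix
        if first is None:
            dest.write_bytes(data)
            first = dest
        else:
            try:
                os.link(first, dest)  # hard link avoids storing the bytes twice
            except OSError:
                shutil.copyfile(first, dest)  # fall back where links are unsupported
    return hashes
```

The returned dictionary has the same shape as the hashes listed for a target in the targets metadata, so a client could use any entry as the filename prefix.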