c2pa-org / public-draft

Repository for the public drafts of the C2PA Specifications
Creative Commons Attribution 4.0 International
35 stars 1 forks source link

Replace the "exclusion" range #29

Closed hackerfactor closed 2 years ago

hackerfactor commented 2 years ago

Section 9.4.2 and 17.5.1 defines an exclusion range. This permits computing a checksum and storing it, without modifying the checksum based on the data being added. In addition, it defines a "pad" range. This way, you can compute the checksum and then add in the data. If the data changes size, then the pad can be used to cover an change in the data size.

I'm wondering why you don't use the same appoached used by the ICC Profile: Compute the checksum over the entire range and only (automatically exclude the bytes used for storing the checksum. This way, you don't need to explicitly define an exclusion range.

In addition, the padding can be part of the checksum storage. Fixed-length checksums (e.g., sha256) do not need padding. Variable-length signatures (e.g., x509) can have null bytes as padding at the end of the checksum. Alternately, the checksum value can start with a two-byte length and then include the checksum. In this case, the jumbf or cbor item has a length for the hash value that covers the two-byte encoded length, binary hash, and any padding (all in one field).

Having a large exclusion range, such as one that excludes the XMP and adjaced JUMBF (as mentioned in 17.5.1) permits an attacker to easily modify the entire JUMBF block or modify other content without any alterations.

lrosenthol commented 2 years ago

I'm wondering why you don't use the same appoached used by the ICC Profile:

If the only thing that would be excluded were the JUMBF block (aka C2PA Manifest Store), then you are correct, we could indeed take that approach. And, in the upcoming update to the draft specification, you will find a case where we do leverage that.

However, we have examples/use-cases where the claim generator wishes to exclude other areas of the asset that it doesn't want included in the hash - which is also why the ranges are an array.

lrosenthol commented 2 years ago

Duplicate of #37 - closing