c2pa-org / public-draft

Repository for the public drafts of the C2PA Specifications
Creative Commons Attribution 4.0 International

Request to change exclusion to inclusion list #37

Closed hackerfactor closed 2 years ago

hackerfactor commented 2 years ago

Section 17.6 defines an optional exclusion array. The motivation appears to be that some portions of the JUMBF block may change during generation (e.g., when a signature is generated).

Rather than using an optional exclusion array that contains a variable number of elements, please consider using a required "inclusion" array that identifies each byte offset and length to include in the hash generation.

This dramatically simplifies processing since (1) it becomes a required element, and (2) there is no implicit assumption that the hash covers the entire file "except for the following".
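To make the two models concrete, here is a minimal sketch of both hashing styles. It is not taken from the C2PA specification: the `(offset, length)` tuple layout, the function names, and the choice of SHA-256 are assumptions made for illustration only.

```python
import hashlib


def hash_included(data: bytes, inclusions: list[tuple[int, int]]) -> str:
    """Hash only the listed (offset, length) ranges, in the order given.

    Hypothetical helper: the range layout is an assumption, not the spec's.
    """
    h = hashlib.sha256()
    for offset, length in inclusions:
        if offset < 0 or length < 0 or offset + length > len(data):
            raise ValueError("inclusion range falls outside the file")
        h.update(data[offset:offset + length])
    return h.hexdigest()


def hash_excluding(data: bytes, exclusions: list[tuple[int, int]]) -> str:
    """Hash the entire file except the listed (offset, length) ranges."""
    h = hashlib.sha256()
    pos = 0
    for offset, length in sorted(exclusions):
        h.update(data[pos:offset])           # bytes between exclusions
        pos = max(pos, offset + length)      # skip the excluded range
    h.update(data[pos:])                     # everything after the last one
    return h.hexdigest()


# Toy "file": 6 header bytes, a 5-byte JUMBF placeholder, 6 payload bytes.
data = b"HEADER" + b"JUMBF" + b"PIXELS"
inclusion_digest = hash_included(data, [(0, 6), (11, 6)])
exclusion_digest = hash_excluding(data, [(6, 5)])
```

On an unmodified file the two digests agree, because both cover the same bytes. The difference is in what is implied about bytes that are not listed: the inclusion form says nothing about them, while the exclusion form hashes them by default.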

This change permits more flexible hash definitions. For example, one JUMBF block may include a hash with an inclusion range from 0 to 12345 (where offset 12345 is right before the JUMBF block). Any JUMBF block added later no longer needs to include a hash that covers bytes 0 to 12345, since that range is already covered by a previous block.

Moreover, it also allows new JUMBF blocks to be added at the end of the file. For example, if a JPEG's stream (ffda) begins at offset 123456 and ends at 333333, then the initial JUMBF block may have one hash that covers 0 to 123455 and a second hash that covers 123456-333333. A later edit may insert another JUMBF block at 123456 with a hash that covers the new stream location. In effect, each inclusion range permits future alterations to be recorded without necessarily disturbing all of the previous hashes.

lrosenthol commented 2 years ago

The original design of C2PA used inclusion lists. We changed due to a series of attack vectors that were identified with inclusions that are prevented by exclusions, specifically those in which data could be added to the file that would change the visual display without impacting the validation process.

hackerfactor commented 2 years ago

For that situation (added content that can change the visual display): it would be clear from the provenance information that the additional data was not verifiable/authoritative.

lrosenthol commented 2 years ago

It wouldn't though, @hackerfactor, since the hashes would match (when using an inclusion range). And since there is no "file size" information, the validator has no way of knowing whether the extra data beyond the range is incorrect or was left unhashed for some valid reason.
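The scenario described above can be illustrated with a small sketch. This is an illustration only, not the C2PA validation algorithm: the `(start, length)` ranges, the helper names, the toy byte strings, and the use of SHA-256 are all assumptions.

```python
import hashlib


def sha256_ranges(data: bytes, ranges: list[tuple[int, int]]) -> str:
    """Hash the given (start, length) ranges of data, in order."""
    h = hashlib.sha256()
    for start, length in ranges:
        h.update(data[start:start + length])
    return h.hexdigest()


def complement(exclusions: list[tuple[int, int]], size: int) -> list[tuple[int, int]]:
    """Convert exclusion ranges into the ranges that actually get hashed."""
    out, pos = [], 0
    for start, length in sorted(exclusions):
        if start > pos:
            out.append((pos, start - pos))
        pos = max(pos, start + length)
    if pos < size:
        out.append((pos, size - pos))
    return out


# Toy file: 6 header bytes, an 8-byte manifest placeholder, 6 payload bytes.
original = b"HEADER" + b"MANIFEST" + b"PIXELS"
inclusions = [(0, 6), (14, 6)]   # hash header and payload only
exclusions = [(6, 8)]            # hash everything except the manifest

signed_incl = sha256_ranges(original, inclusions)
signed_excl = sha256_ranges(original, complement(exclusions, len(original)))

# Someone appends data after signing.
tampered = original + b"EXTRA"

# Inclusion list: the listed ranges are unchanged, so the hash still matches.
incl_ok = sha256_ranges(tampered, inclusions) == signed_incl

# Exclusion list: the appended bytes are hashed by default, so it fails.
excl_ok = sha256_ranges(tampered, complement(exclusions, len(tampered))) == signed_excl
```

Here `incl_ok` is `True` and `excl_ok` is `False`: with inclusion ranges the appended bytes sit outside every hashed range and go unnoticed, whereas with exclusions any byte not explicitly excluded contributes to the digest.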

lrosenthol commented 2 years ago

Closing this since it will not be changed.