in-toto / witness

Witness is a pluggable framework for software supply chain risk management. It automates, normalizes, and verifies software artifact provenance.
https://witness.dev
Apache License 2.0

[Feat]: Add support for dirhash #446

Open matglas opened 6 months ago

matglas commented 6 months ago

Describe the solution you'd like:

Add support for creating a dirhash for subjects/products, as described in the DigestSet definition of the in-toto attestation spec.

User value:

This functionality would allow us to collapse folders into something that is easier to consume, for example the 1k small files in node_modules. These files are not valuable as separate subjects one by one, but as a whole they could be.

We have had a scenario where, because of the large number of small files, we generated a 2 GB file that captured a hash for each one. Using dirhashes could have reduced this considerably.

Expected behavior:

Instead of capturing a hash for each file, allow a dirhash to be captured (following the spec). Paths that match a glob pattern would be captured as a dirhash, and everything else would keep its per-file hashes.

Proposed solution:

Use an argument to specify which path or paths (glob based) should be captured as a dirhash.

Anything else you would like to add:

I already have the implementation for this.

Testing changes required:

Tests should be added to test dirhash functionality.

Documentation changes required:

Apart from documenting the CLI argument, no additional documentation is required, unless we want to create a tutorial with examples, for example for node_modules.

matglas commented 6 months ago

You can assign me.