mirage / ocaml-crunch

Convert a filesystem into a static OCaml module
Other
73 stars 21 forks source link

provide deterministic output by using a Map.Make(String) instead of a hashtable #51

Closed hannesm closed 4 years ago

hannesm commented 4 years ago

this is crucial for reproducible builds of unikernels that use crunch

hannesm commented 4 years ago

//cc @mirage/core any objections to this change (including the API change to avoid global mutable state)? It'd be great to get this released soon

samoht commented 4 years ago

LGTM

What about the date? Does it still make sense to include it in the generated file?

hannesm commented 4 years ago

the date embedded into a crunched module is either Unix.time () or SOURCE_DATE_EPOCH, which is fine with me (and the reproducible builds people https://reproducible-builds.org/specs/source-date-epoch/).

it is nice to have a last_modified (as required in our mirage-kv interface). another solution would be to use the modification time of individual files and directories (but then, reproducibility is depending on the mtime), thus the current approach is fine with me.

avsm commented 4 years ago

Definitely fine by me

hannesm commented 4 years ago

I checked the clients of ocaml-crunch (which are released to opam), and none of them use the API of crunch directly (this PR changes that API), but execute the command-line utility.

In addition, it is noteworthy that if you'd used the API before and wanted to crunch multiple separate artifacts, you'd get the wrong output after the first one, since the hash tables were not cleared between runs. This PR fixes this as well (by making the state explicit).