256lights / zb

An experiment in hermetic, reproducible build systems
MIT License

During realization, consider realizations whose build inputs are not in the store #39

Open zombiezen opened 1 week ago

zombiezen commented 1 week ago

Consider a hypothetical (simplified) build of a Go program where P (the final program) depends on G (the Go compiler), which in turn depends on C (the C compiler). To build derivation P, we only need a realization for G: C's output does not need to be present unless G's output contains references to C's output. This is especially useful once substituters are implemented, because trusted realization metadata can be downloaded (very cheaply) and the dependencies of a complex toolchain can then be skipped if only the toolchain itself is needed.
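To make the P/G/C example concrete, here is a minimal sketch of the "what has to be present" check. It is not zb's code: `requiredObjects`, the `references` map, and the single-letter object names are hypothetical stand-ins for store objects and the references recorded for their outputs.

```go
package main

import (
	"fmt"
	"sort"
)

// requiredObjects returns the store objects that must be present before a
// build can use the given inputs: each input's output plus everything
// reachable from it through references. (Hypothetical helper for illustration.)
func requiredObjects(inputs []string, references map[string][]string) []string {
	needed := make(map[string]bool)
	var visit func(obj string)
	visit = func(obj string) {
		if needed[obj] {
			return
		}
		needed[obj] = true
		for _, ref := range references[obj] {
			visit(ref)
		}
	}
	for _, in := range inputs {
		visit(in)
	}
	result := make([]string, 0, len(needed))
	for obj := range needed {
		result = append(result, obj)
	}
	sort.Strings(result)
	return result
}

func main() {
	// G's output holds no references to C's output, so building P needs only G.
	fmt.Println(requiredObjects([]string{"G"}, map[string][]string{"G": nil}))
	// If G's output did refer to C's output, C would have to be present too.
	fmt.Println(requiredObjects([]string{"G"}, map[string][]string{"G": {"C"}}))
}
```

The point of the sketch is that the set is driven by references in the outputs, not by the build-time dependency edges of the derivations.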

I started mocking this up while writing 064a3395253439bdf3a273e1309dabc918cffce3. Basically:

  1. Walk the derivation graph only performing realizations and hashing derivations. During this stage, you permit realizations that don't exist in the store. (Downloading new realization data from substituters is fine, but we want to avoid downloading new store objects.)
  2. Keep walking until you run out of realizations.
  3. You should now have a map of new realizations and a map of new hashes. If some path in the build graph leads from the desired outputs to a realization that doesn't exist in the store, without crossing a realization that does exist in the store, remove from the map any realizations that depend on that missing realization. Also remove every realization that doesn't exist in the store, along with the hashes for the corresponding derivations.
  4. Add any remaining realizations and hashes to the builder.
  5. Proceed with the realization as normal.

This effectively prunes the portions of the dependency graph that sit behind a realization that is already available locally.
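A rough sketch of the pruning in step 3, under one reading of the description, might look like the following. Everything here is an assumption for illustration: `pruneRealizations`, the `deps` map (derivation to its direct build inputs), and the `inStore` map (derivations with a known realization, mapped to whether that output is present locally) are hypothetical names, "depend on" is taken to be transitive, and the derivation hashes that would be dropped alongside each realization are left out.

```go
package main

import (
	"fmt"
	"sort"
)

// pruneRealizations returns the derivations whose realizations survive step 3.
// A derivation absent from inStore has no known realization at all.
func pruneRealizations(deps map[string][]string, inStore map[string]bool, wanted []string) []string {
	// Find missing realizations reachable from the wanted derivations
	// without crossing a realization that is present in the store.
	exposed := make(map[string]bool)
	seen := make(map[string]bool)
	var walk func(drv string)
	walk = func(drv string) {
		if seen[drv] {
			return
		}
		seen[drv] = true
		present, realized := inStore[drv]
		if realized && present {
			return // usable as-is; do not look behind it
		}
		if realized {
			exposed[drv] = true
		}
		for _, dep := range deps[drv] {
			walk(dep)
		}
	}
	for _, drv := range wanted {
		walk(drv)
	}

	// tainted reports whether drv depends (transitively) on an exposed
	// missing realization. The graph is assumed to be acyclic.
	memo := make(map[string]bool)
	var tainted func(drv string) bool
	tainted = func(drv string) bool {
		if v, ok := memo[drv]; ok {
			return v
		}
		memo[drv] = false
		for _, dep := range deps[drv] {
			if exposed[dep] || tainted(dep) {
				memo[drv] = true
				break
			}
		}
		return memo[drv]
	}

	// Keep only realizations that are in the store and not tainted.
	var kept []string
	for drv, present := range inStore {
		if present && !tainted(drv) {
			kept = append(kept, drv)
		}
	}
	sort.Strings(kept)
	return kept
}

func main() {
	deps := map[string][]string{"P": {"G"}, "G": {"C"}}
	// G is in the store; C is known only as realization metadata.
	fmt.Println(pruneRealizations(deps, map[string]bool{"G": true, "C": false}, []string{"P"}))
}
```

With the P/G/C graph, G's realization is kept and C's is dropped without invalidating G, because C can only be reached by crossing G. If instead G were known only as metadata and C were in the store, the sketch would keep C and leave G to be built.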