polydawn / repeatr

Repeatr: Reproducible, hermetic Computation. Provision containers from Content-Addressable snapshots; run using familiar containers (e.g. runc); store outputs in Content-Addressable form too! JSON API; connect your own pipelines! (Or, use github.com/polydawn/stellar for pipelines!)
https://repeatr.io
Apache License 2.0

built-in Memoization support #110

Closed warpfork closed 6 years ago

warpfork commented 6 years ago

Memoization support! Enable it by setting the REPEATR_MEMODIR env var.

You can see this in action by running the examples repeatedly, then running them again with REPEATR_MEMODIR=/tmp/memo ./example_runAll.sh. The second time you run with the memo dir set, you'll see a very different outcome: the same runrecords come out, but there is far less user output (most obviously for the example that invokes 'ls -la'), because the real processes aren't re-run.

The default is not to memoize; you must opt in by setting the env var. This is because full memoization may not be what a user expects when invoking repeatr run in an interactive terminal, especially if the formula is meant to produce side effects (as in the 'ls -la' example again).
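To make the opt-in behavior concrete, here's a rough sketch of the idea in Python. This is not repeatr's actual implementation: the `formula_key` and `run_memoized` names, the one-JSON-file-per-formula layout inside the memo dir, and the `execute` callback are all assumptions made up for illustration. Only the REPEATR_MEMODIR variable and the "same runrecord, no re-run" behavior come from the description above.

```python
import hashlib
import json
import os
from pathlib import Path


def formula_key(formula: dict) -> str:
    """Hash a formula's canonical JSON form to get a stable lookup key."""
    canonical = json.dumps(formula, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def run_memoized(formula: dict, execute, environ=None) -> dict:
    """Return a run record for `formula`, reusing a stored one when allowed.

    `execute` is a hypothetical callable that really runs the formula and
    returns a JSON-serializable run record.
    """
    environ = os.environ if environ is None else environ
    memo_dir = environ.get("REPEATR_MEMODIR")
    if not memo_dir:
        # Opt-in only: with the variable unset, always run for real.
        return execute(formula)

    # Assumed layout: one JSON file per formula hash inside the memo dir.
    record_path = Path(memo_dir) / (formula_key(formula) + ".json")
    if record_path.exists():
        # Memo hit: hand back the stored record without running anything.
        return json.loads(record_path.read_text(encoding="utf-8"))

    record = execute(formula)
    record_path.parent.mkdir(parents=True, exist_ok=True)
    record_path.write_text(json.dumps(record), encoding="utf-8")
    return record
```

On a memo hit the caller still gets the same run record, but `execute` is never called, which is why side effects like the 'ls -la' output go missing on repeat runs.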

It's expected that most "planner" tools (see ecosystem diagram) that generate formulas and trigger swaths of repeatr run invocations will set the REPEATR_MEMODIR var. So, although memoization is opt-in, we expect that in the big picture users will rarely be bothered by it, since another layer of tooling can set the var for them.
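As a sketch of what that extra layer might look like, a planner-style wrapper could set the variable once and pass it to every child process. The `memo_env` and `run_batch` helpers below are hypothetical names, not part of repeatr or any existing planner, and the commands are passed in as plain argv lists so that no particular repeatr command-line syntax is assumed.

```python
import os
import subprocess


def memo_env(memo_dir, base_env=None):
    """Copy an environment and opt it in to memoization."""
    env = dict(os.environ if base_env is None else base_env)
    env["REPEATR_MEMODIR"] = str(memo_dir)
    return env


def run_batch(commands, memo_dir):
    """Run each command (an argv list) with REPEATR_MEMODIR set.

    Returns the list of exit codes, one per command.
    """
    env = memo_env(memo_dir)
    return [subprocess.run(argv, env=env).returncode for argv in commands]
```

For instance, `run_batch([["./example_runAll.sh"]], "/tmp/memo")` would be roughly equivalent to the REPEATR_MEMODIR=/tmp/memo ./example_runAll.sh invocation mentioned earlier, without the user having to type the variable.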