polydawn / repeatr

Repeatr: Reproducible, hermetic Computation. Provision containers from Content-Addressable snapshots; run using familiar containers (e.g. runc); store outputs in Content-Addressable form too! JSON API; connect your own pipelines! (Or, use github.com/polydawn/stellar for pipelines!)
https://repeatr.io
Apache License 2.0

Sketch pipelines #67

Closed warpfork closed 8 years ago

warpfork commented 8 years ago

This is the first draft of mechanisms behind pipelining. Lots of the infrastructure in repeatr to date is about defining highly isolated pieces of work, then helping refine that definition of work until the results are effectively immutable. Now it's becoming time to start building in the other direction, as well: knitting pieces of work together, and making it possible to pump updates triggered by changing any one piece of data all the way through a series of transitions to produce something that's both new... and reproducible and auditable once we've charted the way.

To that end, this sketch introduces a new layer of configuration -- the Commission. Commissions are much like Formulas: they list inputs, actions to perform on them, and outputs to capture afterwards. The difference is that Commissions are allowed to refer to things by rough "names", which are human readable and mutable -- where Formulas refer to inputs by hashes like git commits, Commissions refer to their inputs by names comparable to git branches. Both layers are important: Formulas are completely repeatable descriptions of work because they continue to pin all inputs precisely; Commissions are less precise, but by producing Formulas from a Commission, we can get the best of both worlds.
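To make the Commission/Formula distinction concrete, here is a minimal Go sketch. Every type, field, and function name in it (`Commission`, `Formula`, `Pin`) is a hypothetical illustration rather than repeatr's actual API, and the name-to-hash lookup is reduced to a plain map purely for the example.

```go
package main

import "fmt"

// NOTE: all names below are hypothetical; this is not repeatr's real API.

// Commission lists its inputs by mutable, human-readable name
// (mount path -> name), comparable to a git branch.
type Commission struct {
	Inputs map[string]string
	Action []string
}

// Formula pins every input by hash (mount path -> ware hash),
// comparable to a git commit.
type Formula struct {
	Inputs map[string]string
	Action []string
}

// Pin produces a Formula from a Commission by swapping each input name
// for the hash that the given lookup table currently lists for it.
func Pin(c Commission, names map[string]string) (Formula, error) {
	f := Formula{Inputs: make(map[string]string, len(c.Inputs)), Action: c.Action}
	for path, name := range c.Inputs {
		hash, ok := names[name]
		if !ok {
			return Formula{}, fmt.Errorf("no hash known for name %q", name)
		}
		f.Inputs[path] = hash
	}
	return f, nil
}

func main() {
	c := Commission{
		Inputs: map[string]string{"/": "base-image", "/src": "project-source"},
		Action: []string{"make", "build"},
	}
	names := map[string]string{"base-image": "hash-aaa", "project-source": "hash-bbb"}

	f1, err := Pin(c, names)
	if err != nil {
		panic(err)
	}

	// Moving the name to a new hash changes the next Formula,
	// while the Formula produced earlier stays pinned.
	names["project-source"] = "hash-ccc"
	f2, err := Pin(c, names)
	if err != nil {
		panic(err)
	}
	fmt.Println(f1.Inputs["/src"], f2.Inputs["/src"]) // hash-bbb hash-ccc
}
```

The point of the split is visible in `main`: the same Commission yields different Formulas as names move, but each Formula on its own remains a complete, repeatable description of work.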

The mapping from names to hashes is performed by another new structure called a Catalog (catalog.Book in the code). Catalogs list a series of names, and tell you which hash each name should resolve to. When you want to publish a new release of a product? Publish a new edition of the Catalog with that new hash. Commissions which consume that Catalog name will be automatically triggered to emit a new formula by...

... the Foreman! The Foreman is an actor upon a KnowledgeBase which contains a whole suite of related Commissions and Catalogs, and the Formulas they've produced and Wares they all reference. The Foreman listens for new Catalogs and Commissions, and evaluates them to produce Formulas... which then are scheduled to run on an Executor (this is the old familiar turf, where we simply expect "formula in -> (hopefully deterministic) outputs out"). When the Executor returns results, the output wares may be fed back into releasing a new edition of a Catalog. This may continue to flow through a whole graph of dependent Commissions -- making it possible to update one ware and watch updated builds depending on it, and depending on things that depend on it, and so on... flow through the whole system automatically. :tada:
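The release-and-trigger loop described above can be sketched roughly as follows. This is a toy model under stated assumptions, not the code in this branch: all names (`Foreman`, `Release`, `Executor`, and the struct fields) are hypothetical, each Commission consumes a single catalog name, the Executor is a fake that derives an output "hash" from a string, and the graph of Commissions is assumed to be acyclic.

```go
package main

import "fmt"

// NOTE: every name here is hypothetical; none of it is repeatr's real API.

// Formula pins its single input by hash.
type Formula struct {
	InputHash string
	Action    string
}

// Commission consumes one catalog name and releases its output under another.
type Commission struct {
	Input   string // catalog name to consume
	Action  string
	Publish string // catalog name the output ware is released under
}

// Executor is the familiar turf: formula in, output ware hash out.
type Executor func(Formula) string

// Foreman acts on a tiny knowledge base: catalog entries plus commissions.
type Foreman struct {
	Catalog     map[string]string // name -> current hash
	Commissions []Commission
	Run         Executor
	Log         []Formula // every formula emitted, kept for auditing
}

// Release records a new hash for a name, then evaluates every commission
// that consumes that name. Each output is released in turn, so an update
// flows through all dependents. Assumes the commission graph has no cycles.
func (f *Foreman) Release(name, hash string) {
	if f.Catalog[name] == hash {
		return // nothing changed, so nothing to trigger
	}
	f.Catalog[name] = hash
	for _, c := range f.Commissions {
		if c.Input != name {
			continue
		}
		formula := Formula{InputHash: hash, Action: c.Action}
		f.Log = append(f.Log, formula)
		f.Release(c.Publish, f.Run(formula))
	}
}

func main() {
	fm := &Foreman{
		Catalog: map[string]string{},
		Commissions: []Commission{
			{Input: "source", Action: "compile", Publish: "binary"},
			{Input: "binary", Action: "package", Publish: "installer"},
		},
		// Stand-in executor: a deterministic fake, not a real container run.
		Run: func(f Formula) string { return f.Action + "(" + f.InputHash + ")" },
	}
	fm.Release("source", "hash-v1")
	fmt.Println(fm.Catalog["installer"]) // package(compile(hash-v1))
}
```

Releasing a new hash for `source` is the only explicit call; both downstream stages run as a consequence, and `Log` keeps the pinned Formulas that were emitted along the way. A real Foreman would presumably schedule work asynchronously rather than recurse, which this sketch deliberately ignores.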

And many other miscellaneous bits:

Other features hinted at in the future but as yet deferred to later rounds of drafts:

There's no connection to the main() method yet -- no config, nothing -- this is still purely sketching, self-consistency testing, and a couple judicious but extremely visible duct-tape placeholders. But it is demoing multi-stage pipelines, automatically triggering evaluation between dependents in response to updates. So that's pretty cool. Enough to keep iterating on.

timthelion commented 8 years ago

You know, everyone trashes on English for having crazy spelling rules, but I'd have to say that it's way more fun to abuse English spelling than in a phonetically spelled language like Czech :D. Still don't think it makes up for all those weekly spelling bees, though.

warpfork commented 8 years ago

Making fun of my opinionated, artisanal spelling of "evokation"? Hush, you!

timthelion commented 8 years ago

"artisanal spelling" :D

brb, I'm off to get my masters degree in mispronuciation.

warpfork commented 8 years ago

Merging to thunderous applause (cough) because I wanna get on with some refactors on master that'll make a real hash of this branch if it doesn't fold back in first. :)