lucasdicioccio / deptrack-project

monorepo for DepTrack

Plugin interface #4

Open moul opened 7 years ago

moul commented 7 years ago

Add a way to enrich the capabilities at runtime

Bonus: do not restrict plugins to being developed (100%) in Haskell

lucasdicioccio commented 7 years ago

Could you be more specific on what a plugin should represent and should be allowed to do?

(A) If a plugin encodes a protocol for performing check/turnup/turndown/reload actions, then it's pretty easy to have a "devops-compliant-binary" type in Haskell and let people write the up/down/reload actions in the language they love (e.g., we could fairly easily have a function that wraps /etc/rc.d/-style scripts). Then, using the various ways of building binaries, it's possible to express building and then calling a plugin in whatever language. In this interpretation, the plugin is a node but not a graph: it is allowed to exert power on the external world and it can have its own dependency system (as apt-get does when we install a package), but such a plugin cannot change the shape of the dependency graph. This is pretty easy to achieve and brings value; someone just needs to write the code, and the main constraint is developer time.
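To make (A) a bit more concrete, here is a minimal sketch of what such a "devops-compliant-binary" wrapper might look like. Everything in it is an assumption made for illustration, not DepTrack's actual API: the `Action`, `PluginBinary` and `Outcome` types, the `runAction` function, and the convention that the external program receives the action name as its single argument (rc.d style) and reports success through exit code 0.

```haskell
import System.Exit (ExitCode (..))
import System.Process (readProcessWithExitCode)

-- | The four actions of the hypothetical plugin protocol.
data Action = Check | TurnUp | TurnDown | Reload
  deriving (Show, Eq, Enum, Bounded)

-- | Command-line word handed to the plugin (an assumed convention,
-- loosely modelled on rc.d-style scripts).
actionArg :: Action -> String
actionArg Check    = "check"
actionArg TurnUp   = "start"
actionArg TurnDown = "stop"
actionArg Reload   = "reload"

-- | A plugin is nothing more than a path to an executable,
-- which may be written in any language.
newtype PluginBinary = PluginBinary FilePath

-- | Result of one action: success, or the exit code plus captured stderr.
data Outcome = Success | Failure Int String
  deriving (Show, Eq)

-- | Run a single action by invoking the binary with one argument.
-- The plugin can act on the outside world, but it has no way to
-- add nodes or edges to the dependency graph.
runAction :: PluginBinary -> Action -> IO Outcome
runAction (PluginBinary path) action = do
  (code, _stdout, err) <- readProcessWithExitCode path [actionArg action] ""
  pure $ case code of
    ExitSuccess   -> Success
    ExitFailure n -> Failure n err
```

With this shape, a node in the graph would only need to carry a `PluginBinary` and map its own check/up/down/reload hooks onto `runAction`; how the binary gets built or installed stays an ordinary dependency of that node.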

(B) What would be more ambitious, and I'd like to grind my axe on this challenge, is to have the external binary define a graph subset and add it to the current graph dynamically, for a specific notion of "dynamically". Such work would need to build upon what I have made for Delayed and Continued calls. The main caveat is that in order to infer what will be "Continued" we may need to perform IO. In general, one needs to start walking the dependency graph to know what's in a single "Continued/Delayed" node. Such a thing breaks all static analysis/batching that can be done ahead of time (there's an instance of the halting problem lurking here). One direction to relax this limitation is to force all plugins to be ready before the graph is explored (i.e., at graph-construction time), or to have another action in the Op object, which lives in IO and can pull a subgraph. Both ideas would be the moral equivalent of changing the DevOp monad to be based on something that is not Identity and hence is non-deterministic (I'd rather avoid that). Such a development is an interpretation of plugins as subgraphs that the Haskell 'runtime' code pulls into the main graph. I already want to do something like this with Continued/Delayed ops, but this requires some clever way of identifying individual nodes in a graph and making sure the subgraphs are deterministically rebuildable whilst built in a non-deterministic portion of the code (I'd call this fairly doable but dodgy).
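The tension in (B) can be shown with a toy model. This is only a sketch under my own assumptions: `Dep`, `Pending`, `staticNames` and `expand` are made-up names and have nothing to do with the real Delayed/Continued implementation. A `Pending` node carries a stable key plus an IO action that yields a subgraph, so a static pass sees only the key, while the full graph is known only after running IO.

```haskell
-- | A toy dependency tree (NOT DepTrack's real types).
data Dep
  = Node String [Dep]          -- statically known node and its dependencies
  | Pending String (IO [Dep])  -- stable key + IO action yielding a subgraph

-- | What an ahead-of-time analysis can see without performing any IO:
-- pending nodes are opaque and show up only through their key.
staticNames :: Dep -> [String]
staticNames (Node n ds)   = n : concatMap staticNames ds
staticNames (Pending k _) = ["<pending:" ++ k ++ ">"]

-- | Resolve every pending node and graft the loaded subgraph under its key.
-- Nothing stops a loader from returning further Pending nodes forever,
-- which is where the halting-problem flavour comes from.
expand :: Dep -> IO Dep
expand (Node n ds)      = Node n <$> traverse expand ds
expand (Pending k load) = do
  sub <- load
  Node k <$> traverse expand sub

-- | Example graph: one static dependency and one plugin-provided subgraph.
example :: Dep
example =
  Node "app"
    [ Node "db" []
    , Pending "plugin:cache" (pure [Node "redis" []])
    ]
```

Here `staticNames example` yields `["app","db","<pending:plugin:cache>"]`, whereas running `staticNames <$> expand example` yields `["app","db","plugin:cache","redis"]`. The key `"plugin:cache"` is the only part that stays the same across both views, which is why identifying nodes deterministically matters so much for this approach.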

(C) Another possible plugin interpretation is plugins as functions that manipulate the dependency graph on the fly. Maybe @abailly can weigh in with his past attempts at such workflow systems. IMHO the defining feature of DepTrack is to express a type-checked graph that can reasonably be inspected and debugged. We can indeed build alternative "defaultMain"s (and I have experimental code that can historize the graph + expose some constrained web API), but I'm worried that when the plugin system becomes rich enough to describe arbitrary manipulations, it becomes yet another programming language or some OSGi copycat.

In short: (A) definitely, it will happen naturally as I refactor stuff and find commonalities between, say, running Python, R, and Dotnet programs. (A) could even take the shape of a "YAML-plugin configuration". (B) is on my radar and is higher priority for me than (A) because I'd like to understand the design space where non-determinism is "fine". (C) taken in isolation seems reasonable, but I'd rather budget my time on (B) because it seems more promising.

abailly commented 7 years ago

Regarding (C), one piece of advice drawn from my experience is: don't do this! I worked for some time on a system based on those principles: nodes were functions that could produce other nodes and dependencies, leading to a graph that was impossible to reason about.

I think (A) is easy and as @lucasdicioccio said will arise quite naturally.

(B) is definitely interesting to explore, but it introduces side-effects and non-determinism into graph building, which somehow gets us back to (C). This is what currently happens in my experiments with remote execution of a locally built executable: the graph is truncated at the Remoted node and the remote execution part is not available.

A possible solution would be to expose those sub-graphs as usual DevOp-based functions that would be expandable through side-effects but would still be restricted to some known type for later reuse and sharing. The sub-graph would be grafted in place as an edge (or hyper-edge), not only as a node, thus providing more expressive power without compromising too much of the type-safety.