bionode / bionode-watermill

💧Bionode-Watermill: A (Not Yet Streaming) Workflow Engine
https://bionode.gitbooks.io/bionode-watermill/content/
MIT License

Pipeline Graph (DAG) Visualization #50

Closed thejmazz closed 7 years ago

thejmazz commented 7 years ago

It is useful to have a visual representation of the Directed Acyclic Graph (DAG) that is produced during the execution of a pipeline.

In the graph,

The redux reducer for the DAG is here. It uses graph.js.

The graph lives in the store under the key collection, i.e. a valid selector would be (state) => state.collection.

A function jsonifyGraph is also exported, because the graph object from graph.js is not serializable. jsonifyGraph creates a serializable JSON representation of the graph.
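To illustrate why a separate serialization step is needed, here is a rough sketch. The `Map`-based graph is only a stand-in (it is not the graph.js object or its API), and `toSerializable` is a hypothetical helper, not watermill's actual jsonifyGraph.

```javascript
// Stand-in graph: adjacency kept in a Map. This is NOT the graph.js API;
// all names in this sketch are hypothetical.
const adjacency = new Map([
  ['task-a', ['task-b', 'task-c']],
  ['task-b', ['task-c']],
  ['task-c', []]
])

// JSON.stringify ignores Map entries, so the structure is silently lost.
const naive = JSON.stringify(adjacency)

// Walk the Map explicitly and emit plain arrays and objects instead.
function toSerializable (adj) {
  const nodes = []
  const edges = []
  for (const [source, targets] of adj) {
    nodes.push({ id: source })
    for (const target of targets) {
      edges.push({ source, target })
    }
  }
  return { nodes, edges }
}

const serializable = toSerializable(adjacency)
console.log(naive)
console.log(JSON.stringify(serializable, null, 2))
```

The point is only that the graph has to be walked and copied into plain objects before it can be logged or written anywhere.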

See here how the collection (aka the DAG) is logged during task resolution for debugging.

A first implementation of this could be to write the JSON graph to disk during pipeline execution, overwriting the previous file whenever an ADD_OUTPUT or ADD_JUNCTION_VERTEX action is dispatched (i.e. whenever the state of the DAG changes). This way, if a task fails, we at least have the most recent graph stored.
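One way this could look is a small Redux middleware. This is a sketch under assumptions: `persistGraph`, the `serialize` argument (standing in for jsonifyGraph), the `uid` payload field, and the output file name are all hypothetical. Only the two action types and the `collection` state key come from this issue. A tiny fake store is used so the sketch runs without redux installed.

```javascript
const fs = require('fs')
const os = require('os')
const path = require('path')

// Action types named in this issue; everything else below is hypothetical.
const GRAPH_ACTIONS = new Set(['ADD_OUTPUT', 'ADD_JUNCTION_VERTEX'])

// Standard Redux middleware shape: store => next => action.
// `serialize` stands in for jsonifyGraph; `file` is where the snapshot goes.
const persistGraph = (file, serialize) => (store) => (next) => (action) => {
  const result = next(action) // let the reducers update the state first
  if (GRAPH_ACTIONS.has(action.type)) {
    const json = JSON.stringify(serialize(store.getState().collection))
    // Write to a temp file, then rename, so a crash mid-write does not
    // leave a truncated graph behind.
    const tmp = file + '.tmp'
    fs.writeFileSync(tmp, json)
    fs.renameSync(tmp, file)
  }
  return result
}

// Tiny fake store so the sketch runs without redux.
function fakeStore () {
  let state = { collection: { nodes: [] } }
  const store = { getState: () => state }
  const reduce = (action) => {
    if (action.type === 'ADD_OUTPUT') {
      state = { collection: { nodes: [...state.collection.nodes, action.uid] } }
    }
    return action
  }
  return { store, reduce }
}

const file = path.join(os.tmpdir(), 'watermill-graph-sketch.json')
const { store, reduce } = fakeStore()
const dispatch = persistGraph(file, (graph) => graph)(store)(reduce)

dispatch({ type: 'ADD_OUTPUT', uid: 'task-a' })
dispatch({ type: 'SOMETHING_ELSE' }) // ignored: the DAG did not change
dispatch({ type: 'ADD_OUTPUT', uid: 'task-b' })

const saved = JSON.parse(fs.readFileSync(file, 'utf8'))
console.log(saved)
```

Writing after `next(action)` means the file always reflects the state that includes the action that triggered the write.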

Then it is a matter of parsing that JSON into a visualization using something like d3.

Suggestions to improve the way the graph is handled within watermill are welcome. Perhaps there is a better serializable format to use (e.g. graphml format).
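As a rough idea of what a GraphML export could involve, the sketch below converts a plain object of nodes and edges into a minimal GraphML document. The input shape (`{ nodes: [{ id }], edges: [{ source, target }] }`) and the `toGraphML` and `escapeXml` helpers are assumptions for illustration, not something watermill currently provides.

```javascript
// Escape characters that are special in XML attribute values.
function escapeXml (value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
}

// Convert a hypothetical { nodes: [{ id }], edges: [{ source, target }] }
// object into a minimal GraphML document with directed edges.
function toGraphML ({ nodes, edges }) {
  const lines = [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<graphml xmlns="http://graphml.graphdrawing.org/xmlns">',
    '  <graph id="pipeline" edgedefault="directed">'
  ]
  for (const node of nodes) {
    lines.push(`    <node id="${escapeXml(node.id)}"/>`)
  }
  edges.forEach((edge, i) => {
    const source = escapeXml(edge.source)
    const target = escapeXml(edge.target)
    lines.push(`    <edge id="e${i}" source="${source}" target="${target}"/>`)
  })
  lines.push('  </graph>', '</graphml>')
  return lines.join('\n')
}

const graphml = toGraphML({
  nodes: [{ id: 'task-a' }, { id: 'task-b' }],
  edges: [{ source: 'task-a', target: 'task-b' }]
})
console.log(graphml)
```

Task metadata would need GraphML `<key>` and `<data>` elements, which this sketch leaves out.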

BONUS

thejmazz commented 7 years ago

See the current JSON representation here. Note the duplicated logging of nodes if they are children of other nodes. This JSON graph structure is probably not ideal, or at least, should be created from another structure of { nodes: [], edges: []}.
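A sketch of how the duplication could be removed by flattening a nested structure into { nodes: [], edges: [] }. The nested input shape (`id` plus inline `children`) is a guess at the kind of structure described above, not the exact JSON watermill emits, and `flatten` is a hypothetical helper.

```javascript
// Hypothetical nested shape: each node carries its children inline, so a
// node with two parents ('c' here) appears twice in the JSON.
const nested = [
  { id: 'a', children: [{ id: 'c', children: [] }] },
  { id: 'b', children: [{ id: 'c', children: [] }] }
]

function flatten (roots) {
  const nodes = new Map() // id -> node, so each node is recorded once
  const edgeKeys = new Set() // guards against duplicate edges
  const edges = []

  const visit = (node) => {
    if (!nodes.has(node.id)) nodes.set(node.id, { id: node.id })
    for (const child of node.children) {
      const key = JSON.stringify([node.id, child.id])
      if (!edgeKeys.has(key)) {
        edgeKeys.add(key)
        edges.push({ source: node.id, target: child.id })
      }
      visit(child)
    }
  }

  roots.forEach(visit)
  return { nodes: [...nodes.values()], edges }
}

const flat = flatten(nested)
console.log(JSON.stringify(flat, null, 2))
```

Here 'c' ends up as a single node with two incoming edges instead of being logged once per parent.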

It could be useful to use a more standard graph format. The most ideal would be:

Some links:

Even OBO could work: this could have nodes for a file, which might have an edge "created_by" and "used_by" to different task nodes
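To make the file/task idea concrete, here is a sketch that builds file and task nodes with labelled "created_by" and "used_by" edges. It uses plain objects rather than actual OBO syntax, and the task record shape (`id`, `inputs`, `outputs`) and the file names are made up for illustration.

```javascript
// Hypothetical task records; watermill's real task objects may look different.
const tasks = [
  { id: 'download', inputs: [], outputs: ['reads.fastq'] },
  { id: 'align', inputs: ['reads.fastq'], outputs: ['aligned.bam'] }
]

function fileTaskGraph (taskList) {
  const nodes = new Map()
  const edges = []

  // Prefix ids with the node type so a file and a task can never collide.
  const addNode = (type, name) => {
    const id = `${type}:${name}`
    if (!nodes.has(id)) nodes.set(id, { id, type, name })
    return id
  }

  for (const task of taskList) {
    const taskId = addNode('task', task.id)
    for (const file of task.outputs) {
      edges.push({ source: addNode('file', file), target: taskId, label: 'created_by' })
    }
    for (const file of task.inputs) {
      edges.push({ source: addNode('file', file), target: taskId, label: 'used_by' })
    }
  }

  return { nodes: [...nodes.values()], edges }
}

const provenance = fileTaskGraph(tasks)
console.log(JSON.stringify(provenance, null, 2))
```

Each file appears once and points at the task that created it and at every task that consumed it.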

tiagofilipe12 commented 7 years ago

We now have this kind of structure and a simple graph visualization with d3, available at localhost:8084 while watermill is running. It still lacks operationString, though.

tiagofilipe12 commented 7 years ago

Shall we close this? Of course, the graph visualization can be improved further, but for now we have a simple DAG visualization tool.

thejmazz commented 7 years ago

Yes, let's close for now. We can always open an issue for more specific improvements.