fugue-project / adagio

A Dag IO framework
Apache License 2.0

[FEATURE] Plotting task graphs #29

Open rdmolony opened 2 years ago

rdmolony commented 2 years ago

Is your feature request related to a problem? Please describe.

I find task execution graphs really useful for communicating workflows to non-technical audiences. Specifically, they are interested in exactly what is being performed on the data to get it from A to B. However, this can get complex fast for distributed workflows!

dask: image

prefect: image

The dask graph is really hard to follow!

Describe the solution you'd like

The dask graph is for the nerds. I'd like a higher-level prefect-style graph instead, for non-technical audiences and documentation. prefect uses graphviz to create a node for each task in its task graph and then plots them (I'm slowly learning @kvnkho :) - https://github.com/PrefectHQ/prefect/blob/5de58efaba956b431335d99acab07eaf6a362e1b/src/prefect/core/flow.py#L1290), which I suspect could be adapted here.
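To make the idea concrete, here is a minimal sketch, assuming a workflow can be reduced to a mapping from task name to the names of its upstream tasks. The `example_tasks` dict and the `to_dot` helper are hypothetical and are not part of adagio, fugue, or prefect. It only emits Graphviz DOT text with the standard library, so rendering is left to the `dot` command or the `graphviz` Python package:

```python
from typing import Dict, List


def to_dot(tasks: Dict[str, List[str]], name: str = "workflow") -> str:
    """Render a task -> upstream-tasks mapping as Graphviz DOT text.

    Hypothetical helper: task names are assumed not to contain double quotes.
    """
    lines = [f"digraph {name} {{", "  rankdir=LR;"]
    # One node per task, sorted so the output is deterministic.
    for task in sorted(tasks):
        lines.append(f'  "{task}";')
    # One edge per dependency, pointing from upstream to downstream.
    for task in sorted(tasks):
        for upstream in tasks[task]:
            lines.append(f'  "{upstream}" -> "{task}";')
    lines.append("}")
    return "\n".join(lines)


# Hypothetical three-step workflow: load -> clean -> save.
example_tasks = {"load": [], "clean": ["load"], "save": ["clean"]}
dot_text = to_dot(example_tasks)
print(dot_text)
```

Saving the printed text to a file and running `dot -Tsvg` on it would give a vector image, which may suit documentation better than a pixel-based one.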

Describe alternatives you've considered

It's already possible to leverage the dask graph by writing a dask function and using the dask execution engine, so perhaps there is no need to do this natively!

rdmolony commented 2 years ago

Vision: interactive vector-based task graph

Easiest: graphviz (pixel-based)

Resources:

FugueWorkflow.visualize() has to run locally

rdmolony commented 1 year ago

Apologies team, I haven't made time to work on this.

I spotted Mermaid.js (https://github.com/mermaid-js/mermaid), which reminded me of this PR, though I imagine other Python libs have since worked out a non-graphviz solution.
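For illustration, a rough sketch of what a Mermaid-based version could look like, assuming the same kind of task-to-upstream mapping and that every upstream name is itself a key in the mapping. `to_mermaid` and `mermaid_example` are hypothetical names, not an existing adagio or fugue API. The output is plain Mermaid flowchart text, which renders natively in GitHub markdown without graphviz installed:

```python
from typing import Dict, List


def to_mermaid(tasks: Dict[str, List[str]]) -> str:
    """Render a task -> upstream-tasks mapping as Mermaid flowchart text.

    Hypothetical helper: assumes every upstream task is also a key in `tasks`.
    """
    lines = ["flowchart LR"]
    # Mermaid node ids must be simple tokens, so map each task to a short id.
    ids = {task: f"t{i}" for i, task in enumerate(sorted(tasks))}
    # Declare each node with its readable label.
    for task in sorted(tasks):
        lines.append(f'    {ids[task]}["{task}"]')
    # Draw one arrow per dependency, from upstream to downstream.
    for task in sorted(tasks):
        for upstream in tasks[task]:
            lines.append(f"    {ids[upstream]} --> {ids[task]}")
    return "\n".join(lines)


# Hypothetical three-step workflow: load -> clean -> save.
mermaid_example = {"load": [], "clean": ["load"], "save": ["clean"]}
mermaid_text = to_mermaid(mermaid_example)
print(mermaid_text)
```

Pasting the printed text into a fenced `mermaid` block in a README or issue comment should be enough to display the graph.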

Other: