Using a different starting layout for the provenance graph

fedarko commented 5 years ago

Improvement Description

This change would involve using a different Cytoscape.js layout method for the provenance graph, in particular one that is aware of the graph's DAG structure.

This is a relatively minor issue, but I think it could improve the user experience with view.qiime2.org a decent amount (especially since changes to the provenance graph layout get reset on switching to the visualization/peek tabs, as described in #74).

Current Behavior

The current layout is the built-in grid layout, with what look like precomputed row and column positions for nodes in the graph.

https://github.com/qiime2/q2view/blob/fc207585edcbcde574171c278eb85df702c916b5/app/pages/Provenance/dag.jsx#L9-L18

This can lead to some mildly funky layouts that require manual fixing to see node/edge details—see e.g. the following screenshot, which I just took today:

Proposed Behavior

Use a Cytoscape.js layout extension like Dagre (example, GitHub) or Klay (example, GitHub) to lay out the graph in a more appealing way. Something like the built-in breadthfirst layout (example) might also be usable.
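To illustrate what "aware of the graph's DAG structure" buys, here is a minimal sketch of longest-path layering, the kind of ranking step that layered layouts build on. It is not Dagre's or Klay's actual algorithm, and the node names and the `assignLayers` helper are hypothetical.

```javascript
// Assign each node a layer so that every edge points from a lower layer to a
// higher one (longest-path layering over a topological order).
// `nodes` is an array of ids; `edges` is an array of [source, target] pairs.
function assignLayers(nodes, edges) {
  const indegree = new Map(nodes.map((id) => [id, 0]));
  const children = new Map(nodes.map((id) => [id, []]));
  for (const [source, target] of edges) {
    children.get(source).push(target);
    indegree.set(target, indegree.get(target) + 1);
  }

  const layer = new Map(nodes.map((id) => [id, 0]));
  const queue = nodes.filter((id) => indegree.get(id) === 0);
  let visited = 0;
  while (queue.length > 0) {
    const id = queue.shift();
    visited += 1;
    for (const child of children.get(id)) {
      // A child sits at least one layer below its deepest parent.
      layer.set(child, Math.max(layer.get(child), layer.get(id) + 1));
      indegree.set(child, indegree.get(child) - 1);
      if (indegree.get(child) === 0) queue.push(child);
    }
  }
  if (visited !== nodes.length) throw new Error("graph contains a cycle");
  return layer;
}

// Hypothetical provenance graph; the names are illustrative only.
const exampleLayers = assignLayers(
  ["import", "denoise", "align", "tree", "diversity"],
  [
    ["import", "denoise"],
    ["denoise", "align"],
    ["align", "tree"],
    ["denoise", "diversity"],
    ["tree", "diversity"],
  ]
);
```

A grid layout has no notion of these layers, which is one reason edges can end up crossing nodes; a layered layout would place each layer on its own row (or column) and then order nodes within it.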

It may also be worth adjusting the edges to be normal bezier curves instead of using the current "segments" style (though this might be worth a separate issue):

https://github.com/qiime2/q2view/blob/d149b6dc18d3b45db927e22298d09ed5f99ac435/app/pages/Provenance/dag.jsx#L39-L46
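For reference, a bezier edge style in Cytoscape.js's JSON stylesheet format could look roughly like the sketch below. The selector and arrow shape are illustrative assumptions and are not copied from q2view's dag.jsx.

```javascript
// One stylesheet entry; "curve-style" and "target-arrow-shape" are standard
// Cytoscape.js edge style properties. The values chosen here are assumptions.
const edgeStyle = {
  selector: "edge",
  style: {
    "curve-style": "bezier",
    "target-arrow-shape": "triangle",
  },
};
```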

Comments

I tried to implement this in a clone of q2view a while back, but I couldn't figure out how to get Dagre to play nicely with React. (Granted, I don't have a lot of experience with React.) I'm sure there's a way to get that working, though...

References

List of Cytoscape.js built-in layouts (includes grid, the currently-used layout)
List of Cytoscape.js layout extensions

Thanks!

ebolyen commented 5 years ago

Yep! This is definitely a crusty part of q2view. One of the things I struggled to get working well was the fact that we can have multiple nested nodes, which complicate the cross-over problem.

Another layout engine worth looking at is ELK: https://github.com/jfstephe/cytoscape.js-elk

But all of this is made a lot more complicated by the fact we actually have nested provenance that we can't visualize, so we may be beyond the abilities of cytoscape here as we'd want to be able to expand and collapse sub-dags without perturbing the layout too much.

fedarko commented 5 years ago

> But all of this is made a lot more complicated by the fact we actually have nested provenance that we can't visualize, so we may be beyond the abilities of cytoscape here as we'd want to be able to expand and collapse sub-dags without perturbing the layout too much.

I'm a bit confused by this, sorry—I understand the "multiple nested nodes" thing (a single action produces multiple outputs, e.g. qiime dada2 denoise-single), but I'm not so sure I get what you mean by nested provenance or sub-dags. Do you mean something like the definition of sub-dags here, where a bunch of things at the same level can be collapsed/uncollapsed?

I've had to spend a decent amount of time messing with collapsing/uncollapsing compound nodes in a previous Cytoscape.js project—it was doable (at least for the applications I needed), but could be a pain. If I have some time in the future I would be happy to come back and help with this.

ebolyen commented 5 years ago

Not quite as linked, although that would be super handy!

Rather, each pipeline action has within it the provenance of the sub-actions (recursively nested in the case of pipelines of pipelines). The way it works is actually super simple: each output of a pipeline gets a new UUID with an alias-of property which points to the "original" result.

The alias is what we see in q2view right now, and it lets us store the pipeline action (like core-metrics), which looks super normal. But if we were to follow the alias-of UUID, we would have the action that was run within the pipeline, and we could trace that backwards until we hit either an input to the pipeline or an import. You can do this by hand right now; we just can't draw it.

So basically, you could "open up" a core-metrics-phylogenetic box, and see all of the alpha/beta/pcoa/emperor actions within it, ideally linked to some "ports" for the outer pipeline.
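The alias-following trace described above can be sketched as below. The record shape (`action`, `inputs`, `aliasOf`), the UUID strings, and the `innerActions` helper are all assumptions made for illustration; they are not the actual QIIME 2 archive schema, and a nested pipeline inside the trace would show up here as a single action rather than being opened recursively.

```javascript
// Hypothetical in-memory provenance records keyed by result UUID.
const records = new Map([
  ["seqs", { action: "import", inputs: [] }],
  ["inner-dm", { action: "beta", inputs: ["seqs"] }],
  ["inner-pcoa", { action: "pcoa", inputs: ["inner-dm"] }],
  // The pipeline output: a new UUID whose aliasOf points at the "original".
  ["outer-pcoa", { action: "core-metrics", inputs: ["seqs"], aliasOf: "inner-pcoa" }],
]);

// List the actions that ran inside the pipeline that produced `aliasUuid`,
// walking backwards from the aliased result until reaching an input of the
// outer pipeline (or a result with no inputs, such as an import).
function innerActions(aliasUuid, table) {
  const outer = table.get(aliasUuid);
  if (!outer || !outer.aliasOf) return [];
  const boundary = new Set(outer.inputs);
  const seen = new Set();
  const actions = [];
  const stack = [outer.aliasOf];
  while (stack.length > 0) {
    const uuid = stack.pop();
    if (seen.has(uuid) || boundary.has(uuid)) continue;
    seen.add(uuid);
    const record = table.get(uuid);
    if (!record) continue;
    actions.push(record.action);
    stack.push(...record.inputs);
  }
  return actions;
}

const inside = innerActions("outer-pcoa", records);
```

The boundary set plays the role of the "ports": anything in it belongs to the outer DAG and is where the inner trace stops.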

I've looked at joint.js in the past for trying to handle this. It has a nice port system, but it didn't look quite as simple to integrate as Cytoscape.js was.

fedarko commented 5 years ago

Oh! I had no idea that was how pipelines worked in Q2, but that's actually super cool! Color me impressed :)

I think I understand what you mean now by nested provenance. I drew out a mock of this for the demo q2view ordination's align_to_tree_mafft_fasttree pipeline action, mostly to help myself understand it. Does this seem sort of like what you'd imagined?

[Image: nested_provenance_diagram]

When the user would somehow "uncollapse" a pipeline action (here represented as a diamond), the stuff inside the corresponding action's diamond would get added to the provenance graph, essentially extending the path between the DADA2 sequences and the phylogeny. Ideally this'd be toggleable. (And yeah, you could have multiple layers of this if you have pipelines that call other pipelines like you mentioned.)

[Image: provenance_diagram_2]

This is actually a super similar problem to the "structural pattern" collapsing I've been working on for MetagenomeScope, which will have multiple layers of collapsible/uncollapsible node groups in the future. We use Graphviz' "dot" layout program to precompute the layouts in advance there; that might be overkill for q2view, but it would be a way of ensuring that the provenance graph is drawn so that there's room to fully expand all of the pipeline actions.

I know you can recompute the layout in Cytoscape.js via something like cy.layout().run(), so if q2view ends up using elk/dagre/klay/etc. then one solution might just be redoing layout whenever the user collapses/uncollapses a pipeline... of course, that'd undo any customizations the user has previously made to the graph's layout, so that's not an ideal solution.
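One possible way around that drawback, sketched under assumptions: remember which nodes the user has dragged, rerun the layout, and then keep the user's positions for those nodes. The `mergePositions` helper and the plain `{ nodeId: { x, y } }` shape are hypothetical; applying the result (for example through Cytoscape.js's built-in preset layout) is left out here.

```javascript
// Merge freshly computed layout positions with positions the user set by hand.
// Both arguments are plain objects of the form { nodeId: { x, y } }.
// Nodes that exist only in `userMoved` (e.g. now collapsed) are dropped.
function mergePositions(computed, userMoved) {
  const merged = {};
  for (const [id, pos] of Object.entries(computed)) {
    const pinned = Object.prototype.hasOwnProperty.call(userMoved, id);
    merged[id] = pinned ? { ...userMoved[id] } : { ...pos };
  }
  return merged;
}

const mergedExample = mergePositions(
  { a: { x: 0, y: 0 }, b: { x: 0, y: 100 }, c: { x: 0, y: 200 } },
  { b: { x: 250, y: 80 } }
);
```

This keeps hand-placed nodes where they are, though newly revealed nodes may still overlap them, so it is only a partial fix.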

I'll mull this over for a while—this is a really neat idea, and though I can't commit bandwidth to it right now I think this would be a cool addition to q2view. Thank you for taking the time to discuss this with me!

ebolyen commented 5 years ago

You've got it! The only (very minor) detail is that there are technically "two" rooted_trees in the provenance: the first records the root-at-midpoint action, and the "alias" records the pipeline itself (as its own entity with inputs and params). It's this second one that has the alias-of attribute pointing at the first, so we can see both actions. (The input sequences are indeed identical, however; we only need this extra tracking on the outputs.)

Yeah, it's precisely the problem that MetagenomeScope has, where you want to "toggle" the level of abstraction.

I don't think precomputing will work for our case, but something a bit easier I have been thinking about is that we could "full-screen" the sub-dags, with the ports still visible so you know each one is part of the overall provenance. There could even be an easing animation to make it super obvious. If we were to do that, we wouldn't ever be able to see the "full" diagram, but it wouldn't need anything special from the layout engine (other than some new view state to keep track of which DAG we are drawing, and whatever changes have been made per DAG).
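The "new view state" could be as small as the sketch below: a stack of opened pipelines plus saved node positions per DAG. The class name, method names, and DAG ids are hypothetical and not part of q2view.

```javascript
// Tracks which DAG is currently drawn and the manual changes made per DAG.
class ProvenanceViewState {
  constructor(rootId) {
    this.path = [rootId]; // path[0] is the top-level provenance DAG
    this.positions = new Map(); // dagId -> { nodeId: { x, y } }
  }

  // Id of the DAG currently shown "full-screen".
  get current() {
    return this.path[this.path.length - 1];
  }

  // Drill into a pipeline action's sub-DAG.
  open(pipelineId) {
    this.path.push(pipelineId);
  }

  // Go back out one level; the top-level DAG cannot be closed.
  close() {
    if (this.path.length > 1) this.path.pop();
  }

  // Remember a node position for the DAG currently shown.
  savePosition(nodeId, pos) {
    const saved = this.positions.get(this.current) || {};
    saved[nodeId] = { ...pos };
    this.positions.set(this.current, saved);
  }

  // Saved positions for a DAG (defaults to the one currently shown).
  positionsFor(dagId = this.current) {
    return this.positions.get(dagId) || {};
  }
}
```

Because positions are keyed by DAG, opening and closing a pipeline never touches the layout the user arranged one level up.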

That said, the ELK layout engine does seem to know how to handle nested DAGs.

nbokulich commented 3 years ago

just rediscovered this issue... was about to post a separate issue about collapsing branches of a provenance graph, but this discussion seems to cover this idea.

Provenance graphs are getting more complex — large meta-analyses, more complex methodology, and some plugins (e.g., RESCRIPt) lead to more detailed, but also busier provenance. It would be awesome to manually collapse/expand branches of a provenance graph, e.g., by clicking on a node to collapse everything upstream.
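As a rough sketch of the "collapse everything upstream" idea, the set of nodes to hide can be found by walking edges backwards from the clicked node. The `upstreamOf` helper and the edge-list format are assumptions for illustration.

```javascript
// Return the set of node ids that are upstream of (ancestors of) `nodeId`.
// `edges` is an array of [source, target] pairs.
function upstreamOf(nodeId, edges) {
  const parents = new Map();
  for (const [source, target] of edges) {
    if (!parents.has(target)) parents.set(target, []);
    parents.get(target).push(source);
  }

  const upstream = new Set();
  const stack = [...(parents.get(nodeId) || [])];
  while (stack.length > 0) {
    const id = stack.pop();
    if (upstream.has(id)) continue;
    upstream.add(id);
    stack.push(...(parents.get(id) || []));
  }
  return upstream;
}

// Hypothetical graph: a -> b -> c, and a -> d.
const hidden = upstreamOf("c", [["a", "b"], ["b", "c"], ["a", "d"]]);
```

One caveat with this approach: an upstream node may also feed branches that do not pass through the clicked node (like `a` feeding `d` above), so hiding it outright would leave those branches without their source.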

Should I open a separate issue to track this, or does this fit into this issue?

qiime2 / q2view-old

Using a different starting layout for the provenance graph #104