explorable-viz / fluid

Data-linked visualisations
http://f.luid.org
MIT License
32 stars 2 forks source link

Discrepancy between naive and direct implementations of fwd slicing #896

Closed rolyp closed 6 months ago

rolyp commented 7 months ago

Naive fwd agrees with direct fwd in all except graphics cases. This test runs fwd with the result of bwd, which for the graphics tests will be the empty selection; so actually it would seem that the two fwd implementations differ even on empty input (where the fwd slice should be empty). A quick investigation reveals (as suspected) that the issue is when inputs are copied straight through to the output, so I’m hoping we can relatively easily create a smaller test case to illustrate.

Done/dropped:

See also:

rolyp commented 7 months ago

One important observation is that trace-based and graph-based slicing will in general not agree on the environment part of the slice because of sharing. For example, in the prelude we have iterate, which uses on map. If we run the slicing/map test case, we compute a slice of map which refers to specific vertices which are also shared into the closure for iterate. I think they will always agree if we wind all the way back to a program slice, since (for trace bwd) that will have the effect of “sharing” slicing information from different closures. But if we consider an input where some subtrees are shared, i.e. environments, then graph-based slicing will identify usage on shared vertices and if we map these vertex selections back to subtree selections then this will effectively publish the usage information to all copies of those trees.

This probably serves some discussion in the paper, because actually much of the source of the efficiency of the graph-based approach is the fact that we don’t have to merge environments (which is exponentially expensive if treated naively in the trace-based approach).

Perhaps one could merge all the usage information associated with every function in γ to yield a γ’ $\geq$ γ and then assert that graph slicing produces γ’, but I’m not sure that’s possible to do unambiguously using only information about function names (i.e. without doing a full backward evaluation of the original program from which the environment was created).