explorable-viz / fluid

Data-linked visualisations
http://f.luid.org
MIT License

Matrix convolution #272

Closed rolyp closed 3 years ago

rolyp commented 4 years ago

Another nice example that might be a good precursor to #271 and driver for #121 and #268 is calculating the convolution of an image by a kernel. (Also, see this distill.pub article on computing receptive fields in CNNs. Feels a bit like Galois slicing.)

@min-nguyen Ok, everything’s finally in a state where we can come back to this. It should be relatively straightforward now to turn fluid/experment/convolution.fld into working code. Some steps:

We can come back to the idea of inspecting values in the environment later. The prettyprinter is in reasonable shape now; it doesn’t print parentheses in the right place, but I suggest we don’t worry too much about that now.

rolyp commented 4 years ago

@tpetricek @jamescheney Something to discuss next week..

What I propose is extending our formalism and implementation so that we can use matrix convolution as an example and driver for a new feature called expression provenance. This is an idea from the data provenance literature which we haven’t worked out in the Galois setting.

The need for expression provenance is illustrated in the following figure. The user moves their mouse over some part of the right-hand chart, and relevant parts of the left-hand chart are identified. (Extensional) linking reveals that there is a relationship between the value on the right and the values on the left, but leaves the nature of that relationship obscure:

[Image: linking-1]

In this case the relationship is “summation”. An important aside: in general the relationship isn’t this direct, but is mediated through a third thing. For example, the value highlighted in one chart might be the sum of some numbers whose mean is highlighted in another chart, while those numbers, which are summed in one chart and averaged in the other, have no graphical representation in either of the two charts. But even in this more general case, the relationship between the two charts can still be understood in terms of two pieces of “expression provenance”, one read forwards and one backwards.

That’s the background motivation for expression provenance. Now back to convolution. Convolution of a matrix with a kernel/filter is a nice example for several reasons.

Hopefully the next picture will elucidate a bit. The “filter” below specifies how to compute each element of the output as a weighted sum of the values of neighbouring input cells. (The particular filter here is a Laplace filter, often used in edge detection.)

[Image: convolution-2]

The idea is that our system should be able to “explain” how the -3 highlighted in gray (with the red box) is computed. The zeros in the filter specify that the NE, SE, SW and NW neighbours are disregarded; the N, S, E and W neighbours are included with weight 1, and the cell itself is included with weight -4. So the (eventual) user experience would be:

So the (eventual) user experience would be:

a) the user hovers over -3 in the output;
b) “needed” input cells are highlighted in grey, as shown above;
c) the system somehow visually conveys that the expression “-4 + 1” (simplified using identities like 1·x = x and 0·x = 0) was evaluated to produce -3, perhaps by referring to cells graphically (as in Excel).
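To make the weighted-sum computation concrete, here is a minimal Python sketch of how one output cell is computed with the Laplace filter, assuming a wrapping boundary (the function name `cell_convolution` and the example values are illustrative, not part of fluid):

```python
import numpy as np

# The Laplace filter from the figure: N, S, E, W neighbours have
# weight 1, the cell itself has weight -4, and corners have weight 0.
laplace = np.array([[0,  1, 0],
                    [1, -4, 1],
                    [0,  1, 0]])

def cell_convolution(image, kernel, i, j):
    """Weighted sum of the 3x3 neighbourhood of cell (i, j),
    wrapping at the boundaries (indices taken modulo the size)."""
    n, m = image.shape
    total = 0
    for di in (-1, 0, 1):
        for dj in (-1, 0, 1):
            w = kernel[di + 1, dj + 1]
            if w != 0:  # zero-weight neighbours are disregarded
                total += w * image[(i + di) % n, (j + dj) % m]
    return total
```

Note that for a symmetric kernel like the Laplace filter, this “correlation” form coincides with true convolution (which flips the kernel).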

There are some nice things that slicing can reveal. For example, if the input matrix is considered to “wrap” at the boundaries, then this is visible from the consumed part of the input:

[Image: convolution]

Alternatively, if the input matrix is not considered to wrap (and, say, default values are supplied for neighbours that lie outside the boundary), then parts of the filter will be disregarded:

[Image: convolution-3]

In other words, the user can observe some fairly important aspects of the convolution algorithm without even having to look at the code, and with expression provenance as well, should be able to see just enough of the logic to understand fully what’s going on. Note that the two boundary behaviours just described are standard options in common convolution libraries.
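The two boundary behaviours can be made concrete with a small Python sketch that computes which input cells are “needed” for a given output cell: with wrapping, every nonzero filter weight reaches some input cell, whereas without it, out-of-bounds neighbours (and the corresponding filter weights) are disregarded. The function `needed_cells` is hypothetical, just to illustrate the backward-slicing idea:

```python
def needed_cells(shape, i, j, kernel, wrap=True):
    """Input cells needed to compute output cell (i, j) under a
    centred kernel. With wrap=True, indices wrap modulo the image
    size; with wrap=False, out-of-bounds neighbours are dropped."""
    n, m = shape
    kh, kw = len(kernel), len(kernel[0])
    oh, ow = kh // 2, kw // 2
    cells = set()
    for di in range(-oh, oh + 1):
        for dj in range(-ow, ow + 1):
            if kernel[di + oh][dj + ow] == 0:
                continue  # zero weight: neighbour never consumed
            ii, jj = i + di, j + dj
            if wrap:
                cells.add((ii % n, jj % m))
            elif 0 <= ii < n and 0 <= jj < m:
                cells.add((ii, jj))
    return cells
```

At a corner cell of a 3×3 image, the wrapping version consumes five input cells (some on the far edges), while the non-wrapping version consumes only three: the filter weights for the out-of-bounds neighbours are exactly the disregarded parts of the filter.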

Concretely, the steps would include:

1) Extending the core language and implementation with a native array/2D matrix type. We could certainly use lists of lists, but for various reasons a native array would be better. James worked out slicing rules for arrays for our ICFP 2017 paper, so it should just be a question of adapting those to our setting.

2) Formalising and implementing “expression provenance”: essentially a function from an evaluation trace to a semantically equivalent expression that disregards information about conditionals and higher-order functions. Basically, you throw away control flow and end up with an expression that you could evaluate to produce the same result (in this case the number -3). To make these expressions readable, we simplify them using algebraic properties of the domain. We will also need whatever metatheory is required to relate this to Galois slicing.
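As a toy illustration of the intended simplification (not the planned implementation, which operates on evaluation traces), one can build the weighted-sum expression for an output cell and apply 0·x = 0 (drop the term) and 1·x = x (drop the coefficient):

```python
def cell_expression(image, kernel, i, j):
    """Hypothetical 'expression provenance' for one output cell:
    the weighted-sum expression as a string, simplified using
    0 * x = 0 (term dropped) and 1 * x = x (coefficient dropped).
    Boundaries wrap, as in the first convolution figure."""
    n, m = len(image), len(image[0])
    terms = []
    for di in (-1, 0, 1):
        for dj in (-1, 0, 1):
            w = kernel[di + 1][dj + 1]
            if w == 0:
                continue  # 0 * x = 0: the neighbour is disregarded
            v = image[(i + di) % n][(j + dj) % m]
            terms.append(str(v) if w == 1 else f"{w}*{v}")
    return " + ".join(terms)
```

For the centre cell of the illustrative image [[0, 1, 2], [3, 4, 5], [6, 7, 8]] with the Laplace filter, this yields the expression "1 + 3 + -4*4 + 5 + 7", which evaluates to 0, the same value the convolution produces for that cell.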

3) Extending the visualisation library with a native table/matrix view. Again, for performance and visual elegance a native implementation would be much preferable to implementing tables using our existing graphics primitives. The table view will provide the cell highlighting shown above.

4) Implementing the convolution code in our language, with a couple of options for dealing with boundaries (e.g. wrapping vs. default values). SciPy’s ndimage convolve function (https://docs.scipy.org/doc/scipy-0.16.1/reference/generated/scipy.ndimage.filters.convolve.html) would be a good reference point.
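For reference, the convolution with both boundary options might be sketched in plain NumPy as follows (the mode names "wrap" and "default" are made up for illustration; SciPy's own mode names differ):

```python
import numpy as np

def convolve2d(image, kernel, mode="wrap", default=0.0):
    """2D weighted-sum convolution with two boundary options:
    'wrap' takes indices modulo the image size; any other mode
    substitutes `default` for out-of-bounds neighbours. For a
    symmetric kernel (like the Laplace filter) this correlation
    form coincides with true convolution, which flips the kernel."""
    n, m = image.shape
    kh, kw = kernel.shape
    oh, ow = kh // 2, kw // 2
    out = np.zeros((n, m))
    for i in range(n):
        for j in range(m):
            acc = 0.0
            for di in range(-oh, oh + 1):
                for dj in range(-ow, ow + 1):
                    ii, jj = i + di, j + dj
                    if mode == "wrap":
                        v = image[ii % n, jj % m]
                    elif 0 <= ii < n and 0 <= jj < m:
                        v = image[ii, jj]
                    else:
                        v = default  # out of bounds: default value
                    acc += kernel[di + oh, dj + ow] * v
            out[i, j] = acc
    return out
```

Interior cells are unaffected by the choice of mode; the two modes differ only at the boundary, which is exactly what the slicing figures above make visible.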

rolyp commented 3 years ago

@min-nguyen I’ve looked a bit more at the convolution examples. I suggest slowing down a bit and getting one example working first.

rolyp commented 3 years ago

Done! (Little bit later than planned, but it actually works..)