ludwigschubert opened this issue 6 years ago
Hello @ludwigschubert, sorry for the late response. The team is pretty occupied these weeks. Indeed there is a distribution over the weights of the neural net. To differentiate a Bayesian from a regular NN, the diagram would ideally show that distribution, but we have found it slow to repaint/update the distribution frequently.

The performance issue is also tied to the framework we are using. ConvNetJS is old and we had to do a lot of modding to make it work. We'd love to have a more modern framework with things like autodiff, user-defined loss functions, and GPU acceleration. If you could recommend one, that would be great.

We have had some ideas on how to improve this diagram that had to be shelved due to performance constraints, such as adding little distributions to the NN schema figure. Currently the colour of each connection encodes its variance and the thickness its mean, but this is confusing to look at; we'll try to make it look like figure 1 of https://arxiv.org/pdf/1505.05424.pdf.
No worries about response timeframes; we're all trying our best.
In terms of JavaScript ML frameworks, we've had some success with tensorflow.js.
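In case it helps with the autodiff, user-defined loss, and GPU requirements you listed, here is a minimal, hedged sketch of what a custom loss plus gradient-based training looks like in tensorflow.js. It assumes the `@tensorflow/tfjs` package and uses toy data; all variable names are illustrative, not from your codebase.

```ts
import * as tf from '@tensorflow/tfjs';

// Toy 1-D regression data: y ≈ 2x + noise (illustrative only).
const xs = tf.randomNormal([64, 1]);
const ys = xs.mul(2).add(tf.randomNormal([64, 1], 0, 0.1));

// Weights are tf.variables, so any user-defined loss is differentiable via autodiff.
const w = tf.variable(tf.randomNormal([1, 1]));
const b = tf.variable(tf.zeros([1]));

const predict = (x: tf.Tensor) => x.matMul(w).add(b);

// User-defined loss: MSE plus an L2 penalty (a stand-in for e.g. a KL term in a BNN).
const loss = () =>
  tf.losses.meanSquaredError(ys, predict(xs))
    .add(w.square().sum().mul(1e-3)) as tf.Scalar;

const optimizer = tf.train.sgd(0.1);
for (let step = 0; step < 200; step++) {
  // minimize() evaluates the loss, backpropagates, and updates w and b;
  // the WebGL backend provides GPU acceleration where available.
  optimizer.minimize(loss);
}
```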
Performance in web technologies can be tricky to get right. See if a different ML framework helps, and also feel free to reach out to us with a non-performant diagram in a branch. I can't promise we can help, but in the past we sometimes could. :-)
@ludwigschubert We've been thinking about your helpful suggestions. We are going to try switching to tensorflow.js, shrinking the net, and double-encoding the weights. One strategy we were thinking of to represent uncertainty over each weight was blurring each line in proportion to the uncertainty.
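For what it's worth, here is a hedged sketch of how that blur could be wired up if the connections are SVG lines. The element handles and scaling constants are assumptions for illustration, not your actual code.

```ts
// Blur each connection in proportion to the uncertainty of its weight.
// Assumes one <line> per connection and a per-connection <filter> containing a feGaussianBlur.
function renderConnection(
  line: SVGLineElement,
  blur: SVGFEGaussianBlurElement,
  mean: number,
  std: number
) {
  // Double-encode: thickness for the magnitude of the mean, blur for the uncertainty.
  line.setAttribute('stroke-width', String(1 + 4 * Math.abs(mean)));
  blur.setAttribute('stdDeviation', String(3 * std)); // more uncertain => blurrier line
}
```

One caveat: per-element SVG filters can themselves be slow to repaint, so it may be worth profiling this against an opacity-based fallback given the performance constraints you mentioned.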
We are also thinking of comparing against standard NNs, as you suggested, by having two panes: the left pane would show a standard point estimate of a neural network with a single function through the data, and the right pane would show the BNN posterior. Do you think it makes sense to use half the space on the baseline method?
One point I was hoping you could clarify: you said "Can we think of them as ten different networks? If so, show them as small multiples." Yes, we can think of each sample from the posterior as a different network. What did you mean by "small multiples"?
If the comparison is important, feel free to use half the space for it! If focussing on the BNN posterior is also important, break it up into two diagrams. I always believe in introducing a concept/visual first, and then using it in more complex arrangements.
Small multiples simply means an aligned row or grid of similar visualizations to allow comparison. Here's an example:
In the case of the NN weights, the samples from the posterior seemed like a natural choice to show; it is much easier to show individual samples than a whole distribution.
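To make "small multiples" concrete for this diagram, here is a rough sketch of laying the posterior samples out as an aligned row of small SVG panels. It assumes each sampled network can be rendered by some existing drawing routine; `drawSample` is a hypothetical callback standing in for that code.

```ts
// Lay out one small panel per posterior sample so the networks can be compared side by side.
// `drawSample` is a hypothetical callback that renders one sampled network into a group element.
function smallMultiples(
  container: SVGSVGElement,
  numSamples: number,
  drawSample: (g: SVGGElement, sampleIndex: number) => void,
  panelSize = 120
) {
  const SVG_NS = 'http://www.w3.org/2000/svg';
  for (let i = 0; i < numSamples; i++) {
    const g = document.createElementNS(SVG_NS, 'g');
    g.setAttribute('transform', `translate(${i * panelSize}, 0)`); // aligned row
    container.appendChild(g);
    drawSample(g, i);
  }
}
```

Each panel could reuse your existing NN schema drawing code with one set of sampled weights substituted in.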
I really like your hero diagram concept! Here are some ideas for making it easier to understand what's going on:
NN weights
Use `stroke-dashoffset`, `stroke-width`, and `opacity` to help differentiate between weights. You can also experiment with using an additional color scale (see the sketch after these suggestions).

Combine NN diagram and plot
Distill's current article layout (which we'll help you implement as a separate step) will allow you to bring your title above the distributions and combine the NN diagram with the distributions and some explanatory text.
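Picking up the NN-weights suggestion above, here is a hedged sketch of what encoding a weight's statistics in `stroke-width`, `opacity`, and `stroke-dashoffset` could look like. The scaling constants and the colour pair are placeholders for whatever scales you end up choosing.

```ts
// Encode a connection's weight statistics in several redundant visual channels.
// Assumes each connection is an SVG <line>; the constants below are illustrative.
function styleConnection(line: SVGLineElement, mean: number, std: number, maxStd: number) {
  const uncertainty = Math.min(std / maxStd, 1);                     // normalise to [0, 1]
  line.setAttribute('stroke-width', String(1 + 4 * Math.abs(mean))); // thickness: |mean|
  line.setAttribute('opacity', String(1 - 0.7 * uncertainty));       // uncertain weights fade out
  line.setAttribute('stroke-dasharray', '6 3');
  line.setAttribute('stroke-dashoffset', String(9 * uncertainty));   // could also be animated
  // Additional colour scale: sign of the mean on a diverging pair.
  line.setAttribute('stroke', mean >= 0 ? '#4477aa' : '#cc6677');
}
```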
We are happy to send some concrete design proposals, but I am currently unsure whether there are more parameters that could be interesting to reveal to readers in such a combined hero diagram. Those parameters could be tweakable aspects of the optimization process, statistics over the optimization, or letting users set some weights manually and observe the outcome… we rely on your insight here to decide what may help the story. As a starting point, you mention that Bayesian NNs "tell us how uncertain our predictions are". Is there a way to show that uncertainty in the hero diagram?
Another framing: help me immediately see at least some difference between Bayesian NNs and vanilla FC NNs in your diagram.