Open thomasahle opened 9 months ago
Hi Thomas, thanks for suggesting a contribution. I'm interested, but have a few questions about this topic and the notation. Mainly:
Oh, I just saw what you wrote at the end. Correct, the notation used here is not standard in the tensor network field (meaning the one in quantum physics and in applied math), though papers in that field do fairly often introduce non-standard notations as long as they are clearly defined. Happy to discuss more.
However, to actually derive the tensors using the chain rule, I think you need to show the function application as well, which is why I added it to my notation. If you know of, or can think of, a better way to do this, I would consider it a great win!
I have a bunch more examples of derivations using this notation here: TensorDerivatives.pdf, though it is not well documented at this point.
I see, interesting. Ok I'm convinced then that this material does fit with the site. Here are some requests about the writeup:
Lastly, you might like this recent article by some people in my field. I'm sure it's rediscovering some things in the more introductory part of the article, but by the end they pull off some impressive calculations. I think the notation there is related, but with thick lines representing plugging in continuous variables instead of lines with arrows at the end: https://journals.aps.org/prresearch/abstract/10.1103/PhysRevResearch.5.013156
Oh, and no, there isn't an article about this topic yet, so you should start one from scratch. Feel free to make a new section of the site, though we could discuss how to organize it.
I'm open to however you want to make the figures. Usually I make mine in the Keynote presentation software, using a border size of 4 pixels for the shapes and lines, and then I just take screenshots to make the images. Primitive, I know, but I thought I'd share that. I'm hoping that in the future there will be some tensor diagramming software that also generates high-quality images as output.
One challenge I'm having is what notation to use for function application. In the above diagram I used arrows along tensor dimensions (instead of simple edges), but sometimes you may want to take a function of a scalar, like the division in softmax: softmax(x) = exp(x)/sum(exp(x)). This is causing me trouble, because a tensor graph that represents a scalar doesn't have any free edges. I could just put an arrow coming out of some arbitrary node, but that seems confusing. I could also put a circle around the graph and have an arrow coming out of that. Any other ideas?
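To make the scalar issue concrete, here is a minimal sketch in plain Python. The function name and the comments about "free edges" are only an illustration of the problem described above, not part of any proposed notation.

```python
import math

def softmax(x):
    # exp(x) is still a vector: in diagram terms it keeps one free edge.
    exps = [math.exp(v) for v in x]
    # sum(exp(x)) contracts that edge away, so the result is a scalar
    # with no free edge left to attach a function-application arrow to.
    total = sum(exps)
    # The division is a function of that scalar, combined with exp(x).
    return [e / total for e in exps]

probs = softmax([1.0, 2.0, 3.0])
```

The intermediate `total` is exactly the object that has no free edge in the diagram, yet the reciprocal still has to be applied to it.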
I would like to write an article about tensor derivatives, such as this derivation of the Hessian chain rule:
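For reference, the Hessian chain rule in ordinary index notation can be written as follows. This is a sketch under assumptions: the symbols are chosen here for illustration, with h(x) = f(g(x)), g mapping R^n to R^m and f mapping R^m to R, and they need not match the diagram notation discussed in this thread.

```latex
\frac{\partial^2 h}{\partial x_i \, \partial x_j}
  = \sum_{k,l} \frac{\partial^2 f}{\partial g_k \, \partial g_l}
      \frac{\partial g_k}{\partial x_i} \frac{\partial g_l}{\partial x_j}
  + \sum_{k} \frac{\partial f}{\partial g_k}
      \frac{\partial^2 g_k}{\partial x_i \, \partial x_j}
```

The first term contracts the Hessian of f with two copies of the Jacobian of g; the second contracts the gradient of f with the Hessians of the components of g.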
Is there already an article about this that I should contribute to, or should I start one from scratch? Also, I'm not sure whether my notation is standard: x -> f for function application, f(x), and -- for tensor contraction, as in A -- x. If there's a better notation, I can switch it out.
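As a rough illustration of how the two symbols combine, the sketch below reads A -- x as a contraction over the shared index and (A -- x) -> f as applying f to the result. The helper names `contract` and `squared_norm` are hypothetical, and the choice of f is an assumption made only for this example.

```python
def contract(A, x):
    """A -- x: sum over the shared index, leaving one free edge (a vector)."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]

def squared_norm(v):
    """A hypothetical f: consumes the whole vector and returns a scalar."""
    return sum(v_i * v_i for v_i in v)

A = [[1.0, 2.0], [3.0, 4.0]]
x = [1.0, -1.0]

Ax = contract(A, x)      # A -- x
y = squared_norm(Ax)     # (A -- x) -> f, i.e. f(A x)
```

Here the contraction removes the edge shared by A and x, and the arrow marks the remaining free edge as the input to f.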