Closed by srush 3 years ago
Oh, I kind of liked it better this way, but if you want to change it back, I think you could write dU/dX_{seq(i),kern(j),seq*(k)} = \delta(k,i+j-1) and it would serve the same purpose as C.
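For anyone following along, here is a rough sketch of what that derivative rule says, checked numerically. It assumes U is the "unfolded" input, U_{i,j} = X_{i+j-1}, and it uses 0-based indices, so the rule reads \delta(k, i+j) in code. The function names (`unfold`, `unfold_jacobian`, `numeric_jacobian`) are made up for illustration and are not from the project.

```python
def delta(a, b):
    """Kronecker delta: 1 when the indices match, else 0."""
    return 1 if a == b else 0


def unfold(x, kernel_size):
    """U[i][j] = x[i + j] (0-based): the windows a 1-D kernel slides over."""
    n_out = len(x) - kernel_size + 1
    return [[x[i + j] for j in range(kernel_size)] for i in range(n_out)]


def unfold_jacobian(seq_len, kernel_size):
    """dU[i][j]/dx[k] = delta(k, i + j), the 0-based form of delta(k, i+j-1)."""
    n_out = seq_len - kernel_size + 1
    return [[[delta(k, i + j) for k in range(seq_len)]
             for j in range(kernel_size)]
            for i in range(n_out)]


def numeric_jacobian(x, kernel_size, eps=1.0):
    """Finite differences; exact here because unfold is linear in x."""
    base = unfold(x, kernel_size)
    n_out = len(base)
    jac = [[[0.0] * len(x) for _ in range(kernel_size)] for _ in range(n_out)]
    for k in range(len(x)):
        bumped = list(x)
        bumped[k] += eps  # perturb one input position at a time
        moved = unfold(bumped, kernel_size)
        for i in range(n_out):
            for j in range(kernel_size):
                jac[i][j][k] = (moved[i][j] - base[i][j]) / eps
    return jac


x = [1.0, 2.0, 3.0, 4.0, 5.0]
matches = numeric_jacobian(x, 3) == unfold_jacobian(len(x), 3)
```

If the rule is right, `matches` should be true: the only input position that affects U[i][j] is k = i + j.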
Somewhat related, you don’t think it would be helpful to define \star for cross-correlation and \ast for convolution?
Like this:
This looks good to me. It captures the spirit of the delta contraction while still allowing direct indexing on the forward pass.
I noticed that conv1d now uses the "index map" style with the \delta function. While this is super convenient for the derivative, I think it makes the standard function a bit harder to read. Could the function be defined in the old style, with the index map exposed only as part of a derivative rule?
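To make the readability trade-off concrete, here is a minimal sketch of the two styles side by side. It assumes a single-channel, no-padding, stride-1 conv1d written as a cross-correlation (no kernel flip) with 0-based indices; `conv1d_direct` and `conv1d_index_map` are illustrative names, not the ones used in the project.

```python
def conv1d_direct(x, w):
    """Old-style forward pass: Y[i] = sum_j x[i + j] * w[j] (0-based)."""
    n_out = len(x) - len(w) + 1
    return [sum(x[i + j] * w[j] for j in range(len(w))) for i in range(n_out)]


def conv1d_index_map(x, w):
    """Index-map forward pass: Y[i] = sum_{j,k} delta(k, i + j) * x[k] * w[j]."""
    n_out = len(x) - len(w) + 1
    return [
        sum((1 if k == i + j else 0) * x[k] * w[j]
            for j in range(len(w)) for k in range(len(x)))
        for i in range(n_out)
    ]


x = [1, 2, 3, 4, 5]
w = [1, 0, -1]
y_direct = conv1d_direct(x, w)        # [-2, -2, -2]
y_index_map = conv1d_index_map(x, w)  # same values via the delta contraction
```

Both give the same output; the direct form reads like the usual definition, while the index-map form makes the \delta explicit so it can be reused in a derivative rule.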