Unroll/pool operations for defining convolution/max-pooling

davidweichiang commented 3 years ago

Both Conv and MaxPool defined a U tensor that had to be redefined separately for 1d, 2d, 3d, etc. and got increasingly complicated with more dimensions. This tries out a different way that generalizes more easily (I think) to higher dimensions. However, I'm unsure whether this still captures the spirit of the original definition by @srush.

srush commented 3 years ago

I'm a bit mixed on this notation. I like that it scales to higher dimensions, but, I am not sure if adding abstraction is clarifying or not?

One test case is that, you could also add a stride argument and you could make pool a special case of unroll. Would that make it more clear or less clear?

Minor: I don't love the underscript and square brackets on the pool. Could you do that with an intermediary var?

Another small thing is that underscripts on pool are much lower than on max. I wonder if there is a way to keep a fixed baseline?

davidweichiang commented 3 years ago

Minor: I don't love the underscript and square brackets on the pool. Could you do that with an intermediary var?

Yes, I think it could be hidden by writing |pool|=k in another equation.

Another small thing is that underscripts on pool are much lower than on max. I wonder if there is a way to keep a fixed baseline?

It's because of the descender on the p. Unfortunately, if you put \struts in to line everything up, it looks much worse. The only solution I can think of is to choose a function name that doesn't have a descender in it!

davidweichiang commented 3 years ago

How does that spacing look?

srush commented 3 years ago

Sorry this still looks the same? Was it a different version?

davidweichiang commented 3 years ago

Oh, it probably didn't rebuild the pdf. I'll do that now.

davidweichiang commented 3 years ago

My feeling is that if unroll can be defined clearly, then the abstraction is a win because, in Conv2d, two unrolls aren't much harder to understand than one, whereas the 2D version of U might require the reader to re-understand more.

I also feel that unroll would be more clear if defined using slice notation, which we've avoided up to now:

unroll{seq(i)} = X{seq(i:i+|kernel|-1)} (assuming slices are inclusive at both ends)

Similarly

pool{seq(i)} = X{seq(ik:ik+k-1)}

srush commented 3 years ago

That's interesting, I like the slice here. Although I would have to think more about how it plays with other things like derivatives.

I guess my question was why not go all the way to:

unroll{seq(i)} = X{seq(i \times stride:i \times stride k+|kernel|-1)}

Which works for both conv and pool.

davidweichiang commented 3 years ago

Sorry for the mistake, I should have written the LHS as [unroll X]_{seq(i)}.

The only problem with a stride parameter is that it would increase clustter; unroll would have two axis names under it, a stride, and an argument tensor.

davidweichiang commented 3 years ago

I will create a separate issue for slices.

srush commented 3 years ago

Maybe I am missing something, but this notation is what is bugging me:

Any downside to just having it be two lines with an intermediate Y variable?

srush commented 3 years ago

Also Unroll takes X with no parens and pool takes no args. I think we should just be consistent and always write these as standard functions.

davidweichiang commented 3 years ago

I don't love the intermediate variables but agree that it's less complicated to read. The lack of argument to pool was a mistake, now corrected!

namedtensor / notation

Unroll/pool operations for defining convolution/max-pooling #53