rotation equivariant MLP for 2d images

amoskalev commented 2 years ago

Hello Marc, thanks for your work!

Using your library, I want to implement a rotation equivariant MLP and apply it for roto-MNIST classification, where I rotate 2D digits and ravel them into a vector before feeding them into the network.

Can you give an advice of how to implement EMLP model for such a problem?

Would it be: model = nn.EMLP(repin,repout,G) with repin = 2828Scalar ; repout = 10*Scalar ; G=SO(2)?

Thanks.

honglu2875 commented 1 year ago

@amoskalev I don't think SO(2) has a well-defined action on images as pixel permutations. If you rotate a picture, some pixels will be lost (out of boundary) and it is no longer a group action. SO(2) does act on R^2 but this is a different story (which seems to be what the author's paper was addressing).

Also a question for the author @mfinzi, is there a reason why you only consider equivariant linear forms? Noether's finite generation of ring of invariants is telling us that if we want to get closer (in fact exact if we only consider algebraic functions) to universal approximators of equivariant functions, it is possible to also consider/compute higher order invariant functions if the group is not too wild (works for all groups considered in your paper). Computations for the full invariant polynomial basis might be impractical (exponential complexity), but maybe including some higher order ones helps?

mfinzi commented 1 year ago

Ah sorry guys, I missed this github issue. @amoskalev I believe it is possible to do this though a bit tricky for reasons that @honglu2875 brings up.

To avoid the boundary issue and have a well defined action, I think you would need to define the representation through bilinear interpolation with e.g. periodic boundary conditions (this way the pixels are not lost). As in \rho(g)v where v is the flattened image and the bilinearly interpolated rotation operation defines \rho(g). Bilinear interpolation does act linearly on the image, so you would just need to verify that with the appropriate boundary conditions the \rho(g) rotation operation is indeed invertible and composes as you would expect \rho(g)\rho(h) = \rho(gh) which I believe would be the case if constructed appropriately but might require some care. You would want to define the bilinear interpolation \rho implicitly using the LinearOperator class so as to save compute/memory. Maybe I can try to provide an example of how to do this, as I imagine it's something a number of people might want.

honglu2875 commented 1 year ago

yeah that's a solution for a lot of group actions on image grids. But maybe not for SO(2). If you can act SO(2) by permutation that means there is a map between SO(2) and S_{n*m}, but the latter is discrete and the former is a connected Lie group so the map has to be trivial.

mfinzi commented 1 year ago

And @honglu2875, in the framing of the paper the focus is on the linear equivariant maps, but in https://emlp.readthedocs.io/en/latest/notebooks/6multilinear_maps.html there are some examples with finding the space of multilinear equivariant maps, though as you mention doing so can quickly become computationally expensive for higher order functions. In my understanding though, the solution basis for equivariant multilinear maps e.g. the bilinear map V_1 \times V_2 \rightarrow V_3 (or written differently V_1 \rightarrow V_2 \rightarrow V_3) can be mapped over from the solution basis for the equivariant linear map (V_1 \otimes V_2) \rightarrow V_3.

In the EMLP network, we use a Bilinear layer so as to make it possible to form these higher order equivariant functions. If you could find an explicit example of an equivariant polynomial function that could not be expressed exactly in EMLP or a more general equivariant function that could not be approximated by EMLP, I would be interested.

mfinzi commented 1 year ago

And @honglu2875 , this wouldn't be a pure permutation representation so I'm not sure that logic applies. Rather because of bilinear interpolation, most rotations would not be permutations, only a select few like rotation by 90 degrees. Instead it's its own kind of unusual action on the image

honglu2875 commented 1 year ago

And @honglu2875 , this wouldn't be a pure permutation representation so I'm not sure that logic applies. Rather because of bilinear interpolation, most rotations would not be permutations, only a select few like rotation by 90 degrees. Instead it's its own kind of unusual action on the image

Ooooh I read too quickly. I saw periodic boundary conditions and thinking you just wrap pixels around. Yeah I agree some sort of interpolation seems to be able to introduce nontrivial representation.

honglu2875 commented 1 year ago

And @honglu2875, in the framing of the paper the focus is on the linear equivariant maps, but in https://emlp.readthedocs.io/en/latest/notebooks/6multilinear_maps.html there are some examples with finding the space of multilinear equivariant maps, though as you mention doing so can quickly become computationally expensive for higher order functions. In my understanding though, the solution basis for equivariant multilinear maps e.g. the bilinear map V_1 \times V_2 \rightarrow V_3 (or written differently V_1 \rightarrow V_2 \rightarrow V_3) can be mapped over from the solution basis for the equivariant linear map (V_1 \otimes V_2) \rightarrow V_3.

In the EMLP network, we use a Bilinear layer so as to make it possible to form these higher order equivariant functions. If you could find an explicit example of an equivariant polynomial function that could not be expressed exactly in EMLP or a more general equivariant function that could not be approximated by EMLP, I would be interested.

Oh this is very nice! Just to make sure I don't misunderstand, can I assume Bilinear layer is to go through invariant degree 2 polynomials? As we see, using multilinear basis is an expensive way to produce all generators of ring of invariants, but an area with a lot of open problems in invariant theory is to determine the "redundancy" (non-trivial algebraic relations). In other words, if there is an implementation of MLP using higher order invariant functions, even learning the function 0 is interesting (a set of learned coefficients corresponds to a relation). I had a discussion with a colleague working on invariant theory involving trinomial or higher degree functions and I need to check with him.

mfinzi / equivariant-MLP

rotation equivariant MLP for 2d images #19