Closed: dqii closed this issue 6 years ago
Not explicitly, but it is very easy to do. Let's say you have a stack of feature maps X of shape Nx4CxHxW, where C is the number of G-channels, and each G-channel has 4 planar orientation channels (change 4 to 8 if you include reflections). You can reshape X to NxCx4xHxW, and then sum (or max) over the length-4 axis. This corresponds to pooling over the cosets of the subgroup of rotations around the origin. You can also sum over the H and W axes to additionally get translation invariance.
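A minimal NumPy sketch of this coset pooling (the shapes and variable names are illustrative, not part of the repo's API; this assumes each G-channel's 4 orientation channels are stored contiguously, as described above):

```python
import numpy as np

# Illustrative sizes: N samples, C G-channels, each with 4 rotation channels.
N, C, H, W = 2, 8, 5, 5
X = np.random.randn(N, 4 * C, H, W)  # stack of feature maps, shape Nx4CxHxW

# Expose the rotation axis, then pool over it (max here; sum also works).
Xr = X.reshape(N, C, 4, H, W)
rot_invariant = Xr.max(axis=2)  # shape (N, C, H, W): invariant to the 4 rotations

# Pooling over H and W as well additionally gives translation invariance.
fully_invariant = rot_invariant.max(axis=(2, 3))  # shape (N, C)
```

As a sanity check, cyclically permuting the 4 orientation channels (which is how a rotation acts on this axis) leaves `rot_invariant` unchanged, since max pooling is invariant to the order of its inputs.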
Note that in our experiments we found that pooling over rotations inside the network after each layer is less effective than pooling only at the end, or not pooling at all.
Hope this helps. Let me know if something is not clear.
Yes that makes sense. Thanks so much! :)
Is G-Pooling included in this implementation?