naturomics / CapsLayer

CapsLayer: An advanced library for capsule theory
Apache License 2.0

EM Capsule Dense Layer Routing #31

Closed mukeshmithrakumar closed 5 years ago

mukeshmithrakumar commented 5 years ago

Why do you have routing in the class capsules layer for EM capsules?

naturomics commented 5 years ago

Did you mean the routing in this class? The last layers in both papers, "Dynamic Routing Between Capsules" and "Matrix Capsules with EM Routing", are fully connected layers, but they use different routing algorithms (dynamic vs. EM routing). This class wraps both up so you can easily choose one of them via the 'routing_method' parameter. Sorry, I just noticed that the doc didn't mention this.
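To illustrate the design being described, here is a minimal, hypothetical sketch of one dense capsule layer dispatching on a `routing_method` parameter. The function names and the placeholder routing bodies are invented for illustration and are not CapsLayer's actual code:

```python
# Hypothetical sketch: one fully connected capsule layer exposing both
# routing algorithms behind a single 'routing_method' parameter.

def dynamic_routing(votes):
    # Placeholder standing in for the routing of
    # "Dynamic Routing Between Capsules".
    return sum(votes) / len(votes)

def em_routing(votes):
    # Placeholder standing in for the EM routing of
    # "Matrix Capsules with EM Routing".
    return max(votes)

ROUTERS = {"DynamicRouting": dynamic_routing, "EMRouting": em_routing}

def dense_capsule_layer(votes, routing_method="EMRouting"):
    # Dispatch to the chosen routing algorithm; the layer itself
    # (fully connected) is the same either way.
    try:
        return ROUTERS[routing_method](votes)
    except KeyError:
        raise ValueError(f"unknown routing_method: {routing_method!r}")
```

The point is only that the fully connected layer is shared and the routing step is pluggable, which is why a single class can serve both papers.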

mukeshmithrakumar commented 5 years ago

Thank you for getting back to me @naturomics. What I meant is: the convcaps2 layer is fully connected to the final class capsules layer, but in the code you linked (https://github.com/naturomics/CapsLayer/blob/926ee89803dad3c273eafcd31a60718ddb9dad8d/capslayer/layers/layers.py#L32), on line 86 you call routing again inside the class caps layer, and that is where I was confused. To be more specific, the paper says: "When connecting the last convolutional capsule layer to the final layer we do not want to throw away information about the location of the convolutional capsules but we also want to make use of the fact that all capsules of the same type are extracting the same entity at different positions. We therefore share the transformation matrices between different positions of the same capsule type and add the scaled coordinate (row, column) of the center of the receptive field of each capsule to the first two elements of the right-hand column of its vote matrix. We refer to this technique as Coordinate Addition." Please correct me if I am wrong, but I understood that connection as just coordinate addition, with no routing inside the class capsules layer.
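For readers following along, here is a rough NumPy sketch of Coordinate Addition as the quoted paragraph describes it. The function name, array shapes, and coordinate scaling are assumptions for illustration, not CapsLayer's implementation:

```python
import numpy as np

def coordinate_addition(votes, grid_h, grid_w):
    """Add the scaled (row, col) of each spatial position to the first two
    elements of the right-hand column of its 4x4 vote matrix.

    votes: array of shape (grid_h, grid_w, n_caps, 4, 4), one vote matrix
    per convolutional capsule position and output capsule.
    """
    votes = votes.copy()
    # Scaled coordinates of each receptive-field center, in [0, 1].
    rows = (np.arange(grid_h) + 0.5) / grid_h
    cols = (np.arange(grid_w) + 0.5) / grid_w
    # Right-hand column of the 4x4 matrix is index [..., :, 3];
    # its first two elements get the row and column coordinates.
    votes[..., 0, 3] += rows[:, None, None]
    votes[..., 1, 3] += cols[None, :, None]
    return votes
```

Note that this only injects position information into the votes; whether routing then runs on those votes is a separate question, which is what the comment above is asking.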

naturomics commented 5 years ago

OK, I got your point. The final layer is a capsule layer, not a 'normal' layer, and routing is the main difference between the two; without routing it reduces to a normal layer:

Consider a fully connected capsule layer: $Z = (P \odot W) X = W' X$, where $W' = P \odot W$ and $\odot$ denotes element-wise multiplication. What capsules do is optimize $P = h(X, WX)$, where $h(\cdot)$ is a clustering-like EM algorithm. If there is no routing, $P$ is fixed, and then $W' = P \odot W$ acts just like the linear transformation $z = wx$ in a normal neural network.
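The argument above can be checked with a toy 1-D sketch. This is not CapsLayer's implementation, and the agreement update below is a deliberately crude stand-in for $h(X, WX)$; it only shows that with the coupling $P$ held fixed the layer is an ordinary linear map:

```python
import numpy as np

def capsule_fc(X, W, routing_iters=0):
    """Toy fully connected capsule layer, Z = (P ⊙ W) X.

    X: (n_in,) input capsule activations; W: (n_out, n_in) weights.
    With routing_iters == 0, P stays fixed (uniform), so the output is
    just a linear transform z = w x up to a constant factor. With
    routing, P is re-estimated from the agreement between each vote
    and the current output -- a crude stand-in for the clustering-like
    h(X, WX) step.
    """
    votes = W * X[None, :]                      # votes, shape (n_out, n_in)
    b = np.zeros_like(W)                        # routing logits
    for _ in range(routing_iters):
        P = np.exp(b) / np.exp(b).sum(axis=0)   # softmax over output capsules
        Z = (P * votes).sum(axis=1, keepdims=True)
        b = b + votes * Z                       # agreement-based update
    P = np.exp(b) / np.exp(b).sum(axis=0)       # fixed/uniform if no routing
    return (P * votes).sum(axis=1)              # Z = (P ⊙ W) X
```

With `routing_iters=0`, `P` is a constant `1/n_out`, so the output equals `(W @ X) / n_out`: exactly a linear layer. With routing enabled, `P` depends on `X`, which is what makes the final class capsules layer more than a normal fully connected layer.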

mukeshmithrakumar commented 5 years ago

@naturomics thanks a lot man, it makes sense