Closed jianyin2016 closed 6 years ago
@jianyin2016 I had the same doubt. The statement below from the paper keeps things simple, but it also gets in the way of properly understanding what exactly a capsule unit is:
1. Wij is a weight matrix between each u_i in PrimaryCapsules, where i belongs to (1, 32x6x6), and each v_j, where j belongs to (1, 10).
2. Wij = [8x16] is also mentioned in the diagram.
It would be great if someone could explain this part in the context of a capsule layer and capsule units.
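For what it's worth, here is a minimal NumPy sketch of one possible reading of points 1 and 2: a separate Wij of shape [8, 16] for every (i, j) pair, applied to row vectors u_i. The variable names and the random values are my own assumptions for illustration, not taken from the paper or from any particular implementation, and routing is left out.

```python
import numpy as np

rng = np.random.default_rng(0)

n_primary = 32 * 6 * 6   # 1152 primary capsule outputs u_i
n_digit = 10             # 10 digit capsules v_j
in_dim, out_dim = 8, 16  # Wij = [8 x 16] as in the diagram

# u[i] is the 8D output of primary capsule i (random stand-in values).
u = rng.standard_normal((n_primary, in_dim))

# Assumption: one independent matrix per (i, j) pair.
W = rng.standard_normal((n_primary, n_digit, in_dim, out_dim))

# Prediction vectors u_hat[i, j] = u[i] @ W[i, j], one 16D vector per pair.
u_hat = np.einsum("ik,ijkd->ijd", u, W)

print(u_hat.shape)  # (1152, 10, 16)
```

Each digit capsule j then combines the 1152 prediction vectors `u_hat[:, j, :]` through routing, which this sketch does not cover.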
@rrqq Hello, man.
I studied a couple of implementations over the past few days and found that what I previously thought was wrong.
Actually, I now tend to believe that the "sharing weights" mentioned above means every capsule in the [6,6] grid shares the same set of filters to produce its [1,8] vector. Generally speaking, that is the usual meaning of weight sharing in DL, and it makes sense.
This issue should be closed, although I still have doubts about all the released implementations, because their authors do not seem very sure of them themselves.
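To illustrate the convolutional reading of weight sharing described in this comment, here is a rough NumPy sketch in which one filter bank per capsule type is applied at every position of the [6,6] grid. The layer sizes (256 input channels, a 20x20 feature map, 9x9 kernels, stride 2) follow the paper's description of PrimaryCapsules; the function name `primary_capsules`, the random data, and the omission of the squash nonlinearity are my own simplifications.

```python
import numpy as np

def primary_capsules(feature_map, filters, stride=2):
    """Apply the same filter bank at every spatial position.

    feature_map: (in_channels, H, W)
    filters:     (n_types, caps_dim, in_channels, k, k)
    returns:     (n_types, out_h, out_w, caps_dim)
    """
    n_types, caps_dim, _, k, _ = filters.shape
    _, h, w = feature_map.shape
    out_h = (h - k) // stride + 1
    out_w = (w - k) // stride + 1
    out = np.empty((n_types, out_h, out_w, caps_dim), dtype=feature_map.dtype)
    for r in range(out_h):
        for c in range(out_w):
            r0, c0 = r * stride, c * stride
            patch = feature_map[:, r0:r0 + k, c0:c0 + k]
            # The same `filters` are used for every (r, c): this is the sharing.
            out[:, r, c, :] = np.einsum("tdchw,chw->td", filters, patch)
    return out

rng = np.random.default_rng(0)
feature_map = rng.standard_normal((256, 20, 20), dtype=np.float32)
filters = rng.standard_normal((32, 8, 256, 9, 9), dtype=np.float32)

u = primary_capsules(feature_map, filters)
print(u.shape)  # (32, 6, 6, 8)
```

Under this reading the shared weights are the convolution filters that produce each u_i, so nothing in the sketch says anything about whether Wij is shared.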
First, thanks for your answer on Zhihu as well as the implementation on GitHub; they helped me a lot in understanding the original paper.
I would like to share my doubt about the lines just below Figure 2 of the original paper, which say "each capsule in the [6,6] grid is sharing their weights with each other". By my understanding, this means the capsule outputs (vectors ui) within a [6,6] grid share the same Wij, so only 32 W should be updated by Adam. But in your implementation, I can't find any code that handles this weight-sharing mechanism.
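To make the difference concrete, here is a small back-of-the-envelope count of Wij parameters under the two readings. It assumes one [8, 16] matrix per (capsule type, digit class) pair in the shared case and one per (capsule output, digit class) pair otherwise; the constant names are mine.

```python
N_TYPES = 32          # capsule types (channels) in PrimaryCapsules
GRID = 6 * 6          # spatial positions per capsule type
N_CLASSES = 10        # digit capsules
IN_DIM, OUT_DIM = 8, 16

per_matrix = IN_DIM * OUT_DIM                        # 128 weights in one Wij

# Reading A: Wij shared across the [6,6] grid.
shared = N_TYPES * N_CLASSES * per_matrix

# Reading B: a separate Wij for each of the 32x6x6 capsule outputs.
unshared = N_TYPES * GRID * N_CLASSES * per_matrix

print(shared)    # 40960
print(unshared)  # 1474560
```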
Besides, I think the shape of Wij should be [16,8], since ui is a [1,8] or [8,1] vector; the shape as given obviously conflicts with Eq. 2. Although this looks like a minor problem, I point it out so that I can be corrected if I have misunderstood the paper or your implementation.
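A quick NumPy check of the shape question, as a sketch with random stand-in values: a [16, 8] matrix applied to a column vector and the same weights stored as [8, 16] and applied to a row vector give the same 16D result, so the two shapes differ only in convention.

```python
import numpy as np

rng = np.random.default_rng(0)
u_i = rng.standard_normal(8)            # one 8D primary capsule output
W_col = rng.standard_normal((16, 8))    # column convention: u_hat = W @ u

u_hat_col = W_col @ u_i                 # Eq. 2 read literally, shape (16,)
W_row = W_col.T                         # the same weights stored as [8, 16]
u_hat_row = u_i @ W_row                 # row convention, shape (16,)

print(np.allclose(u_hat_col, u_hat_row))  # True
```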