share weights in 6x6x8 grids

naturomics / CapsNet-Tensorflow

A Tensorflow implementation of CapsNet(Capsules Net) in paper Dynamic Routing Between Capsules

Apache License 2.0


share weights in 6x6x8 grids #47

Open yaxinshen opened 6 years ago

yaxinshen commented 6 years ago

In the paper, the capsules in each [6 × 6] grid share their weights with each other. Does your code miss this point?

veshboo commented 6 years ago

@yaxinshen +1, I tried to imagine how to share weights in that way. How about introducing a `for _ in range(1152/36):` loop around the `tf.matmul` involving W in the routing function? Any other ideas that don't lose vectorization?

EDIT ~~Oh, I found that the commented-out tf.scan was for sharing the weights. The author preferred tf.tile to tf.scan for performance!~~

EDIT2 This issue seems to be a duplicate of a previous issue, "questions about the weight matrix Wij between ui and vj", and that thread cleared it up for me.
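For anyone wondering how the sharing could be kept vectorized, here is a minimal NumPy sketch, not the repository's code. It assumes the reading used in this thread: one W per (capsule type, output capsule) pair, reused at all 36 grid positions. The `(type, grid position)` layout of the 1152 capsules and every name below are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

GRID = 6 * 6             # spatial positions per capsule type
TYPES = 32               # capsule types, i.e. 1152 / 36
OUT_CAPS = 10            # output (digit) capsules
IN_DIM, OUT_DIM = 8, 16  # capsule vector sizes

# Primary capsule outputs, assumed layout: (type, grid position, 8)
u = rng.standard_normal((TYPES, GRID, IN_DIM))

# Shared weights: one 16x8 matrix per (type, output capsule) pair
W_shared = rng.standard_normal((TYPES, OUT_CAPS, OUT_DIM, IN_DIM))

# Vectorized: einsum reuses W_shared[t] for all 36 positions of type t
u_hat = np.einsum("tjdk,tgk->tgjd", W_shared, u)  # (32, 36, 10, 16)

# Reference: the explicit loop over the 32 capsule types
u_hat_loop = np.stack(
    [np.einsum("jdk,gk->gjd", W_shared[t], u[t]) for t in range(TYPES)]
)

# Parameter count without sharing: one matrix per input capsule
unshared_params = TYPES * GRID * OUT_CAPS * OUT_DIM * IN_DIM

print(u_hat.shape, np.allclose(u_hat, u_hat_loop))
print(W_shared.size, unshared_params)
```

Under these assumptions the loop and the broadcasted product agree, and sharing cuts the weight count by a factor of 36.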

tonyzhao6 commented 6 years ago

@yaxinshen,

Version 1 (i.e., the computationally expensive approach) does have 8 distinct sets of weights for each 6 x 6 x 32 tensor. This is what the paper does.

Version 2 technically has 1 distinct set of weights for the entire 6 x 6 x 256 block and then reshapes the output to the correct shape.

I don't know if this actually matters in practice => the network will eventually learn the correct weights, whether there are 8 distinct sets or 1.
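A rough way to compare the two versions, assuming Version 1 is eight convolutions with 32 filters each and Version 2 is a single convolution with 256 filters: a convolution is a matrix product over flattened input patches, so the sketch below uses a plain matrix product as a stand-in. The sizes and names are hypothetical toy values, not taken from the repository.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy sizes: 36 output positions, flattened patch length 50.
# In the real layer each patch would hold 9 * 9 * 256 values.
POSITIONS, PATCH_LEN = 36, 50
N_SETS, FILTERS_PER_SET = 8, 32

patches = rng.standard_normal((POSITIONS, PATCH_LEN))

# "Version 2": one kernel with 8 * 32 = 256 output filters
W_single = rng.standard_normal((PATCH_LEN, N_SETS * FILTERS_PER_SET))
out_single = patches @ W_single  # (36, 256)

# "Version 1": eight kernels of 32 filters each,
# taken here as column slices of the single kernel
W_sets = np.split(W_single, N_SETS, axis=1)
out_sets = np.concatenate([patches @ W for W in W_sets], axis=1)

same_output = np.allclose(out_single, out_sets)
same_params = W_single.size == sum(W.size for W in W_sets)
print(same_output, same_params)
```

In this simplified view the single 256-filter kernel carries the same number of weights as the eight 32-filter kernels combined. The versions could still differ in how the later reshape groups the 256 channels into 8D capsule vectors, which this sketch does not model.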