brjathu / deepcaps

Official Implementation of "DeepCaps: Going Deeper with Capsule Networks" paper (CVPR 2019).
MIT License

Confusion in function update_routing #4

Closed HopefulRational closed 5 years ago

HopefulRational commented 5 years ago

@brjathu Hi. I've read your paper and it's great that you've given your code. Thanks a lot. I have a few doubts in the ConvCapsuleLayer3D layer. Could you clarify them for me?

  1. In the paper, the softmax_3D function is defined with the denominator summed over spatial positions and the capsules in the (l+1)-th layer. The implementation's update_routing, however, uses tf.nn.softmax, whose denominator sums over only the capsules. Could you clarify this?
  2. In the update_routing function, I am unable to understand the reshaping operations. It seems the batch-size dimension is being moved around. Could you clarify the shapes used? Thank you.
brjathu commented 5 years ago

@HopefulRational Hi, Thanks for your interest in our work.

  1. Our apologies for making the code and figures a bit complex. In the paper, l+1 refers to the input to the (l+1)-th layer, which is actually the output from layer l just after routing. Further, since routing is iterative, this softmax is applied inside the update_routing function on the routing output at each iteration.
  2. Yes, I accept that the reshaping is too complicated. The batch-size dimension is moved so that the multiplication with the logits works without keeping the number of atoms in that dimension. votes_t_shape and r_t_shape are simply transposes of each other, and r_t_shape is used later to convert the votes back to their earlier shape.
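To illustrate the transpose trick described above, here is a minimal NumPy sketch (not the repository code; all shapes are assumptions for illustration). The votes carry a trailing num_atoms axis that the routing logits lack, so transposing num_atoms to the front lets a plain broadcast multiply weight every atom by the same logit, after which the inverse permutation restores the original layout:

```python
import numpy as np

# Assumed illustrative shapes, not the repository's actual tensors.
batch, height, width, num_caps, num_atoms = 2, 4, 4, 8, 16

votes = np.random.randn(batch, height, width, num_caps, num_atoms)
logits = np.random.randn(batch, height, width, num_caps)  # no atom axis

votes_t_shape = (4, 0, 1, 2, 3)  # move num_atoms to the front
r_t_shape = (1, 2, 3, 4, 0)      # inverse permutation: move it back

votes_trans = np.transpose(votes, votes_t_shape)  # [atoms, batch, h, w, caps]
weighted = votes_trans * logits                   # broadcast over the atom axis
weighted = np.transpose(weighted, r_t_shape)      # back to the original layout

assert weighted.shape == votes.shape
```

The same idea applies with tf.transpose in the TensorFlow code: the two permutations are inverses, which is why votes_t_shape and r_t_shape look like transposes of each other.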

Again, we are sorry for all the trouble you had to go through to understand the code. We will be releasing a PyTorch implementation soon, which should be much easier to understand.

Please feel free to ask any questions or concerns you have.

Thank you very much.

HopefulRational commented 5 years ago

Hi, thank you for your quick reply. I am waiting for the PyTorch release! While I (somewhat) understand the shapes in the ConvCapsuleLayer3D routing, I still do not see how the softmax in the code and the softmax_3D function in the paper are the same. tf.nn.softmax() is not equivalent to softmax_3D: softmax_3D sums over spatial positions and capsules, while tf.nn.softmax() sums only over the capsules. This is in the update_routing function. Could you clarify, please?
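The distinction in question can be made concrete with a short NumPy sketch (shapes are assumptions for illustration, not the repository's actual tensors): a standard softmax normalizes over the last (capsule) axis only, while the paper's softmax_3D normalizes jointly over the two spatial axes and the capsule axis.

```python
import numpy as np

def softmax_last_axis(x):
    """tf.nn.softmax-style: denominator sums over the capsule axis only."""
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def softmax_3d(x):
    """Paper-style softmax_3D: denominator sums over height, width, and capsules."""
    e = np.exp(x - x.max(axis=(1, 2, 3), keepdims=True))
    return e / e.sum(axis=(1, 2, 3), keepdims=True)

logits = np.random.randn(2, 4, 4, 8)  # assumed [batch, height, width, num_caps]

a = softmax_last_axis(logits)
b = softmax_3d(logits)

# Plain softmax: each spatial position's capsule scores sum to 1 ...
assert np.allclose(a.sum(axis=-1), 1.0)
# ... softmax_3D: the whole spatial-capsule block sums to 1 instead.
assert np.allclose(b.sum(axis=(1, 2, 3)), 1.0)
```

Both produce non-negative weights, but they normalize over different sets of coupling coefficients, which is exactly the discrepancy being raised.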

brjathu commented 5 years ago

Hi,

I see your point. Let me check with the other authors, compare it with our base code, and get back to you with the correct explanation.

Regards, Jathushan Rajasegaran


HopefulRational commented 5 years ago

Thanks for your reply. Eagerly awaiting a clarification.

ierosodin commented 5 years ago


Looking forward to the pytorch implementation!

HopefulRational commented 5 years ago

@ierosodin Hi. I have put up a quick and dirty PyTorch implementation of DeepCaps as a .py file. You may have a look at it: https://github.com/HopefulRational/DeepCaps-PyTorch It is being trained on transformed FashionMNIST and tested on FashionMNIST.

brjathu commented 5 years ago

Hi,

Thank you very much, it looks fine to me. Did you get a chance to run it on CIFAR10?

Regards, R.Jathushan


HopefulRational commented 5 years ago

@ierosodin Hi. Thanks for liking it. Hope the bare-bones code was helpful :) @brjathu Hi. Maybe I will just run it on untransformed FashionMNIST once.

brjathu commented 5 years ago

@HopefulRational Hi, I have sort of fixed the update_routing function. Please check it, and thank you very much for finding this bug.

HopefulRational commented 5 years ago

@brjathu Hi. Is this affecting the accuracies?

brjathu commented 5 years ago

I haven't tested it yet, I will run it today and let you know.

Regards, R.Jathushan


HopefulRational commented 5 years ago

@brjathu Thanks.

HopefulRational commented 5 years ago

@brjathu By the way, I will go through the code in the evening because I am a little busy right now.

brjathu commented 5 years ago

Test accuracy on CIFAR10: 91.49%.

Almost the same as the earlier one.

Regards, R.Jathushan


ierosodin commented 5 years ago

Hi @brjathu, I still have a few doubts. I'm wondering what the "localized voting" you mention in the paper means. For example, for each capsule in layer l+1, which capsules in layer l does it route from? Every capsule in layer l across all locations, or just those within a limited kernel? Thanks!

Sincerely, Tony

brjathu commented 5 years ago

Hi,

Localized routing arises because, with a 3×3 kernel, only the nine spatially local capsules are routed together.
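In other words, each output position gathers votes only from its 3×3 neighbourhood in layer l, not from every capsule at every location. A rough NumPy sketch of this patch gathering (all shapes and the helper name are illustrative assumptions, not the repository code):

```python
import numpy as np

def local_patches(caps, k=3):
    """Gather the k x k neighbourhood of layer-l capsules for each output
    position (stride 1, no padding) -- the "localized" part of the voting."""
    h, w, num_caps, num_atoms = caps.shape
    out_h, out_w = h - k + 1, w - k + 1
    patches = np.empty((out_h, out_w, k * k * num_caps, num_atoms))
    for i in range(out_h):
        for j in range(out_w):
            # Only the k x k spatial window contributes to this position.
            patches[i, j] = caps[i:i + k, j:j + k].reshape(-1, num_atoms)
    return patches

caps = np.random.randn(6, 6, 4, 8)  # assumed [h, w, num_caps, num_atoms]
patches = local_patches(caps)
# Each of the 4 x 4 output positions sees 3*3*4 = 36 local capsules,
# rather than all 6*6*4 = 144 capsules across the whole layer.
assert patches.shape == (4, 4, 36, 8)
```

In the actual implementation this windowing is realized by the 3D convolution itself; the sketch just makes explicit which layer-l capsules participate in each routing.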

Regards, R.Jathushan
