apple / ml-capsules-inverted-attention-routing


DiverseMultiMNist config file. #2

Open cinjon opened 4 years ago

cinjon commented 4 years ago

Hi, could you please post a config file for the DiverseMultiMNist?

yaohungt commented 4 years ago

You can modify the config based on Tables 10-12 in the paper.

cinjon commented 4 years ago

Is this correct?

```python
{'params': {'backbone': {'kernel_size': 3,
    'output_dim': 1024,
    'input_dim': 3,
    'stride': 2,
    'padding': 1,
    'out_img_size': 18},
   'primary_capsules': {'kernel_size': 3,
    'stride': 2,
    'input_dim': 1024,
    'caps_dim': 64,
    'num_caps': 16,
    'padding': 0,
    'out_img_size': 8},
   'capsules': [{'type': 'CONV',
     'num_caps': 16,
     'caps_dim': 64,
     'kernel_size': 3,
     'stride': 1,
     'matrix_pose': True,
     'out_img_size': 6},
    {'type': 'FC', 'num_caps': 10, 'caps_dim': 64, 'matrix_pose': True}],
   'class_capsules': {'num_caps': 10, 'caps_dim': 64, 'matrix_pose': True}}}
```
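The `out_img_size` entries in this config follow from the standard convolution output formula, `out = floor((in + 2*pad - k) / s) + 1`. A minimal sanity check, assuming a 36x36 input canvas (the usual MultiMNIST size; the input resolution is an assumption, not stated in this thread):

```python
def conv_out(size, kernel, stride, padding):
    """Spatial output size of a conv layer with a square kernel."""
    return (size + 2 * padding - kernel) // stride + 1

# Chain the three conv stages from the config above.
backbone = conv_out(36, kernel=3, stride=2, padding=1)        # -> 18
primary = conv_out(backbone, kernel=3, stride=2, padding=0)   # -> 8
convcaps = conv_out(primary, kernel=3, stride=1, padding=0)   # -> 6

print(backbone, primary, convcaps)
```

The three results match the `out_img_size` values 18, 8, and 6 in the config, so the spatial dimensions are at least internally consistent for a 36x36 input.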
yaohungt commented 4 years ago

It seems syntactically correct.

cinjon commented 4 years ago

Do you mind verifying that it is actually correct please? That would be quite helpful so that I don't spend a week debugging when it's not actually right.

yaohungt commented 4 years ago

I don't have access to machines right now, but your config file looks correct, except that "True" should be "true".

We also ran a variant of this config with a vector-structured pose, obtained by setting 'matrix_pose' to false.

cinjon commented 4 years ago

With the config as is, it reaches only 73% accuracy on the Diverse MultiMNIST test set, which is notably lower than the result in the paper.

It also has 9.98M parameters, which is slightly larger than what you shared, so I suspect the config differs in some way from the one you used.

Any idea where it could be off, or whether some other part differs between the repo's setup (for CIFAR) and the setup for Diverse MultiMNIST? Fwiw, it reaches ~99.99%/100% accuracy on the training set.
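One way to localize a parameter-count mismatch like 9.98M is to tally layers analytically rather than rerunning the model. A minimal sketch for the plain conv layer in the backbone (the capsule layers' pose-transform parameters would need the repo's own layer definitions, which are not reproduced here):

```python
def conv2d_params(in_ch, out_ch, kernel_size, bias=True):
    """Parameter count of a square-kernel Conv2d layer: weights plus bias."""
    weights = in_ch * out_ch * kernel_size * kernel_size
    return weights + (out_ch if bias else 0)

# Backbone conv from the config above: 3 -> 1024 channels, 3x3 kernel.
print(conv2d_params(3, 1024, 3))  # 3*1024*9 + 1024 = 28672
```

Comparing per-layer counts like this against the paper's tables can pinpoint which block contributes the extra parameters.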

yaohungt commented 4 years ago

I'm not sure whether your training/test setup matches the setting in the paper. You could first try vanilla Capsules (Dynamic/EM Capsules) on MultiMNIST and see what you get.

Moreover, I don't think I reached ~99.99% accuracy on the training set.

Sharut commented 4 years ago

Hi @yaohungt, could you please share your code for generating the Diverse MultiMNIST dataset? When I tried to implement it on MNIST using the above config file, I was able to cross 85% with the matrix pose itself, whereas your number in the paper was 80%.

I really need to figure out where exactly I am going wrong, or whether something else is causing this difference.
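Since the generation code hasn't been shared in this thread, here is a minimal sketch of the common MultiMNIST-style recipe: paste two 28x28 digits onto a 36x36 canvas, each randomly shifted, and merge with a per-pixel max. The canvas size, shift range, and merge rule are assumptions from the usual MultiMNIST construction; the authors' Diverse MultiMNIST may differ (e.g. it can also mix in single-digit images).

```python
import random

CANVAS, DIGIT = 36, 28  # assumed canvas and digit sizes

def paste(canvas, digit, dx, dy):
    """Paste a 28x28 digit (list of lists) at offset (dx, dy), merging with
    per-pixel max so overlapping strokes stay visible."""
    for y in range(DIGIT):
        for x in range(DIGIT):
            canvas[y + dy][x + dx] = max(canvas[y + dy][x + dx], digit[y][x])
    return canvas

def overlay(digit_a, digit_b, rng=random):
    """Overlay two digits on a fresh canvas with random shifts."""
    canvas = [[0] * CANVAS for _ in range(CANVAS)]
    for d in (digit_a, digit_b):
        dx = rng.randint(0, CANVAS - DIGIT)  # inclusive range 0..8
        dy = rng.randint(0, CANVAS - DIGIT)
        paste(canvas, d, dx, dy)
    return canvas
```

With real MNIST data, the two digits would typically be sampled from different classes and the pair of labels kept as the multi-label target.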