XifengGuo / CapsNet-Keras

A Keras implementation of CapsNet in NIPS2017 paper "Dynamic Routing Between Capsules". Now test error = 0.34%.
MIT License
2.47k stars, 652 forks

if dim_capsule != 16 for CapsuleLayer an error appears when training #30

Closed: ghost closed this issue 7 years ago

ghost commented 7 years ago

Any idea why this happens? The error appears when I change the 16 on line 46 of capsulenet.py to another value such as 17 or 32:

# Layer 3: Capsule layer. Routing algorithm works here.
digitcaps = CapsuleLayer(num_capsule=n_class, dim_capsule=16, num_routing=num_routing,
                         name='digitcaps')(primarycaps)

Isn't the weight matrix from capsule layer 1 to capsule layer 2 of shape (dim_capsule_1, dim_capsule_2)? If so, any value of dim_capsule should work in theory, just as in Figure 2 of https://arxiv.org/pdf/1710.09829.pdf

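EDIT: Found the other hardcoded 16. The decoder's first Dense layer is built for 160 = 16 * 10 inputs, which is what the traceback in the next comment complains about.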
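
For anyone landing here with the same question, here is a rough sketch of why the capsule layer itself accepts any dim_capsule while a later layer does not. It uses plain Python lists rather than Keras. The helper names (transform, random_matrix) are made up for illustration, the input capsule length of 8 is taken from the paper, and n_class = 10 assumes MNIST. It is not the repository's code.

```python
import random

def transform(u, W):
    """Matrix-vector product W @ u on plain lists; W has shape (dim_out, dim_in)."""
    return [sum(w * x for w, x in zip(row, u)) for row in W]

def random_matrix(dim_out, dim_in, rng):
    """Hypothetical stand-in for one learned transformation matrix W_ij."""
    return [[rng.uniform(-1, 1) for _ in range(dim_in)] for _ in range(dim_out)]

rng = random.Random(0)
dim_in = 8     # length of a primary capsule vector (assumption, from the paper)
n_class = 10   # number of digit capsules (assumption, MNIST)

for dim_out in (16, 17, 32):
    u = [rng.uniform(-1, 1) for _ in range(dim_in)]
    W = random_matrix(dim_out, dim_in, rng)
    u_hat = transform(u, W)  # prediction vector, length dim_out
    # Flattening all digit capsules gives n_class * dim_out features,
    # which is the size any layer consuming that output must accept.
    print(dim_out, len(u_hat), n_class * len(u_hat))
# prints:
# 16 16 160
# 17 17 170
# 32 32 320
```

So the transformation works for every output length, but the flattened size changes with it (170 for dim_capsule=17), and a layer built for exactly 160 inputs will reject it.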

ghost commented 7 years ago

Epoch 1/50
Traceback (most recent call last):
  File "/home/niels/anaconda3/envs/keras_GPU/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1323, in _do_call
    return fn(*args)
  File "/home/niels/anaconda3/envs/keras_GPU/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1302, in _run_fn
    status, run_metadata)
  File "/home/niels/anaconda3/envs/keras_GPU/lib/python3.5/site-packages/tensorflow/python/framework/errors_impl.py", line 473, in __exit__
    c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.InvalidArgumentError: Matrix size-incompatible: In[0]: [100,170], In[1]: [160,512]
         [[Node: decoder/dense_1/MatMul = MatMul[T=DT_FLOAT, transpose_a=false, transpose_b=false, _device="/job:localhost/replica:0/task:0/device:CPU:0"](mask_1/Reshape, dense_1/kernel/read)]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/niels/to_3.5/CapsNet-Keras/capsulenet.py", line 209, in <module>
    train(model=model, data=((x_train, y_train), (x_test, y_test)), args=args)
  File "/home/niels/to_3.5/CapsNet-Keras/capsulenet.py", line 129, in train
    callbacks=[log, tb, checkpoint, lr_decay])
  File "/home/niels/anaconda3/envs/keras_GPU/lib/python3.5/site-packages/keras/legacy/interfaces.py", line 87, in wrapper
    return func(*args, **kwargs)
  File "/home/niels/anaconda3/envs/keras_GPU/lib/python3.5/site-packages/keras/engine/training.py", line 2114, in fit_generator
    class_weight=class_weight)
  File "/home/niels/anaconda3/envs/keras_GPU/lib/python3.5/site-packages/keras/engine/training.py", line 1832, in train_on_batch
    outputs = self.train_function(ins)
  File "/home/niels/anaconda3/envs/keras_GPU/lib/python3.5/site-packages/keras/backend/tensorflow_backend.py", line 2352, in __call__
    **self.session_kwargs)
  File "/home/niels/anaconda3/envs/keras_GPU/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 889, in run
    run_metadata_ptr)
  File "/home/niels/anaconda3/envs/keras_GPU/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1120, in _run
    feed_dict_tensor, options, run_metadata)
  File "/home/niels/anaconda3/envs/keras_GPU/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1317, in _do_run
    options, run_metadata)
  File "/home/niels/anaconda3/envs/keras_GPU/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1336, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Matrix size-incompatible: In[0]: [100,170], In[1]: [160,512]
         [[Node: decoder/dense_1/MatMul = MatMul[T=DT_FLOAT, transpose_a=false, transpose_b=false, _device="/job:localhost/replica:0/task:0/device:CPU:0"](mask_1/Reshape, dense_1/kernel/read)]]

Caused by op 'decoder/dense_1/MatMul', defined at:
  File "/home/niels/to_3.5/CapsNet-Keras/capsulenet.py", line 202, in <module>
    num_routing=args.num_routing)
  File "/home/niels/to_3.5/CapsNet-Keras/capsulenet.py", line 66, in CapsNet
    train_model = models.Model([x, y], [out_caps, decoder(masked_by_y)])
  File "/home/niels/anaconda3/envs/keras_GPU/lib/python3.5/site-packages/keras/engine/topology.py", line 603, in __call__
    output = self.call(inputs, **kwargs)
  File "/home/niels/anaconda3/envs/keras_GPU/lib/python3.5/site-packages/keras/models.py", line 546, in call
    return self.model.call(inputs, mask)
  File "/home/niels/anaconda3/envs/keras_GPU/lib/python3.5/site-packages/keras/engine/topology.py", line 2061, in call
    output_tensors, _, _ = self.run_internal_graph(inputs, masks)
  File "/home/niels/anaconda3/envs/keras_GPU/lib/python3.5/site-packages/keras/engine/topology.py", line 2212, in run_internal_graph
    output_tensors = _to_list(layer.call(computed_tensor, **kwargs))
  File "/home/niels/anaconda3/envs/keras_GPU/lib/python3.5/site-packages/keras/layers/core.py", line 843, in call
    output = K.dot(inputs, self.kernel)
  File "/home/niels/anaconda3/envs/keras_GPU/lib/python3.5/site-packages/keras/backend/tensorflow_backend.py", line 1052, in dot
    out = tf.matmul(x, y)
  File "/home/niels/anaconda3/envs/keras_GPU/lib/python3.5/site-packages/tensorflow/python/ops/math_ops.py", line 1891, in matmul
    a, b, transpose_a=transpose_a, transpose_b=transpose_b, name=name)
  File "/home/niels/anaconda3/envs/keras_GPU/lib/python3.5/site-packages/tensorflow/python/ops/gen_math_ops.py", line 2437, in _mat_mul
    name=name)
  File "/home/niels/anaconda3/envs/keras_GPU/lib/python3.5/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "/home/niels/anaconda3/envs/keras_GPU/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 2956, in create_op
    op_def=op_def)
  File "/home/niels/anaconda3/envs/keras_GPU/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 1470, in __init__
    self._traceback = self._graph._extract_stack()  # pylint: disable=protected-access

InvalidArgumentError (see above for traceback): Matrix size-incompatible: In[0]: [100,170], In[1]: [160,512]
         [[Node: decoder/dense_1/MatMul = MatMul[T=DT_FLOAT, transpose_a=false, transpose_b=false, _device="/job:localhost/replica:0/task:0/device:CPU:0"](mask_1/Reshape, dense_1/kernel/read)]]
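
The key line in the traceback is the shape pair. In[0] is the tensor fed into decoder/dense_1 (batch of 100, 170 features) and In[1] is that layer's kernel (160 inputs, 512 units). Below is a small sketch that pulls those numbers out of the message and divides by the number of classes. The function names are hypothetical and n_class = 10 is an assumption (MNIST); only the message string comes from the log above.

```python
import re

def parse_matmul_shapes(message):
    """Extract the two operand shapes from a 'Matrix size-incompatible' message."""
    found = re.findall(r"In\[\d\]: \[(\d+),\s*(\d+)\]", message)
    if len(found) != 2:
        raise ValueError("expected exactly two operand shapes in the message")
    lhs, rhs = [(int(rows), int(cols)) for rows, cols in found]
    return lhs, rhs

def diagnose(message, n_class=10):
    """Relate the mismatched inner dimensions to a per-capsule length."""
    lhs, rhs = parse_matmul_shapes(message)
    return {
        "batch_size": lhs[0],
        "features_fed": lhs[1],        # columns of In[0]
        "features_expected": rhs[0],   # rows of In[1]
        "dim_capsule_fed": lhs[1] // n_class,
        "dim_capsule_expected": rhs[0] // n_class,
    }

msg = "Matrix size-incompatible: In[0]: [100,170], In[1]: [160,512]"
report = diagnose(msg)
print(report["dim_capsule_fed"], report["dim_capsule_expected"])  # prints: 17 16
```

Read this way, the error says the decoder was built for capsules of length 16 but received capsules of length 17, so the decoder's input size has to follow dim_capsule rather than a fixed 16.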