XifengGuo / CapsNet-Keras

A Keras implementation of CapsNet in NIPS2017 paper "Dynamic Routing Between Capsules". Now test error = 0.34%.
MIT License
2.47k stars 651 forks

ValueError #104

Closed Anselmoo closed 4 years ago

Anselmoo commented 4 years ago

```
Traceback (most recent call last):
  File "capsulenet.py", line 310, in <module>
    routings=args.routings,
  File "capsulenet.py", line 60, in CapsNet
    )(primarycaps)
  File "/home/Anselmoo/.local/lib/python3.6/site-packages/keras/engine/base_layer.py", line 489, in __call__
    output = self.call(inputs, **kwargs)
  File "/home/Anselmoo/GitHub-Projects/CapsNet-Keras/capsule/capsulelayers.py", line 160, in call
    b += K.batch_dot(outputs, inputs_hat, [2, 3])
  File "/home/Anselmoo/.local/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py", line 1499, in batch_dot
    'y.shape[%d] (%d != %d).' % (axes[0], axes[1], d1, d2))
ValueError: Can not do batch_dot on inputs with shapes (None, 10, 10, 1152, 16) and (None, 10, None, 1152, 16) with axes=[2, 3]. x.shape[2] != y.shape[3] (10 != 1152).
```

I was running capsulenet.py with the default settings.

gcfengxu commented 4 years ago

I encountered the same problem. It may be caused by the 'K.batch_dot()' function, but I don't know how to solve it. I have emailed Xifeng Guo via 163 Mail, but got no reply yet. If you find a solution, please reply to me, thanks.

Anselmoo commented 4 years ago

Thx for the reply, I'll take a look these days.

gcfengxu commented 4 years ago

Hey, I found a solution at https://github.com/brjathu/deepcaps/commit/e273cfd686b9960fc3a9ef7772e4a9db95316593.
It really works.
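For reference, the idea behind the linked fix is to restore the pre-2.3 batch_dot contraction. A minimal sketch of that contraction (not the actual deepcaps code) using `np.einsum`, with the shapes from the routing step in capsulelayers.py and a hypothetical batch size of 2:

```python
import numpy as np

# Hypothetical concrete shapes from the routing step (batch size 2 instead of None):
# outputs:    (batch, num_capsule, dim_capsule)                    -> (2, 10, 16)
# inputs_hat: (batch, num_capsule, input_num_capsule, dim_capsule) -> (2, 10, 1152, 16)
outputs = np.ones((2, 10, 16))
inputs_hat = np.ones((2, 10, 1152, 16))

# The old K.batch_dot(outputs, inputs_hat, [2, 3]) contracted the 16-dim capsule
# axis per sample and per output capsule; np.einsum expresses the same thing:
b_update = np.einsum('bnd,bnid->bni', outputs, inputs_hat)
print(b_update.shape)  # (2, 10, 1152)
```

This is exactly the shape the routing update `b += ...` expects, which the post-2.3 batch_dot no longer produces.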

Anselmoo commented 4 years ago

But that's a separate project, so there is no way of merging? Or creating a pull request?

data-hound commented 4 years ago

@gcfengxu Has this own_batch_dot been tested by training+testing on the MNIST dataset?

gcfengxu commented 4 years ago

@Anselmoo Sorry, I don't quite understand what you mean. I just copied 'batchdot.py', added it to my project, imported the file, and replaced all 'K.map_fn()' with 'own_batch_dot()'. I haven't analyzed how the function works.

gcfengxu commented 4 years ago

@data-hound Yes, I have tested it by training on MNIST. The test is not completely finished yet, but it really is training.

Anselmoo commented 4 years ago

> @Anselmoo Sorry, I don't quite understand what you mean. I just copied 'batchdot.py', added it to my project, imported the file, and replaced all 'K.map_fn()' with 'own_batch_dot()'. I haven't analyzed how the function works.

@gcfengxu Maybe you could commit your version and upload it as a pull request? I think that would help a lot. Thx

gcfengxu commented 4 years ago

@Anselmoo @data-hound Sorry, I mixed up the function name. In my project, I replaced K.batch_dot() with 'own_batch_dot()'. Sorry for the wrong word.

Anselmoo commented 4 years ago

@gcfengxu Thank you very much! I got it and I would recommend the following steps:

  1. Fetch this project
  2. Make a new branch
  3. Upload your own_batch_dot() and the other modified files via git or the web interface.
  4. Commit these files
  5. Create a pull request for @XifengGuo and we can take a look

I think it would be great if we could work together and add further tests like MNIST. I think that's also the idea of open source, and it allows us to modify the capsule net further. The benchmark settings of this implementation look the best, and it would be sad if we could not use it anymore.

gcfengxu commented 4 years ago

@Anselmoo Well, I am new to GitHub. I'll try your advice, thx for your suggestions!

Anselmoo commented 4 years ago

@gcfengxu no worries, everybody has to start somewhere 🛫 that's why we are here

data-hound commented 4 years ago

@Anselmoo @gcfengxu I think the own_batch_dot method as defined in the earlier versions is actually wrong, according to this issue: https://github.com/keras-team/keras/issues/13300

So, as a solution, in my implementation I used K.batch_dot and then tried to reshape the result further to conform to the expected matrix shapes.

@gcfengxu are you running the model with or without eager execution? I have run into some problems with eager execution and with graph tensors being passed outside the graph. This occurs just before the end of the 1st training epoch. Let me know if using this method gets you past these.

gcfengxu commented 4 years ago

@data-hound My TensorFlow version is 2.0; I never ran into problems with eager execution. But K.batch_dot is still wrong in version 2.3.1 of Keras.

data-hound commented 4 years ago

Cool. So your training and testing tasks have completed successfully using this own_batch_dot method, then @gcfengxu ?

Also, as mentioned in that Keras issue, the new behaviour of K.batch_dot is here to stay. For example,

```python
x_batch = K.ones(shape=(1152, 10, 1, 8))
y_batch = K.ones(shape=(1152, 10, 8, 16))
xy_batch_dot = K.batch_dot(x_batch, y_batch, axes=(3, 2))
K.int_shape(xy_batch_dot)
```

the above code will yield shape (1152, 10, 1, 10, 16) and not (1152, 10, 1, 16) as was returned in previous versions. I have not checked the mathematical accuracy of this operation for these shapes, but this is what fchollet has confirmed to be the intended behaviour.
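If one wants the old (1152, 10, 1, 16) result, the intended contraction can be written as a batched matrix product instead of relying on K.batch_dot. A sketch in NumPy, using the same shapes as the snippet above (`np.matmul` follows the same batched-matmul rule as `tf.matmul`):

```python
import numpy as np

# Same shapes as the K.batch_dot snippet above:
x_batch = np.ones((1152, 10, 1, 8))
y_batch = np.ones((1152, 10, 8, 16))

# matmul treats the leading (1152, 10) axes as batch dimensions and contracts
# the 8-dim axis, which is what the old batch_dot(x, y, axes=(3, 2)) returned:
xy = np.matmul(x_batch, y_batch)
print(xy.shape)  # (1152, 10, 1, 16)
```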

uchar commented 4 years ago

I'm using own_batch_dot; it's running, but I'm getting negative values for the loss function after 3 or 4 epochs! @data-hound Can you post your full code?

data-hound commented 4 years ago

@mrtucar it is a known issue with the new versions of TensorFlow and Keras

@uchar I haven't got past the 1st epoch, even with the own_batch_dot method. I am running on Kaggle. I will try on a fresh setup in a few days and let you know

XifengGuo commented 4 years ago

@uchar @Anselmoo @mrtucar @data-hound @gcfengxu The problem is caused by a behavior change in keras.backend.batch_dot. In keras==2.0.7, with a.shape -> (2, 3, 4, 5) and b.shape -> (2, 3, 5, 6), batch_dot(a, b, (3, 2)).shape -> (2, 3, 4, 6).
But in newer versions: batch_dot(a, b, (3, 2)).shape -> (2, 3, 4, 3, 6)

I propose replacing K.batch_dot with tf.matmul. For details, please refer to https://github.com/XifengGuo/CapsNet-Keras/blob/9d7e641e3f30f0e8227bb6ad521a61e908c2408a/capsulelayers.py#L120-L163
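A quick sanity check of why matmul works here, sketched in NumPy with the example shapes above (`np.matmul` shares the batched-matmul semantics of `tf.matmul`):

```python
import numpy as np

# matmul contracts the last axis of `a` with the second-to-last axis of `b`
# and treats all leading axes as batch dimensions, reproducing the
# keras==2.0.7 batch_dot(a, b, (3, 2)) result:
a = np.ones((2, 3, 4, 5))
b = np.ones((2, 3, 5, 6))
out = np.matmul(a, b)
print(out.shape)  # (2, 3, 4, 6), not (2, 3, 4, 3, 6)
```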