I encountered the same problem. It may be caused by the K.batch_dot() function, but I don't know how to solve it. I have emailed Xifeng Guo via 163 Mail but have not received a reply yet. If you find a solution, please reply to me. Thanks.
Thanks for the reply, I will take a look at it in the coming days.
Hey, I found a solution at https://github.com/brjathu/deepcaps/commit/e273cfd686b9960fc3a9ef7772e4a9db95316593.
It really works.
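For reference: for the 4-D tensors used in this repo, the old batch_dot semantics reduce to a broadcasted matrix multiply, so a helper along these lines should behave similarly (a minimal sketch, not the commit's actual code, covering only the axes=(3, 2) case; the linked commit handles more):

import tensorflow as tf

def own_batch_dot(x, y, axes=(3, 2)):
    # sketch: for 4-D inputs that share their leading batch dimensions,
    # contracting the last axis of x with the second-to-last axis of y
    # is exactly a broadcasted matmul, which is what the old K.batch_dot
    # returned for these shapes
    assert tuple(axes) == (3, 2), 'only the 4-D axes=(3, 2) case is sketched'
    return tf.matmul(x, y)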
But that's a separate project; is there no way of merging it, or of creating a pull request?
@gcfengxu Has this own_batch_dot been tested by training+testing on the MNIST dataset?
@Anselmoo Sorry, I don't quite understand what you mean. I just copied batchdot.py, added it to my project, imported the file, and then replaced all 'K.map_fn()' calls with 'own_batch_dot()'. I haven't analyzed how the function works.
@data-hound Yes, I have tested it by training on MNIST. Although the test is not completely finished, it is indeed training.
@gcfengxu Maybe you could commit your version and open a pull request? I think that would help a lot. Thanks!
@Anselmoo @data-hound Sorry, I mistook the function name. In my project, I replaced K.batch_dot() with own_batch_dot(). Sorry for the wrong word.
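Concretely, the change amounts to swapping the call sites in capsulelayers.py, e.g. (illustrative; the second line assumes the helper from batchdot.py has been imported):

# before: fails with newer Keras (see the traceback at the bottom)
b += K.batch_dot(outputs, inputs_hat, [2, 3])
# after: same call, routed through the helper copied from batchdot.py
b += own_batch_dot(outputs, inputs_hat, [2, 3])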
@gcfengxu Thank you very much! I got it, and I would recommend committing own_batch_dot() and the other modified files and opening a pull request. I think it would be great if we could work together and add further tests such as MNIST. I think that is also the idea of open source, and it would allow us to modify the capsule net further. The benchmark settings of this implementation look the best, and it would be sad if we could not use it anymore.
@Anselmoo Well, I am new to GitHub, but I'll try your advice. Thanks for your suggestions!
@gcfengxu No worries, everybody has to start somewhere 🛫 that's why we are here.
@Anselmoo @gcfengxu I think the own_batch_dot method, which follows how batch_dot was defined in earlier versions, is actually wrong, according to this issue: https://github.com/keras-team/keras/issues/13300
So, as a solution, in my implementation I used K.batch_dot and then tried to reshape the result further to conform to the expected matrix shapes.
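For reference (not @data-hound's actual code): the contraction that the old batch_dot performed for the shapes in the example below can also be written directly as an einsum, which avoids the extra cross-product axis entirely; a minimal sketch:

import tensorflow as tf

x = tf.ones((1152, 10, 1, 8))
y = tf.ones((1152, 10, 8, 16))
# contract the last axis of x with the second-to-last axis of y while
# keeping (1152, 10) as batch dimensions, i.e. the old axes=(3, 2) result
xy = tf.einsum('bnij,bnjk->bnik', x, y)
print(xy.shape)  # (1152, 10, 1, 16)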
@gcfengxu Are you running the model with or without eager execution? I have run into some problems with eager execution and with graph tensors being passed outside the graph; this occurs just before the end of the 1st training epoch. Let me know if using this method gets you past these errors.
@data-hound My TensorFlow version is 2.0, and I never ran into problems with eager execution. But K.batch_dot is still wrong in version 2.3.1 of Keras.
Cool. So your training and testing tasks have completed successfully using this own_batch_dot method, then, @gcfengxu?
Also, as mentioned in the Keras issue I linked above, the current behaviour of K.batch_dot will remain the same. For example,
from tensorflow.keras import backend as K  # same behaviour with standalone keras >= 2.3

x_batch = K.ones(shape=(1152, 10, 1, 8))
y_batch = K.ones(shape=(1152, 10, 8, 16))
xy_batch_dot = K.batch_dot(x_batch, y_batch, axes=(3, 2))
K.int_shape(xy_batch_dot)
The above code yields the shape (1152, 10, 1, 10, 16) and not (1152, 10, 1, 16), as was returned in previous versions. I have not checked the mathematical accuracy of this operation for these shapes, but this is what fchollet has confirmed to be accurate.
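For these shapes, tf.matmul already broadcasts over the leading batch dimensions and reproduces the old result shape; a quick illustration (not from the thread above):

import tensorflow as tf

x_batch = tf.ones((1152, 10, 1, 8))
y_batch = tf.ones((1152, 10, 8, 16))
# (1152, 10) are treated as batch dims; (1, 8) @ (8, 16) -> (1, 16)
xy = tf.matmul(x_batch, y_batch)
print(xy.shape)  # (1152, 10, 1, 16)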
I'm using own_batch_dot and it runs, but I'm getting negative values for the loss function after 3 or 4 epochs! @data-hound Can you post your full code?
@mrtucar It is a known issue with the newer versions of TensorFlow and Keras.
@uchar I haven't gotten past the 1st epoch, even with the own_batch_dot method. I am running on Kaggle. I will try a fresh setup in a few days and let you know.
@uchar @Anselmoo @mrtucar @data-hound @gcfengxu
The problem is caused by the behavior change of keras.backend.batch_dot.
In keras==2.0.7: a.shape -> (2, 3, 4, 5), b.shape -> (2, 3, 5, 6), batch_dot(a, b, (3, 2)).shape -> (2, 3, 4, 6).
But in newer versions: batch_dot(a, b, (3, 2)).shape -> (2, 3, 4, 3, 6).
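A quick check of this behaviour change (shapes copied from above; runs under TF 2.x / keras >= 2.3):

from tensorflow.keras import backend as K

a = K.ones(shape=(2, 3, 4, 5))
b = K.ones(shape=(2, 3, 5, 6))
# newer Keras keeps only axis 0 as the batch axis, so the remaining
# axes of b are appended instead of matched pairwise
print(K.int_shape(K.batch_dot(a, b, (3, 2))))  # (2, 3, 4, 3, 6)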
I propose replacing K.batch_dot with tf.matmul.
For details please refer to https://github.com/XifengGuo/CapsNet-Keras/blob/9d7e641e3f30f0e8227bb6ad521a61e908c2408a/capsulelayers.py#L120-L163
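A sketch of one routing iteration with tf.matmul, using dummy tensors with the MNIST shapes (batch of 2 for illustration); see the linked file for the actual implementation:

import tensorflow as tf

def squash(v, axis=-1):
    # squash non-linearity from the CapsNet paper; shape-preserving
    s = tf.reduce_sum(tf.square(v), axis=axis, keepdims=True)
    return s / (1.0 + s) * v / tf.sqrt(s + tf.keras.backend.epsilon())

inputs_hat = tf.ones((2, 10, 1152, 16))  # predicted capsule vectors
b = tf.zeros((2, 10, 1, 1152))           # routing logits
c = tf.nn.softmax(b, axis=1)             # coupling coefficients
outputs = squash(tf.matmul(c, inputs_hat))             # (2, 10, 1, 16)
b += tf.matmul(outputs, inputs_hat, transpose_b=True)  # (2, 10, 1, 1152)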
Traceback (most recent call last):
  File "capsulenet.py", line 310, in <module>
    routings=args.routings,
  File "capsulenet.py", line 60, in CapsNet
    )(primarycaps)
  File "/home/Anselmoo/.local/lib/python3.6/site-packages/keras/engine/base_layer.py", line 489, in __call__
    output = self.call(inputs, **kwargs)
  File "/home/Anselmoo/GitHub-Projects/CapsNet-Keras/capsule/capsulelayers.py", line 160, in call
    b += K.batch_dot(outputs, inputs_hat, [2, 3])
  File "/home/Anselmoo/.local/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py", line 1499, in batch_dot
    'y.shape[%d] (%d != %d).' % (axes[0], axes[1], d1, d2))
ValueError: Can not do batch_dot on inputs with shapes (None, 10, 10, 1152, 16) and (None, 10, None, 1152, 16) with axes=[2, 3]. x.shape[2] != y.shape[3] (10 != 1152).
I was running capsulelayers.py with the default settings.