Open · geefer opened this issue 6 years ago
Hi, I think the softmax in the routing algorithm is being calculated over the wrong dimension.
Currently the code has:
b_ij = Variable(torch.zeros(1, self.num_routes, self.num_capsules, 1))
...
for iteration in range(num_iterations):
    c_ij = F.softmax(b_ij)

Since the dim parameter is not passed to the F.softmax call, it will choose dim=1 and compute the softmax over the self.num_routes dimension (the input capsules, 1152 here), whereas the softmax should be computed so that the c_ij between each input capsule and all the capsules in the next layer sum to 1. Thus the correct call should be:
c_ij = F.softmax(b_ij, dim=2)
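For anyone who wants to verify this, here is a minimal, runnable sketch assuming the shapes from the snippet above (num_routes=1152, num_capsules=10; plain tensors are used instead of the deprecated Variable wrapper). With b_ij all zeros, summing the coefficients over the output-capsule dimension shows that only dim=2 gives the normalization the paper describes:

import torch
import torch.nn.functional as F

# Same shape as b_ij above: (1, num_routes, num_capsules, 1)
num_routes, num_capsules = 1152, 10
b_ij = torch.zeros(1, num_routes, num_capsules, 1)

# Current behavior: softmax over dim=1 normalizes across the 1152
# input capsules, so for a given input capsule the coefficients over
# the output capsules do NOT sum to 1.
c_wrong = F.softmax(b_ij, dim=1)
print(c_wrong.sum(dim=2)[0, 0, 0].item())  # ~0.0087 (= 10/1152), not 1

# Proposed fix: softmax over dim=2 normalizes across the output
# capsules, so each input capsule's coefficients sum to 1.
c_right = F.softmax(b_ij, dim=2)
print(c_right.sum(dim=2)[0, 0, 0].item())  # 1.0, as the routing requires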
I agree with you: PyTorch's default dim for Softmax is 1, and to follow the original paper the dim should be 2.