naturomics / CapsNet-Tensorflow

A TensorFlow implementation of CapsNet (Capsules Net) from the paper Dynamic Routing Between Capsules
Apache License 2.0

I have a question about the loss function. Something is wrong, I think...? #71

Open hhjung1202 opened 6 years ago

hhjung1202 commented 6 years ago

There seems to be a problem with V_k when calculating the loss function.

In capsNet.py, in the loss function (around line 106?):

```python
max_l = tf.square(tf.maximum(0., cfg.m_plus - self.v_length))
max_r = tf.square(tf.maximum(0., self.v_length - cfg.m_minus))
```

You use self.v_length when calculating max_l and max_r:

```python
self.v_length = tf.sqrt(reduce_sum(tf.square(self.caps2), axis=2, keepdims=True) + epsilon)
```

self.caps2 is V_J, and its shape is [batch_size, 10, 16, 1]. Since V_J = squash(S_J), every element of V_J is in (-1, 1).

If self.v_length is computed this way, its shape is [batch_size, 10, 1, 1] and its values fall in the interval (-4, 4).

So I think it has to be changed like this:

```python
max_l = tf.square(tf.maximum(0., cfg.m_plus - (self.v_length)/4))
max_r = tf.square(tf.maximum(0., (self.v_length)/4 - cfg.m_minus))
```

If this is wrong, can you tell me what's wrong?
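For context, here is a minimal TensorFlow sketch of the full margin loss from the paper (Eq. 4) that these two terms feed into. The helper name `margin_loss` and the reshape are illustrative, not the repo's exact code; m_plus = 0.9, m_minus = 0.1 and lambda = 0.5 are the values used in the paper:

```python
# Sketch of the paper's margin loss:
# L_k = T_k * max(0, m+ - ||v_k||)^2 + lambda * (1 - T_k) * max(0, ||v_k|| - m-)^2
import tensorflow as tf

def margin_loss(v_length, T, m_plus=0.9, m_minus=0.1, lambda_val=0.5):
    # v_length: capsule lengths ||v_k||, shape [batch_size, 10, 1, 1]
    # T: one-hot labels, shape [batch_size, 10]
    v_length = tf.reshape(v_length, (-1, 10))
    max_l = tf.square(tf.maximum(0., m_plus - v_length))   # penalty when the present class is too short
    max_r = tf.square(tf.maximum(0., v_length - m_minus))  # penalty when an absent class is too long
    L = T * max_l + lambda_val * (1. - T) * max_r
    return tf.reduce_mean(tf.reduce_sum(L, axis=1))
```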

hhjung1202 commented 6 years ago

I'm sorry, I know that V_J is like a unit vector in 16-D.

naturomics commented 6 years ago

You do not understand squashing correctly. Squashing ensures that the length (Euclidean norm) of its output vector is in [0, 1]; it says nothing about the individual elements of the vector. Have fun with the following code:

```python
import numpy as np
from matplotlib import pyplot as plt

def squash(vector, axis=None):
    norm = np.linalg.norm(vector, axis=axis, keepdims=True)
    norm_squared = np.square(norm)
    scalar_factor = norm_squared / (1 + norm_squared)
    return scalar_factor * (vector / norm)

# Create 10000 samples, each with 3 elements
num_samples = 10000
c1 = np.random.uniform(size=(num_samples, 1))
c2 = np.random.normal(size=(num_samples, 1))
c3 = np.random.logistic(size=(num_samples, 1))
vectors = np.hstack((c1, c2, c3))  # [num_samples, 3]
squashed_vector = squash(vectors, axis=1)

# Every squashed vector has length in [0, 1)
length = np.sqrt(np.sum(np.square(squashed_vector), axis=1))
plt.plot(np.sort(length))
plt.show()
```
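As a quick worked check (not part of the original comment): squashing a vector whose elements lie well outside (-1, 1) still yields a length below 1, reusing the `squash` defined above:

```python
# A vector with norm 5 gets scaled by 25/26, so its squashed length is
# 25/26 ~= 0.96 < 1, even though the element 4.0 is outside (-1, 1).
v = np.array([[3.0, 4.0]])                  # ||v|| = 5
print(np.linalg.norm(squash(v, axis=1)))    # ~0.9615
```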

parinaya-007 commented 5 years ago

Look "reduce_mean" function in utils.py carefully and also after this function is applied using axis=2, we are taking the mean of the sum of squared sum of vector so the range is not (-4, 4) it is (0, 1).