[Open] walkingwindy opened this issue 3 years ago
```python
def __get_relative_prob(self, all_close_nei, back_nei_probs):
    relative_probs = tf.reduce_sum(
        tf.where(
            all_close_nei,
            x=back_nei_probs,
            y=tf.zeros_like(back_nei_probs),
        ),
        axis=1)
    relative_probs /= tf.reduce_sum(back_nei_probs, axis=1)
    return relative_probs
```
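For reference, here is a runnable NumPy sketch of the same masking-and-normalizing computation (the shapes and values are hypothetical, assuming `back_nei_probs` is `[batch, K]` and `all_close_nei` is a boolean mask of the same shape):

```python
import numpy as np

# Hypothetical shapes: batch of 2 examples, K = 3 background neighbors each.
all_close_nei = np.array([[True, False, True],
                          [False, True, False]])
back_nei_probs = np.array([[0.2, 0.3, 0.5],
                           [0.1, 0.6, 0.3]])

# Sum probability mass over the close neighbors only (mask out the rest).
masked = np.where(all_close_nei, back_nei_probs, np.zeros_like(back_nei_probs))
relative_probs = masked.sum(axis=1)                      # shape: (2,)

# Normalize by the total probability mass over all background neighbors.
relative_probs = relative_probs / back_nei_probs.sum(axis=1)  # shape: (2,)

print(relative_probs)  # -> [0.7 0.6]
```

Note that both reductions use the default `keepdims=False`, so the numerator and denominator are both 1-D vectors of length `batch` and the division is element-wise.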
Ah, I found a TensorFlow version of the same loss function, where keepdims is left at its default (None) in both the numerator and the denominator. So, is there a mistake in the PyTorch version?
Thanks for noticing this. However, I believe this issue will not influence the results: the PyTorch version automatically broadcasts the dimensions, so the difference should not matter. As we have said, the PyTorch version has been verified, and the trained models should be very similar to those from the TensorFlow version.
Thanks for replying. In your implementation, the numerator has shape [batch_size] and the denominator has shape [batch_size, 1], so broadcasting produces a [batch_size, batch_size] result instead of [batch_size, 1]. In my opinion, the loss then effectively becomes $\frac{1}{N}\sum_{j=1}^{N}\left(-\log\frac{P(C_i \cap B_i \mid v_i)}{P(B_j \mid v_i)}\right)$, where N is the batch size; note the mismatched indices i and j, which is different from Equation 3 in Section 3.2 of the paper.
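The broadcasting behavior described above can be demonstrated in a few lines of NumPy (which follows the same broadcasting rules as PyTorch; N is a hypothetical batch size):

```python
import numpy as np

N = 4                              # hypothetical batch size
numerator = np.ones(N)             # shape (N,)   -- from keepdim=False
denominator = np.ones((N, 1))      # shape (N, 1) -- from keepdim=True

ratio = numerator / denominator
print(ratio.shape)  # -> (4, 4): every numerator i is divided by every denominator j
```

Any subsequent mean or sum over this [N, N] tensor therefore averages over all (i, j) pairs rather than only the matched diagonal entries.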
Hello, I'm curious about the function __get_relative_prob in the class LocalAggregationLossModule. Specifically, why do you set keepdim=True in the torch.sum applied to back_nei_probs, but leave keepdim=False (the default) in the torch.sum applied to relative_probs? I think these settings make the dimensions of the numerator and denominator differ when computing Equation 3 in Section 3.2 of the paper: the numerator has shape [batch_size] while the denominator has shape [batch_size, 1]. Is my understanding correct? And why did you set keepdim=True, or what happens if we set keepdim=False for both the numerator and the denominator?
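For concreteness, a minimal sketch of the fix being asked about (this is not the repository's actual code; the function name, mask, and shapes are assumed). Using the default keepdims=False in both reductions keeps the numerator and denominator at shape [batch_size], so the division stays element-wise. NumPy is used here as a stand-in since its keepdims mirrors torch.sum's keepdim:

```python
import numpy as np

def get_relative_prob(all_close_nei, back_nei_probs):
    # keepdims=False (the default) in BOTH sums, so numerator and
    # denominator are both 1-D vectors of length batch_size.
    numerator = np.where(all_close_nei, back_nei_probs, 0.0).sum(axis=1)
    denominator = back_nei_probs.sum(axis=1)
    return numerator / denominator  # shape: [batch_size], no accidental broadcast

# Hypothetical example: batch of 2, K = 2 background neighbors.
mask = np.array([[True, False], [True, True]])
probs = np.array([[0.4, 0.6], [0.25, 0.75]])
print(get_relative_prob(mask, probs))  # -> [0.4 1.0]
```

With keepdim=True in only one of the two sums, the same division would instead broadcast to [batch_size, batch_size], which is the mismatch discussed in this thread.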