neuroailab / LocalAggregation-Pytorch


The function in Class: LocalAggregationLossModule #1

Open sudalvxin opened 4 years ago

sudalvxin commented 4 years ago

I cannot understand the formula in the function "_softmax". What does the value 2876934.2 represent?

chengxuz commented 4 years ago

Hi Zhanxuan,

Thanks for your question. We should have made this clearer in the code. In the "_softmax" function, Z serves as the normalization factor of the non-parametric softmax formulation. Under that framework, its value should be Z = \sum_{i=1}^{N} \exp(v_i^T v / \tau), where the sum runs over all N data points. The framework was first introduced in the Instance Discrimination paper (the IR method in our paper), so we followed that paper's source code for the implementation. There, "Z" is computed only once, at the beginning, using the initial weights, and is then kept fixed. The magic number "2876934.2" is the Z computed from those initial weights. Because this number is proportional to the number of data points, the function also rescales it according to "data_len".
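As a rough illustration of the description above, here is a minimal sketch in plain Python rather than the repository's actual PyTorch code. The function name `fixed_z_softmax`, the constant names, and the default temperature `tau=0.07` are assumptions made for this example. The two numbers are the ones quoted in this thread, and the exact form of the rescaling is inferred from the statement that Z is proportional to the number of data points.

```python
import math

# Constants quoted in this thread; their exact use here is an assumption.
IMAGENET_Z = 2876934.2    # Z computed once from the initial weights
IMAGENET_LEN = 1281167    # dataset size that Z was computed for

def fixed_z_softmax(dot_prods, data_len, tau=0.07):
    """Return exp(v_i^T v / tau) / Z with a fixed, dataset-scaled Z.

    dot_prods: iterable of dot products v_i^T v
    data_len:  number of data points in the current dataset
    tau:       temperature (assumed value, not taken from the thread)
    """
    # Rescale the fixed constant linearly with the dataset size.
    z = IMAGENET_Z / IMAGENET_LEN * data_len
    return [math.exp(d / tau) / z for d in dot_prods]
```

Because Z is a fixed constant and not the true sum, the returned values are not guaranteed to sum to 1, and on a small dataset they can be much larger than 1.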

That said, the value of "Z" does not actually influence our loss. Because our loss is a conditional probability, i.e. a ratio of two probabilities that each carry the same 1/Z factor, "Z" cancels out.
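The cancellation can be checked numerically. The sketch below is only an illustration under stated assumptions: `conditional_prob`, its index-set arguments, and `tau=0.07` are hypothetical names and values chosen for this example, not the loss as implemented in the repository. It builds a conditional probability as a ratio of two sums of per-item probabilities and evaluates it with two very different values of Z.

```python
import math

def conditional_prob(dot_prods, numerator_idx, denominator_idx, z, tau=0.07):
    """P(numerator set | denominator set) built from exp(d / tau) / z terms."""
    probs = [math.exp(d / tau) / z for d in dot_prods]
    denom_set = set(denominator_idx)
    # Only items that also belong to the conditioning set count in the numerator.
    num = sum(probs[i] for i in numerator_idx if i in denom_set)
    den = sum(probs[i] for i in denom_set)
    return num / den  # the 1/z factor appears in both sums and cancels

dots = [0.9, 0.2, -0.1, 0.5]
p_small_z = conditional_prob(dots, [0, 3], [0, 1, 2, 3], z=1.0)
p_large_z = conditional_prob(dots, [0, 3], [0, 1, 2, 3], z=2876934.2)
# Up to floating-point rounding, both values are the same.
```

Changing Z rescales the individual probabilities but leaves the ratio, and hence its logarithm, unchanged.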

Please let me know if you still have questions! Sorry for the confusion!

sudalvxin commented 4 years ago

Thanks for your reply. I will check the source code of IR.

sudalvxin commented 4 years ago

Yes! I see that 'Z' does cancel out in Eq. (3).

WonderSeven commented 4 years ago

Hi, I have a similar question. When I use the _softmax() function, the output prob is always too large to optimize (about 1600). Since the value 2876934.2 does not make sense here, can I manually set this parameter and 1281167 to make the output much smaller?

chengxuz commented 4 years ago

Hi SuiAn,

That is possible, although this value does not make a difference in our algorithm. You can also check Instance Discrimination's original implementation.

WonderSeven commented 4 years ago

Thanks for your reply. The original implementation of Instance Discrimination is here. I will check it.