Open cantabile-kwok opened 2 years ago
I admit I was a little careless about the weighting. In the equation, the loss is divided by the batch size B; in the code, the effective divisor is min(B, C). Since L_bnm is combined with a hyperparameter \lambda, this only changes the actual value of \lambda. In other experiments, I have found the divisor may be better set to \sqrt{B * C}, and performance can improve with an appropriately tuned \lambda.
Hi, I am studying your approach with your implementation. My question is that in your paper you use Equation (12) to compute the BNM loss, where the divisor is the batch size. But at line 164 of `BNM/DA/BNM/train_image.py`, this is done with `torch.mean()`. If the class number C is smaller than the batch size B, the SVD operation will produce `s_tgt` with length C instead of B, so `torch.mean()` divides by C. Wouldn't that be incorrect according to the original equation? Why not explicitly divide by the batch size?
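To illustrate the discrepancy, here is a minimal NumPy sketch (the repository uses `torch.svd`; NumPy is used here only to show the shapes, and B = 36, C = 31 are assumed values, not taken from the code):

```python
import numpy as np

# Hypothetical shapes: batch size B larger than class count C.
B, C = 36, 31
rng = np.random.default_rng(0)

# Row-wise softmax to mimic a prediction matrix G of shape (B, C).
logits = rng.standard_normal((B, C))
G = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)

# SVD yields min(B, C) singular values -- here 31, not B = 36.
s = np.linalg.svd(G, compute_uv=False)

loss_paper = -s.sum() / B               # Eq. (12): divide by batch size B
loss_mean = -s.mean()                   # torch.mean(): divides by min(B, C)
loss_sqrt = -s.sum() / np.sqrt(B * C)   # sqrt(B*C) variant from the reply

# All three equal -||G||_* up to a constant factor, so swapping the
# divisor only rescales the effective lambda in lambda * L_bnm.
```

When B <= C the SVD returns B singular values and `torch.mean()` matches the paper's divisor; the mismatch only appears when C < B, as in the question.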