I may be misunderstanding the symbols, but I am still confused. For the proxy loss, the training complexity is O((M/B) * C * B) = O(MC); for the triplet loss it should be O((M/B)^3 * B) = O(M^3 / B^2) rather than O(M^3). Any clarification on that? (A small numeric sketch of this counting is below.)
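To make that counting concrete, here is a minimal sketch in plain Python; the values of M, C, and B are hypothetical placeholders (not from the paper), and the triplet count follows the reading in this comment, not necessarily the paper's intended one.

```python
# Hypothetical values, only to illustrate the counting in this comment:
# M = number of training samples, C = number of classes/proxies, B = batch size.
M, C, B = 60_000, 1_000, 32

num_batches = M // B  # batches per epoch

# Proxy loss: each of the B samples in a batch is compared against all C proxies,
# so one epoch costs (M/B) * B * C = M * C comparisons -> O(MC).
proxy_ops = num_batches * B * C

# Triplet loss, as read in this comment: the Proxy-NCA paper's O((M/B)^3) counts
# the steps needed to see all triplets, and each step processes B samples, giving
# (M/B)^3 * B = M^3 / B^2 -> O(M^3 / B^2), not O(M^3).
triplet_ops = num_batches ** 3 * B

print(f"proxy:   {proxy_ops:,}")    # 60,000,000   (= M * C)
print(f"triplet: {triplet_ops:,}")  # ~2.1e11      (= M^3 / B^2)
```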
I think the complexity analysis of the other loss functions in the paper is fuzzy...
The training complexity is O(MC), where M is the batch size and C is the number of classes. For most datasets, e.g. ImageNet, C is much larger than M, which means MC > M^2 or even MC > M^3. How do you explain this?
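As a concrete (purely hypothetical) instance of that point: with an ImageNet-scale C = 1000 and a small batch size, MC does exceed both M^2 and M^3.

```python
M, C = 16, 1_000       # hypothetical: batch size 16, 1000 classes
print(M * C)           # 16000
print(M ** 2, M ** 3)  # 256 4096 -- both smaller than M * C here
```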
Look at the original Proxy-NCA paper: it stresses that it takes O((N/b)^3) steps to consider all samples, which is not the same as saying O(b^3) is a large number; they are totally different things. I think it is meaningless to argue that O(MC) is better than O(M^3).