frank-xwang / RIDE-LongTailRecognition

[ICLR 2021 Spotlight] Code release for "Long-tailed Recognition by Routing Diverse Distribution-Aware Experts."
MIT License

The mismatch between your LDAM implementation and the original one #4

Closed lipingcoding closed 3 years ago

lipingcoding commented 3 years ago

I found that your LDAM implementation differs slightly from the original implementation at https://github.com/kaidic/LDAM-DRW. There is also an issue about this in the original repository: https://github.com/kaidic/LDAM-DRW/issues/13. Which of the two is better, or correct?

TonyLianLong commented 3 years ago

Thanks for the question.

I noticed that LDAM multiplies the value by s, and we apply the same multiplication here: https://github.com/frank-xwang/RIDE-LongTailRecognition/blob/main/model/ldam_drw_resnets/ride_resnet_cifar.py#L175.

Where the multiplication happens is only an implementation detail. It leads to the same computation, so our LDAM loss computes the same thing as the LDAM codebase you linked.
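To make the equivalence concrete, here is a minimal sketch in plain Python, not the code from either repository. It assumes one variant subtracts the margin and then scales all logits by s inside the loss, while the other receives logits already scaled by s from the model and subtracts s times the margin. The function names, the margins, and the example values are hypothetical.

```python
import math


def cross_entropy(logits, target):
    """Cross-entropy of one sample, computed from raw logits via log-sum-exp."""
    peak = max(logits)
    log_norm = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_norm - logits[target]


def ldam_scale_in_loss(outputs, target, margins, s):
    """Assumed variant A: subtract the margin from the target logit, then scale by s."""
    adjusted = [z - margins[j] if j == target else z for j, z in enumerate(outputs)]
    return cross_entropy([s * z for z in adjusted], target)


def ldam_scale_in_model(outputs, target, margins, s):
    """Assumed variant B: the model emits s * outputs; the loss subtracts s * margin."""
    scaled = [s * z for z in outputs]  # scaling done on the model side
    adjusted = [z - s * margins[j] if j == target else z for j, z in enumerate(scaled)]
    return cross_entropy(adjusted, target)


# Hypothetical values: three classes, per-class margins, scale s = 30.
outputs = [0.8, -0.1, 0.3]
margins = [0.1, 0.3, 0.5]
s = 30.0

loss_a = ldam_scale_in_loss(outputs, 2, margins, s)
loss_b = ldam_scale_in_model(outputs, 2, margins, s)
print(abs(loss_a - loss_b) < 1e-9)
```

The two variants agree up to floating-point rounding because s * (z - m) = s * z - s * m for the target logit, and every other logit is simply s * z in both.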

I believe LDAM multiplies the value by s to offset the effect that normalization has on it. However, this is not our focus: we use LDAM only as a base loss, so making sure our implementation matches the original is enough for us. I believe the LDAM authors can give you a good answer on the design itself.
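One plausible reading of the normalization point, sketched below in plain Python: if normalization bounds every logit to [-1, 1] (as with cosine similarities), softmax cannot become confident without a scale factor. This is an illustration under that assumption, not a statement of the LDAM authors' reasoning; the class count and s = 30 are hypothetical example values.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


num_classes = 10  # hypothetical class count
s = 30.0          # hypothetical scale factor

# Best case for bounded logits: the target sits at +1, all other classes at -1.
best_case = [1.0] + [-1.0] * (num_classes - 1)

p_unscaled = softmax(best_case)[0]
p_scaled = softmax([s * z for z in best_case])[0]

print(p_unscaled < 0.5)   # bounded logits cap the target probability
print(p_scaled > 0.999)   # scaling by s removes that cap
```

Without scaling, the target probability is at most e^2 / (e^2 + 9) for ten classes, so the cross-entropy can never approach zero; multiplying by s lifts that ceiling.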