DeLightCMU / RSC

This is the official implementation of Self-Challenging Improves Cross-Domain Generalization, ECCV2020
BSD 2-Clause "Simplified" License

Choice of datapoints to which RSC is applied #18

Closed AhmedFrikha closed 3 years ago

AhmedFrikha commented 3 years ago

In the paper it is mentioned that RSC is applied to a random subset of the current batch. But lines 126 to 146 in resnet.py seem to do something more sophisticated.

a) Can you explain what is done in that part of the code, especially the meaning of the variables used in lines 142 to 146?
b) Why is the mask a variable that requires grad? (line 149 in resnet.py)

AhmedFrikha commented 3 years ago

I got an answer to my question (a) from issue #10. But I still don't understand why you turn the mask into a trainable variable.

Justinhzy commented 3 years ago

We mention in the paper that applying RSC to the top percentage of batch samples, ranked by cross-entropy loss, is slightly better than random selection. It doesn't matter whether you turn the mask into a trainable variable, because it is only an extra input.
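For readers following along, the batch-wise selection described above can be sketched roughly as follows. This is a minimal illustration, not the repo's actual code; the function name `batch_selection_mask` and the `top_percent` parameter are hypothetical, and the real implementation in resnet.py combines this with feature-level gradient muting.

```python
import torch
import torch.nn.functional as F

def batch_selection_mask(logits, labels, top_percent=1/3):
    # Hypothetical sketch: mark the samples in the batch with the
    # highest per-sample cross-entropy loss; RSC would then be
    # applied only to those samples.
    losses = F.cross_entropy(logits, labels, reduction='none')  # shape (B,)
    num_selected = max(1, int(top_percent * logits.size(0)))
    # indices of the highest-loss samples
    _, top_idx = losses.topk(num_selected)
    mask = torch.zeros(logits.size(0), dtype=torch.bool)
    mask[top_idx] = True
    return mask  # True -> apply RSC to this sample

# usage: a batch of 6 samples, select the top third by loss
logits = torch.randn(6, 4)
labels = torch.randint(0, 4, (6,))
mask = batch_selection_mask(logits, labels)
```

Samples where `mask` is `False` would take the ordinary forward pass, while the selected samples get their most predictive features muted.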