jizongFox closed this issue 4 years ago
The only thing I can think of is that the repo owner wants to compute the LDS before the running statistics of the batch norm layers get changed by the forward pass.
But in VATLoss the author uses `_disable_tracking_bn_stats`, so the BN statistics shouldn't be a problem. I am still curious why the comment says " # LDS should be calculated before the forward for cross entropy".
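To make the point about BN statistics concrete, here is a minimal sketch of how such a context manager can work. This is not the repo's code: `disable_bn_stats_tracking` is a hypothetical name, and the sketch assumes a recent PyTorch version in which a batch norm layer in training mode with `track_running_stats = False` normalizes with batch statistics and leaves its running buffers untouched.

```python
import contextlib

import torch
import torch.nn as nn


@contextlib.contextmanager
def disable_bn_stats_tracking(model):
    """Hypothetical helper: temporarily stop BN layers from updating running stats."""
    bn_layers = [
        m for m in model.modules()
        if isinstance(m, nn.modules.batchnorm._BatchNorm)
    ]
    previous = [m.track_running_stats for m in bn_layers]
    for m in bn_layers:
        m.track_running_stats = False
    try:
        yield
    finally:
        # Restore the original flags even if the forward pass raises.
        for m, flag in zip(bn_layers, previous):
            m.track_running_stats = flag


torch.manual_seed(0)
model = nn.Sequential(nn.Linear(4, 3), nn.BatchNorm1d(3))
model.train()
bn = model[1]
x = torch.randn(8, 4) + 5.0  # shifted input so the batch mean is clearly non-zero

before = bn.running_mean.clone()

# Forward pass inside the context manager: running stats should stay as they were.
with disable_bn_stats_tracking(model):
    model(x)
unchanged = torch.equal(before, bn.running_mean)

# Ordinary forward pass: running stats are updated as usual.
model(x)
changed = not torch.equal(before, bn.running_mean)
```

If the forwards inside the VAT loss are wrapped this way, the running statistics end up the same whether the LDS is computed before or after the cross-entropy forward, which is why the ordering comment is puzzling.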
Hi, I am wondering why you commented " # LDS should be calculated before the forward for cross entropy". What would be the specific reason for that?