ChanLIM closed this issue 3 years ago.
Hi, thanks for your interest in DivideMix!
You are correct that the noise ratio does not equal the "true" noise ratio. We specified the difference between these two noise injection methods in Section 4.1 and reported results with the "true" noise ratio in Table 6.
Got it. Thanks
In the case of symmetric noise, it seems to me that some labels that were intended to be corrupted aren't actually corrupted.
Let's take 50% symmetric noise on CIFAR10 (10 classes) as an example. The code intends to apply noise to 25000 of the 50000 instances, but since the new labels are drawn at random from all 10 classes, about 10% of those 25000 samples are mapped back to their original labels, leaving only about 22500 noisily labeled samples. Because of this, 50% symmetric noise on CIFAR10 actually ends up as a noise rate of about 45% (49.5% on CIFAR100).
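The arithmetic above can be checked with a short sketch. It assumes the replacement label is drawn uniformly from all classes, including the original one; the function name `effective_noise_rate` is made up for illustration and is not part of the DivideMix code.

```python
def effective_noise_rate(nominal_rate: float, num_classes: int) -> float:
    """Expected fraction of labels that really change.

    Assumption: each selected sample gets a label drawn uniformly from
    all classes, so it keeps its original label with probability
    1 / num_classes.
    """
    return nominal_rate * (1 - 1 / num_classes)


if __name__ == "__main__":
    for name, num_classes in [("CIFAR10", 10), ("CIFAR100", 100)]:
        rate = effective_noise_rate(0.5, num_classes)
        # Report the expected count on a 50000-sample training set.
        print(f"{name}: {rate:.3f} ({round(rate * 50000)} of 50000 samples)")
```

These are expected values; the exact count in any one run depends on the random draw.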
Adding the following lines to dataloader_cifar.py gives an exact 50% noise rate. https://github.com/LiJunnan1992/DivideMix/blob/d9d3058fa69a952463b896f84730378cdee6ec39/dataloader_cifar.py#L68
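The lines I added did not come through in this post, so the snippet below is not that patch. It is only a minimal, standalone sketch of the idea, assuming the fix is to draw each replacement label from the classes other than the original one. The function `inject_symmetric_noise` and its parameters are hypothetical and do not appear in dataloader_cifar.py.

```python
import random


def inject_symmetric_noise(labels, noise_rate, num_classes, seed=0):
    """Return a copy of `labels` with exactly int(noise_rate * n) entries changed.

    Hypothetical helper for illustration: the true class is excluded from
    the candidates, so a selected sample can never keep its original label.
    """
    rng = random.Random(seed)
    n = len(labels)
    num_noisy = int(noise_rate * n)
    # Choose which samples to corrupt, without replacement.
    noisy_idx = set(rng.sample(range(n), num_noisy))

    noisy_labels = []
    for i, y in enumerate(labels):
        if i in noisy_idx:
            # Draw uniformly from every class except the original one.
            candidates = [c for c in range(num_classes) if c != y]
            noisy_labels.append(rng.choice(candidates))
        else:
            noisy_labels.append(y)
    return noisy_labels


if __name__ == "__main__":
    clean = [i % 10 for i in range(50000)]  # stand-in for CIFAR10 labels
    noisy = inject_symmetric_noise(clean, noise_rate=0.5, num_classes=10)
    changed = sum(a != b for a, b in zip(clean, noisy))
    print(changed / len(clean))
```

With this variant the nominal and actual noise rates match, which is the "true" noise ratio setting mentioned in the reply.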