LiJunnan1992 / DivideMix

Code for paper: DivideMix: Learning with Noisy Labels as Semi-supervised Learning
MIT License
543 stars 84 forks

Some discussions about DivideMix implementation #44

Closed LiuCMU closed 2 years ago

LiuCMU commented 2 years ago

Hi, this is excellent work! I have read the paper and source code a few times in the past two weeks. They are inspiring, thanks for sharing them! I have two questions about your implementation, would you take a look when possible?

The first question is about co-guessing and label refinement in the train function. Would it be safer to call net.eval() and net2.eval() in this block, and then switch back with net.train() before calculating the logits in line 101? Both net and net2 are used in this block only to prepare pseudo-labels, which is purely evaluation. https://github.com/LiJunnan1992/DivideMix/blob/d9d3058fa69a952463b896f84730378cdee6ec39/Train_cifar.py#L62-L67
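To make the question concrete, here is a minimal sketch (not the repo's exact code) of what that pattern would look like: both networks are switched to eval mode and wrapped in `torch.no_grad()` while guessing labels, then `net` is switched back to train mode before the real forward pass. `net`, `net2`, and the unlabeled batch below are stand-ins.

```python
import torch
import torch.nn as nn

# Stand-ins for the two DivideMix networks and an unlabeled batch
net = nn.Linear(8, 4)
net2 = nn.Linear(8, 4)
inputs_u = torch.randn(16, 8)

net.eval()
net2.eval()
with torch.no_grad():  # label co-guessing is pure inference: no gradients needed
    pu = (torch.softmax(net(inputs_u), dim=1) +
          torch.softmax(net2(inputs_u), dim=1)) / 2  # averaged predictions

net.train()  # restore training mode before the actual forward/backward pass
logits = net(inputs_u)
```

Switching modes matters mainly for layers like BatchNorm and Dropout, whose behavior differs between training and evaluation.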

The second question is about the linear_rampup function. I don't understand the reason for multiplying lambda_u by a factor derived from the current epoch number current. Could you explain that? https://github.com/LiJunnan1992/DivideMix/blob/d9d3058fa69a952463b896f84730378cdee6ec39/Train_cifar.py#L192-L194

Thank you very much!

LiJunnan1992 commented 2 years ago

Hi, thanks for your questions.

  1. Yes, it is safe to switch the model to evaluation mode and then back.
  2. Sorry, I don't fully understand your second question. linear_rampup increases the multiplier on lambda_u from 0 to 1 over 16 epochs after warmup is done (i.e., current epoch > warm_up epoch).
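A minimal re-implementation of that ramp, under assumed defaults (lambda_u=25, a 16-epoch ramp length; the repo reads lambda_u from argparse instead of a keyword argument):

```python
import numpy as np

def linear_rampup(current, warm_up, lambda_u=25.0, rampup_length=16):
    """Linearly ramp the unlabeled-loss weight from 0 to lambda_u
    over `rampup_length` epochs after warmup ends."""
    # Fraction of the ramp completed, clipped to [0, 1]
    frac = np.clip((current - warm_up) / rampup_length, 0.0, 1.0)
    return lambda_u * float(frac)
```

For example, at the warmup epoch the weight is 0, halfway through the ramp it is lambda_u/2, and after 16 post-warmup epochs it stays at lambda_u.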
LiuCMU commented 2 years ago

Thanks for the reply!

I think I've figured out the linear_rampup function: it is used to gradually increase the weight of the unsupervised loss, just as the name suggests 😁 I got lost at the beginning. Thank you!

LiuCMU commented 2 years ago

Maybe one more question about your design: why did you decide to fit a GMM to the losses and then divide the dataset into a labeled set and an unlabeled set? Naively, would it be possible to divide the training set without the GMM (i.e., consider a sample labeled if it has a small loss and unlabeled if it has a large loss)?

Thank you very much!

LiJunnan1992 commented 2 years ago

The reason is that we observe Gaussian distributions for the losses w.r.t. correct and wrong labels.
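A sketch of that divide step, on synthetic losses standing in for real per-sample cross-entropy values: fit a two-component GMM to the losses and treat the posterior of the lower-mean component as the probability that a label is clean. The specific hyperparameters below are illustrative, not the repo's exact ones.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Synthetic per-sample losses: clean labels cluster at small loss,
# noisy labels cluster at large loss
losses = np.concatenate([rng.normal(0.2, 0.05, 500),
                         rng.normal(2.0, 0.4, 500)]).reshape(-1, 1)

gmm = GaussianMixture(n_components=2, max_iter=100, reg_covar=5e-4,
                      random_state=0)
gmm.fit(losses)
# Posterior of the lower-mean component = P(label is clean | loss)
prob = gmm.predict_proba(losses)[:, gmm.means_.argmin()]

labeled_mask = prob > 0.5       # treated as the labeled set
unlabeled_mask = ~labeled_mask  # treated as the unlabeled set
```

Compared with a hard loss threshold, the GMM yields a soft posterior (usable as a per-sample clean probability) and adapts to the loss distribution at each epoch instead of requiring a fixed cutoff.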

LiuCMU commented 2 years ago

Got it, thanks for taking the time to answer the questions! Thank you very much!