Hi, thanks for your questions.
`linear_rampup` will increase the multiplier on `lambda_u` from 0 to 1 within 16 epochs after warm-up is done (i.e., current epoch > warm_up epoch).
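For reference, a minimal sketch of that schedule (the `lambda_u` value here is illustrative; in the repo it comes from the command-line arguments):

```python
import numpy as np

def linear_rampup(current, warm_up, lambda_u=25.0, rampup_length=16):
    """Linearly scale the unsupervised-loss weight from 0 up to lambda_u
    over rampup_length epochs once warm-up has finished."""
    ramp = np.clip((current - warm_up) / rampup_length, 0.0, 1.0)
    return lambda_u * float(ramp)

# e.g. with warm_up=10: epoch 10 -> 0.0, epoch 18 -> 12.5, epoch 26+ -> 25.0
```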
Thanks for the reply! I think I've figured out the `linear_rampup` function: it is used to gradually increase the weight of the unsupervised loss, just as the name suggests 😁 I got lost at the beginning. Thank you!
Maybe one more question about your design: why did you decide to use a GMM to fit the losses and then divide the dataset into a labeled set and an unlabeled set? Naively, would it be possible to divide the training set without the GMM (i.e., a sample is considered labeled if it has a small loss and unlabeled if it has a large loss)?
Thank you very much!
The reason is that we observe Gaussian distributions for the losses w.r.t. correct and wrong labels.
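For illustration, a minimal sketch of that idea with scikit-learn (the random losses and the 0.5 cutoff are placeholders; the component with the smaller mean loss is treated as the clean one):

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Placeholder for the real per-sample training losses, shape (N, 1),
# typically normalized to [0, 1] first.
losses = np.random.rand(1000, 1)

# Fit a two-component GMM: one mode for clean labels (small losses),
# one for noisy labels (large losses).
gmm = GaussianMixture(n_components=2, max_iter=10, reg_covar=5e-4)
gmm.fit(losses)

# Posterior probability that each sample belongs to the low-mean
# ("clean") component.
prob_clean = gmm.predict_proba(losses)[:, gmm.means_.argmin()]

# Threshold into labeled (probably clean) and unlabeled (probably noisy)
# sets; 0.5 is an illustrative cutoff.
labeled_idx = np.where(prob_clean > 0.5)[0]
unlabeled_idx = np.where(prob_clean <= 0.5)[0]
```

One practical advantage over a hard loss cutoff is that the GMM posterior gives a per-sample clean probability rather than a binary decision, which the paper also reuses as the weight in label refinement.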
Got it, thanks for taking the time to answer my questions!
Hi, this is excellent work! I have read the paper and source code a few times in the past two weeks. They are inspiring; thanks for sharing them! I have two questions about your implementation; would you take a look when possible?
The first question is about `co-guessing` and `label refinement` in the `train` function. Is it safer to use `net.eval()` and `net2.eval()` in this block, and then turn `net.train()` back on before calculating the `logits` in line 101 (see the sketch below)? I feel both `net` and `net2` are only used to prepare labels in this block, which is really evaluation. https://github.com/LiJunnan1992/DivideMix/blob/d9d3058fa69a952463b896f84730378cdee6ec39/Train_cifar.py#L62-L67
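To make the suggestion concrete, here is a minimal runnable sketch of the pattern I mean (the networks and input tensors are placeholder stand-ins, not the repo's exact code):

```python
import torch
import torch.nn as nn

# Placeholder stand-ins for the two networks and the batches.
net = nn.Sequential(nn.Linear(8, 4), nn.BatchNorm1d(4))
net2 = nn.Sequential(nn.Linear(8, 4), nn.BatchNorm1d(4))
inputs_u = torch.randn(16, 8)
mixed_input = torch.randn(16, 8)

# Co-guessing / label refinement only prepares targets, so run it in
# eval mode (and without gradients).
net.eval()
net2.eval()
with torch.no_grad():
    pu = (torch.softmax(net(inputs_u), dim=1) +
          torch.softmax(net2(inputs_u), dim=1)) / 2
    # ... sharpen pu, refine the labeled targets, apply MixUp, etc.

# Switch back to train mode before the forward pass that produces the
# training logits, so BatchNorm/Dropout behave as in training.
net.train()
logits = net(mixed_input)
```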
The second question is about the `linear_rampup` function. I didn't understand the reason for multiplying `lambda_u` by the current epoch number `current`. Could you explain that? https://github.com/LiJunnan1992/DivideMix/blob/d9d3058fa69a952463b896f84730378cdee6ec39/Train_cifar.py#L192-L194

Thank you very much!