LiJunnan1992 / DivideMix

Code for paper: DivideMix: Learning with Noisy Labels as Semi-supervised Learning
MIT License

unlabeled set empty during training #18

Closed stellaywu closed 4 years ago

stellaywu commented 4 years ago

Thanks for the nice implementation!

I am trying to run it on my own dataset, which is slightly imbalanced binary data (about 3:1). During training, after a few steps, the unlabeled set becomes too small (0 or 1 samples) and training errors out. Should I increase p_threshold to encourage more samples in the unlabeled set, or do some other tuning? Does the method work on imbalanced data? I'm also not sure it is learning, because the unlabeled losses stay very small the whole time. This is how the loss goes in the first few steps:

Labeled loss: 0.63  Unlabeled loss: 0.04
Labeled loss: 0.65  Unlabeled loss: 0.03
Labeled loss: 0.60  Unlabeled loss: 0.03
Labeled loss: 0.63  Unlabeled loss: 0.01
Labeled loss: 0.65  Unlabeled loss: 0.02
Labeled loss: 0.30  Unlabeled loss: 0.00
Labeled loss: 0.22  Unlabeled loss: 0.02
Labeled loss: 0.63  Unlabeled loss: 0.03

Thanks so much!

LiJunnan1992 commented 4 years ago

Hi, thanks for trying our method!

Yes, increasing p_threshold will put more samples in the unlabeled set. You may also try adjusting the unsupervised loss weight.
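To illustrate why raising p_threshold grows the unlabeled set: DivideMix keeps a sample in the labeled set when its per-sample "clean" probability (from the GMM fit on training losses) exceeds p_threshold, and sends the rest to the unlabeled set. A toy sketch with random stand-in probabilities (not the repo's code):

```python
import numpy as np

# Stand-in for the GMM clean-probability per training sample.
rng = np.random.default_rng(0)
clean_prob = rng.uniform(0.0, 1.0, size=1000)

def split_counts(p_threshold):
    """Count (labeled, unlabeled) samples under a given threshold."""
    labeled = int((clean_prob > p_threshold).sum())
    return labeled, len(clean_prob) - labeled

lab_lo, unlab_lo = split_counts(0.5)
lab_hi, unlab_hi = split_counts(0.7)

# A higher threshold admits fewer samples to the labeled set,
# so more end up in the unlabeled set.
assert unlab_hi > unlab_lo
```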

You can also try a re-sampling approach to deal with the imbalanced data.
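One simple re-sampling scheme for a 3:1 binary split is inverse-frequency over-sampling: weight each sample by the reciprocal of its class count, then draw indices with replacement. A minimal sketch with illustrative names (in PyTorch the same idea is what `torch.utils.data.WeightedRandomSampler` implements):

```python
import numpy as np

rng = np.random.default_rng(0)
labels = np.array([0] * 750 + [1] * 250)  # ~3:1 imbalance, as in the question

counts = np.bincount(labels)
weights = 1.0 / counts[labels]            # rarer class -> larger weight
probs = weights / weights.sum()

# Draw one epoch's worth of indices with replacement.
idx = rng.choice(len(labels), size=len(labels), replace=True, p=probs)
resampled = labels[idx]
# Class frequencies in `resampled` should now be roughly 1:1.
```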

stellaywu commented 4 years ago

Thanks! Just a follow-up question. We have this:

def eval_train(model,all_loss):    
    model.eval()
    losses = torch.zeros(50000)    

Is it true that 50000 needs to be adjusted to equal the total number of training samples?

Thanks!

LiJunnan1992 commented 4 years ago

Yes!
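A minimal sketch of how to avoid the hard-coded constant, assuming an `eval_loader` that wraps the full training set (the loader name follows the repo's convention; the toy dataset here is illustrative):

```python
import torch
from torch.utils.data import DataLoader, Dataset

class ToyDataset(Dataset):
    """Illustrative dataset with an arbitrary length."""
    def __len__(self):
        return 123
    def __getitem__(self, i):
        return i

eval_loader = DataLoader(ToyDataset(), batch_size=32)

# Size the per-sample loss buffer from the dataset rather than
# hard-coding 50000 (the CIFAR training-set size):
losses = torch.zeros(len(eval_loader.dataset))
```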