LiJunnan1992 / DivideMix

Code for paper: DivideMix: Learning with Noisy Labels as Semi-supervised Learning
MIT License
543 stars 84 forks

Question about intuition of fitting loss to GMM #22

Closed hxu38691 closed 4 years ago

hxu38691 commented 4 years ago

Hello, I am new to the topic of label noise but very interested in your algorithm. I have two questions and would appreciate any insights:

  1. Why fit the loss to a GMM instead of other signals, such as dimension-reduced learned representations? Have you experimented with other settings?

  2. Related to the first question: if the loss is the input to the GMM, how is inference done when the validation set also contains noisy labels? Can we still separate clean from noisy labels without the loss posterior?

Thank you

LiJunnan1992 commented 4 years ago

Hi, thanks for your interest!

  1. The per-sample loss correlates strongly with the correctness of the label and can be modeled with a univariate GMM. Dimension-reduced representations may not show such an obvious pattern.

  2. At inference time no label is given; only the network's output is used for prediction.
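The loss-to-GMM step in point 1 can be sketched as follows: fit a two-component univariate GMM to the per-sample training losses, then treat the posterior of the low-mean component as the probability that a sample's label is clean. This is a minimal NumPy EM sketch for illustration (the released code uses `sklearn.mixture.GaussianMixture`); the toy data, the 0.5 threshold variable, and the function name `fit_gmm_1d` are assumptions made for this example.

```python
import numpy as np

def fit_gmm_1d(losses, n_iter=100):
    """Fit a 2-component univariate GMM to `losses` via EM.
    Returns (means, variances, weights, posteriors), where
    posteriors[i, k] = p(component k | losses[i])."""
    x = np.asarray(losses, dtype=float)
    # Initialize means at low/high percentiles so the components start separated.
    mu = np.percentile(x, [10, 90]).astype(float)
    var = np.full(2, x.var() + 1e-6)
    pi = np.array([0.5, 0.5])
    for _ in range(n_iter):
        # E-step: responsibilities under each Gaussian.
        dens = pi / np.sqrt(2 * np.pi * var) * np.exp(
            -(x[:, None] - mu) ** 2 / (2 * var))
        resp = dens / dens.sum(axis=1, keepdims=True)
        # M-step: re-estimate means, variances, and mixing weights.
        nk = resp.sum(axis=0)
        mu = (resp * x[:, None]).sum(axis=0) / nk
        var = (resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk + 1e-6
        pi = nk / len(x)
    return mu, var, pi, resp

# Toy per-sample losses: clean samples cluster at low loss, noisy at high loss.
rng = np.random.default_rng(0)
losses = np.concatenate([rng.normal(0.2, 0.05, 800),   # "clean" samples
                         rng.normal(2.0, 0.30, 200)])  # "noisy" samples
mu, var, pi, resp = fit_gmm_1d(losses)
clean_comp = int(np.argmin(mu))      # low-mean component models clean labels
p_clean = resp[:, clean_comp]
labeled_set = p_clean > 0.5          # illustrative clean-probability threshold
print(labeled_set[:800].mean(), labeled_set[800:].mean())
```

With losses this well separated, nearly all of the low-loss group ends up in the labeled (clean) set and nearly all of the high-loss group in the unlabeled (noisy) set, which is exactly the "obvious pattern" the answer refers to.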

hxu38691 commented 4 years ago

I just realized the samples at inference are unlabeled, so the question doesn't arise...

Thanks, I’m closing this.