NIPS 2020 | Rethinking the Value of Labels for Improving Class-Imbalanced Learning #30

Below are notes on the paper.

Do we need labels when the existing labels are class-imbalanced (some classes have far more labeled examples than others) and we have a lot of unlabeled data?

  • Positive: yes, labels help. Self-train on the unlabeled data and you are golden. (Self-training: an intermediate model trained on the human-labeled data is used to create "labels" for the unlabeled data, hence pseudo labels; the final model is then trained on both the human-labeled and the pseudo-labeled data. See the first sketch after this list.)

  • Negative: we may be able to do away with the labels. One can run self-supervised pretraining on all available data to learn meaningful representations, then learn the actual classification task on top. The paper shows this approach also improves performance. (Second sketch below.)
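
A minimal sketch of the self-training loop described above, using scikit-learn on synthetic data. The model choice, the 0.9 confidence threshold, and the labeled/unlabeled split are placeholder assumptions, not the paper's setup:

```python
# Self-training (pseudo-labeling) sketch on synthetic imbalanced data.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic class-imbalanced data; most samples are left unlabeled.
X, y = make_classification(n_samples=2000, n_classes=2, weights=[0.9, 0.1],
                           n_informative=5, random_state=0)
labeled = np.zeros(len(y), dtype=bool)
labeled[:200] = True  # only 10% of the data carries human labels

# 1. Train an intermediate (teacher) model on the human-labeled subset.
teacher = LogisticRegression(max_iter=1000).fit(X[labeled], y[labeled])

# 2. Pseudo-label the unlabeled data; keep only confident predictions
#    (the 0.9 threshold is an assumption, not from the paper).
probs = teacher.predict_proba(X[~labeled])
confident = probs.max(axis=1) > 0.9
pseudo_y = probs.argmax(axis=1)[confident]

# 3. Train the final (student) model on human labels + pseudo labels.
X_final = np.vstack([X[labeled], X[~labeled][confident]])
y_final = np.concatenate([y[labeled], pseudo_y])
student = LogisticRegression(max_iter=1000).fit(X_final, y_final)
```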
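And a minimal sketch of the self-supervised route. The paper uses pretext tasks such as rotation prediction and contrastive learning; here a simple autoencoder reconstruction objective stands in as the self-supervised stage, and all shapes, sizes, and epoch counts are placeholder assumptions:

```python
# Self-supervised pretraining sketch in PyTorch on synthetic data.
import torch
import torch.nn as nn

X_all = torch.randn(2000, 32)        # all data; labels ignored here
X_lab = X_all[:200]                  # small labeled subset
y_lab = torch.randint(0, 2, (200,))  # imbalanced in practice

encoder = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 16))
decoder = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 32))

# 1. Self-supervised pretraining: learn representations from ALL data;
#    labels are never touched in this stage.
opt = torch.optim.Adam(
    list(encoder.parameters()) + list(decoder.parameters()), lr=1e-3)
for _ in range(100):
    opt.zero_grad()
    loss = nn.functional.mse_loss(decoder(encoder(X_all)), X_all)
    loss.backward()
    opt.step()

# 2. Fine-tune a classifier head on the (imbalanced) labeled subset,
#    starting from the pretrained representations.
head = nn.Linear(16, 2)
opt = torch.optim.Adam(
    list(encoder.parameters()) + list(head.parameters()), lr=1e-3)
for _ in range(100):
    opt.zero_grad()
    loss = nn.functional.cross_entropy(head(encoder(X_lab)), y_lab)
    loss.backward()
    opt.step()
```

The structural difference between the two routes: self-training consumes the labels twice (teacher, then student), while self-supervised pretraining touches the labels only in the final fine-tuning stage.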

Takeaway: if you have class-imbalanced labels plus extra unlabeled data, use self-training or self-supervised pretraining. (The paper shows self-training beats self-supervised pretraining on CIFAR-10-LT, though.)

Keywords: imbalanced classification
