nayeemrizve / ups

"In Defense of Pseudo-Labeling: An Uncertainty-Aware Pseudo-label Selection Framework for Semi-Supervised Learning" by Mamshad Nayeem Rizve, Kevin Duarte, Yogesh S Rawat, Mubarak Shah (ICLR 2021)
MIT License
231 stars 40 forks

About the unlabeled data. #3

Closed Kouuh closed 3 years ago

Kouuh commented 3 years ago

I have a question about the unlabeled dataset. After you filter out the positively pseudo-labeled samples, are those samples removed from the unlabeled dataset? I couldn't find this step in the code.

nayeemrizve commented 3 years ago

Our get_cifarX function returns four datasets.

Ref: https://github.com/nayeemrizve/ups/blob/f003e3fcb0316b21904499ada4b65a765198fcb8/data/cifar.py#L88

For training, we use train_lbl_dataset (the available labeled set plus the positively pseudo-labeled set, trained with the CE loss) and train_nl_dataset (the negatively pseudo-labeled set, trained with the NCE loss).
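To make the two losses concrete, here is a minimal sketch of how a CE loss on a positively pseudo-labeled sample and a negative-learning (NCE) loss on negatively pseudo-labeled classes can be computed. The function names are my own; the NCE form, -log(1 - p_k) summed over the negatively labeled classes, follows the negative-learning formulation described in the paper, but this is an illustration, not the repo's actual implementation:

```python
import math

def ce_loss(probs, positive_label):
    # Standard cross-entropy for a sample with a positive
    # (pseudo-)label: maximize the probability of that class.
    return -math.log(probs[positive_label])

def nce_loss(probs, negative_labels):
    # Negative-learning loss: for classes the model is confident
    # the sample does NOT belong to, push their probability down.
    return -sum(math.log(1.0 - probs[k]) for k in negative_labels)

probs = [0.5, 0.25, 0.25]      # toy softmax output over 3 classes
ce = ce_loss(probs, 0)          # positive pseudo-label: class 0
nce = nce_loss(probs, [1, 2])   # negative pseudo-labels: classes 1, 2
```

Both losses go to zero as the model agrees with the (positive or negative) pseudo-labels, which is why the two selected subsets can simply be trained with their respective losses in the same loop.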

We use train_unlbl_dataset (all of the original unlabeled samples) to generate pseudo-labels at each pseudo-label generation step. Since we do not carry pseudo-labels over from one pseudo-labeling iteration to the next, there is no need to delete the already pseudo-labeled samples from the unlabeled set; their labels are simply regenerated from scratch each time.
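The point above can be sketched as follows: every generation step predicts on the full original unlabeled pool, so previously selected samples are re-scored rather than removed. This is a simplified illustration with hypothetical names (`tau_p`, `tau_n`, `pseudo_label_step`); the actual selection in UPS additionally applies an uncertainty threshold, which is omitted here:

```python
def pseudo_label_step(model, unlbl_set, tau_p, tau_n):
    # One pseudo-label generation step over the ENTIRE original
    # unlabeled pool. Nothing is ever deleted from unlbl_set:
    # labels are regenerated fresh at every call.
    pos, neg = [], []
    for x in unlbl_set:
        probs = model(x)  # assumed to return class probabilities
        k = max(range(len(probs)), key=probs.__getitem__)
        if probs[k] >= tau_p:
            # confident positive pseudo-label -> train_lbl_dataset
            pos.append((x, k))
        negs = [c for c, p in enumerate(probs) if p <= tau_n and c != k]
        if negs:
            # confident negative pseudo-labels -> train_nl_dataset
            neg.append((x, negs))
    return pos, neg

# Toy "model" that just returns precomputed probability vectors.
unlbl = [[0.9, 0.05, 0.05], [0.4, 0.3, 0.3]]
pos, neg = pseudo_label_step(lambda x: x, unlbl, tau_p=0.7, tau_n=0.1)
```

In the toy run, only the first (confident) sample is selected, positively for class 0 and negatively for classes 1 and 2; the second sample is selected for neither, but it stays in the pool and gets another chance at the next generation step.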