Closed sarmientoj24 closed 3 years ago
Hi @sarmientoj24
If you would like to apply this to a custom dataset or a different degree of supervision, you could change this line https://github.com/facebookresearch/unbiased-teacher/blob/05dad84c8e1bb44c6fd14706571ab0769143e48d/ubteacher/data/build.py#L44
to use your own sampled numpy array:
num_label = int(sup_p / 100. * num_all)
labeled_idx = np.random.choice(range(num_all), size=num_label, replace=False)
Here sup_p is the labeled data ratio in percent (e.g., 1 for 1%), and num_all is the total number of images in your dataset.
Also, please fix the random seed across GPUs and make sure the 50% labeled set is the same on every GPU. Otherwise, each GPU samples a different labeled subset, the model effectively sees more labeled data than intended, and you might get higher accuracy than you should.
Please also remove lines 41 and 42, which load the offline-split dataset.
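The sampling and seeding advice above could be sketched roughly as follows. This is only a minimal illustration, not the repository's code: the function name sample_labeled_indices and the seed argument are assumptions introduced here, and a local np.random.RandomState is used so that every GPU process calling it with the same seed draws the same labeled set.

```python
import numpy as np


def sample_labeled_indices(num_all, sup_p, seed=0):
    """Hypothetical helper: split range(num_all) into labeled/unlabeled indices.

    sup_p is the labeled ratio in percent (e.g., 50 for 50%).
    """
    num_label = int(sup_p / 100.0 * num_all)
    # A local generator with a fixed seed gives the same draw in every
    # process, regardless of the global NumPy random state.
    rng = np.random.RandomState(seed)
    labeled_idx = np.sort(rng.choice(num_all, size=num_label, replace=False))
    # Everything not sampled as labeled is treated as unlabeled.
    unlabeled_idx = np.setdiff1d(np.arange(num_all), labeled_idx)
    return labeled_idx, unlabeled_idx


labeled_idx, unlabeled_idx = sample_labeled_indices(num_all=50000, sup_p=50, seed=1)
```

The resulting labeled_idx is what would replace the array at the line linked above, assuming that line expects an array of image indices.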
Thanks!
I will close this issue as there are no other questions. Feel free to reopen it if you have other relevant questions.
Say I have 50K labeled images. I would like to check
So, do I still load/provide all labeled images, and that line will then choose which images are kept as labeled and which are treated as unlabeled?
Say I wanted to compare two setups: a 50% labeled dataset only (without unlabeled data) for supervised learning, and 50% labeled + 50% unlabeled for SSL. Is there a way to easily do the first one here?
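For illustration, the comparison described here could be set up with a split like the one below. This is a sketch under assumptions, not the repository's data pipeline: dataset_dicts is a hypothetical list with one entry per image, and split_for_comparison is a name made up for this example. The supervised-only baseline would train on the labeled part alone, while the SSL run would use both parts and ignore the annotations of the unlabeled part.

```python
import numpy as np


def split_for_comparison(dataset_dicts, sup_p, seed=0):
    """Hypothetical helper: split a list of per-image records by labeled ratio.

    sup_p is the labeled ratio in percent (e.g., 50 for 50%).
    """
    num_all = len(dataset_dicts)
    num_label = int(sup_p / 100.0 * num_all)
    # Fixed seed so both experiments (and all GPUs) see the same labeled set.
    rng = np.random.RandomState(seed)
    labeled_idx = set(rng.choice(num_all, size=num_label, replace=False).tolist())
    labeled = [d for i, d in enumerate(dataset_dicts) if i in labeled_idx]
    unlabeled = [d for i, d in enumerate(dataset_dicts) if i not in labeled_idx]
    return labeled, unlabeled


# Toy records standing in for real image annotations.
records = [{"image_id": i} for i in range(10)]
labeled, unlabeled = split_for_comparison(records, sup_p=50, seed=1)
```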