Easiest way to use this for supervised object detection as baseline?

sarmientoj24 commented 3 years ago

Say I want to compare two setups: supervised learning using only 50% of the dataset as labeled data (with no unlabeled data), and SSL using 50% labeled + 50% unlabeled. Is there an easy way to do the first one here?

ycliu93 commented 2 years ago

Hi @sarmientoj24

If you would like to apply this to a custom dataset or a different degree of supervision, you could change this line

https://github.com/facebookresearch/unbiased-teacher/blob/05dad84c8e1bb44c6fd14706571ab0769143e48d/ubteacher/data/build.py#L44

so that it uses your own sampled numpy array, for example:

    num_label = int(sup_p / 100. * num_all)
    labeled_idx = np.random.choice(range(num_all), size=num_label, replace=False)

Here, sup_p is the labeled data ratio as a percentage (e.g., 1 for 1%), and num_all is the total number of images in your dataset.

Also, please fix the random seed across GPUs and make sure the 50% labeled set is identical on every GPU. Otherwise, each GPU may sample a different labeled subset, so the model effectively sees more than 50% of the labels, and you might get higher accuracy than you should.
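A minimal sketch of the seeded sampling described above, so that every process draws the same labeled subset. The helper name `sample_labeled_idx` and its `seed` argument are hypothetical and not part of the repository; only `sup_p`, `num_all`, `num_label`, and `labeled_idx` come from the snippet above.

```python
import numpy as np


def sample_labeled_idx(num_all, sup_p, seed=0):
    """Return a reproducible set of labeled image indices.

    num_all: total number of images in the dataset.
    sup_p:   labeled data ratio as a percentage (e.g., 50 for 50%).
    seed:    hypothetical argument; use the same value in every process.
    """
    num_label = int(sup_p / 100.0 * num_all)
    # A local generator seeded explicitly gives the same draw in every
    # process that passes the same seed, independent of global state.
    rng = np.random.RandomState(seed)
    labeled_idx = rng.choice(num_all, size=num_label, replace=False)
    return np.sort(labeled_idx)


# Two calls with the same seed (e.g., on two different GPUs) agree.
idx_a = sample_labeled_idx(num_all=1000, sup_p=50, seed=0)
idx_b = sample_labeled_idx(num_all=1000, sup_p=50, seed=0)
```

Using a local `RandomState` rather than the global `np.random` state is a design choice here: it keeps the split unaffected by any other random calls made before it.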

Please also remove lines 41 and 42, which load the offline-split dataset.

Thanks!

ycliu93 commented 2 years ago

I will close this issue as there are no further questions. Feel free to reopen it if you have other relevant questions.

sarmientoj24 commented 2 years ago

Say I have 50K labeled images. I would like to compare:

Only using 25K labeled (supervised learning)
Using 25K labeled and 25K unlabeled
Using 5K labeled and 45K unlabeled

So, do I still load all of the labeled images, and that line then chooses which images stay labeled and which are treated as unlabeled?
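For reference, the arithmetic behind these three settings can be sketched with the `sup_p` formula from the earlier reply. This only shows how the counts map to a percentage; the helper `split_sizes` is hypothetical, and whether the repository treats the remaining images as unlabeled is an assumption not confirmed in this thread.

```python
def split_sizes(num_all, num_label):
    """Return (sup_p, num_unlabel) for a desired number of labeled images."""
    sup_p = 100.0 * num_label / num_all  # labeled ratio as a percentage
    num_unlabel = num_all - num_label    # images left over (assumed unlabeled)
    return sup_p, num_unlabel


num_all = 50_000

# 25K labeled (the supervised baseline uses the same subset and
# simply ignores the remaining 25K images).
sup_p_25k, unlabel_25k = split_sizes(num_all, 25_000)

# 5K labeled and 45K unlabeled.
sup_p_5k, unlabel_5k = split_sizes(num_all, 5_000)

# Round trip through the formula from the earlier reply.
num_label_25k = int(sup_p_25k / 100.0 * num_all)
num_label_5k = int(sup_p_5k / 100.0 * num_all)
```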

facebookresearch / unbiased-teacher

Easiest way to use this for supervised object detection as baseline? #32