Splits percentage (Quick question - easy to answer)

facebookresearch / unbiased-teacher

PyTorch code for ICLR 2021 paper Unbiased Teacher for Semi-Supervised Object Detection

https://arxiv.org/abs/2102.09480

MIT License

Splits percentage (Quick question - easy to answer) #73

Closed fabianfallasmoya closed 2 years ago

fabianfallasmoya commented 2 years ago

I see that you use your own seeds (dataseed). How did you create the file with these seeds? When you say 1%, do you mean 1% of every class?

I sampled 1% of the whole COCO dataset, but performance dropped a lot, so I am assuming that you take 1% per class in order to get better results. Am I right?

ycliu93 commented 2 years ago

I sample randomly from the whole dataset, not per class, and I didn't use any class information when splitting the data into labeled and unlabeled sets. It is just uniform sampling, with the same sampling weight (1/num_images_in_coco) for each image.
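A minimal sketch of this kind of uniform, class-agnostic split, for illustration only. It is not the script used to generate the dataseed file; the function name `sample_labeled_ids`, the seed value, and the toy image ids are all assumptions.

```python
import random


def sample_labeled_ids(image_ids, percent, seed):
    """Uniformly sample `percent`% of the image ids without replacement.

    Every image has the same chance of being picked; no class
    information is used. (Hypothetical helper, not from the repo.)
    """
    rng = random.Random(seed)  # local RNG so the split is reproducible
    num_labeled = int(len(image_ids) * percent / 100.0)
    return sorted(rng.sample(list(image_ids), num_labeled))


# Toy stand-in for the COCO image ids (assumption: 1000 images).
image_ids = list(range(1000))
labeled = sample_labeled_ids(image_ids, percent=1.0, seed=0)
unlabeled = sorted(set(image_ids) - set(labeled))
```

With the same seed the function returns the same split, which is what makes a stored seed file and online sampling interchangeable in principle.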

That means it is likely that a particular class has no labeled data during training; I didn't check whether every class has labeled data in the file.
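For anyone who wants to run that check themselves, here is a rough sketch. The mapping `image_to_classes` and the helper `classes_without_labels` are hypothetical; in practice the mapping would have to be built from the COCO annotations, which is not shown here.

```python
def classes_without_labels(image_to_classes, labeled_ids, all_classes):
    """Return the classes that never appear in any labeled image.

    `image_to_classes` maps an image id to the set of class names
    annotated in that image. (Hypothetical structure for illustration.)
    """
    covered = set()
    for image_id in labeled_ids:
        covered.update(image_to_classes.get(image_id, ()))
    return sorted(set(all_classes) - covered)


# Toy data: image 2 is the only one containing "hair drier".
image_to_classes = {
    0: {"person", "dog"},
    1: {"person"},
    2: {"hair drier"},
    3: set(),
}
missing = classes_without_labels(
    image_to_classes,
    labeled_ids=[0, 1],
    all_classes=["person", "dog", "hair drier"],
)
```

In this toy example `missing` contains only `"hair drier"`, since the single image with that class was not sampled into the labeled set.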

I also tried to sample in an online manner, and the results are similar to using the file.

fabianfallasmoya commented 2 years ago

Thank you for the clarification. When I double-checked, I found what I was doing wrong with the online sampling.