Fabio-Arup-Panella opened this issue 3 years ago
In case it helps, here is what I can suggest:
Regarding the last part: to pick the images for the (un)supervised learning, the authors simply split the images randomly at the beginning by sampling a list of indices (see divide_label_unlabel
here). However, in the current implementation they actually read pre-generated indices to make the results reproducible. You can use exactly the same trick to distinguish labeled from unlabeled images. For example, order the images in your dataset so that the first 120 are labeled and the remaining 380 are not, and reflect that in the seed (or just hardcode the split).
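The split described above can be sketched like this. This is a minimal, hypothetical reimplementation (the function name `divide_label_unlabel_sketch` and the seed value are my own; the actual `divide_label_unlabel` in the repository reads pre-generated index files instead):

```python
import random

def divide_label_unlabel_sketch(num_images=500, num_labeled=120, seed=42):
    """Sample a reproducible labeled/unlabeled split of image indices.

    Hypothetical sketch: shuffles all indices once with a fixed seed,
    then takes the first `num_labeled` as the labeled set, mimicking
    what reading pre-generated indices achieves.
    """
    rng = random.Random(seed)  # fixed seed -> reproducible split
    indices = list(range(num_images))
    rng.shuffle(indices)
    labeled_idx = sorted(indices[:num_labeled])
    unlabeled_idx = sorted(indices[num_labeled:])
    return labeled_idx, unlabeled_idx

labeled_idx, unlabeled_idx = divide_label_unlabel_sketch()
print(len(labeled_idx), len(unlabeled_idx))  # 120 380
```

Because the seed is fixed, every run reproduces the same split, which is exactly what the pre-generated index files are for.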
For the images picked for the "unsupervised part", the authors simply delete the labels inside the training loop (see run_step_full_semisup
here).
At this point, I am not sure whether you can supply Detectron2 with your 380 images without labels (it may skip them). If you can, just put your images in a format similar to what you mentioned. If at least one bounding box per image is required, one idea would be to add some random annotations to those images, since they would be removed inside the training loop anyway.
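The dummy-annotation idea could look like the sketch below, which builds a Detectron2-style dataset dict with a single placeholder box. The helper name `make_dummy_record` is my own, and the placeholder box values are arbitrary; the only point is to satisfy a loader that requires at least one annotation per image:

```python
def make_dummy_record(file_name, image_id, height, width):
    """Build a Detectron2-style dataset record with one placeholder box.

    Hypothetical sketch: the dummy annotation only exists so the data
    loader accepts the unlabeled image; the semi-supervised training
    step is expected to strip these labels anyway.
    """
    return {
        "file_name": file_name,
        "image_id": image_id,
        "height": height,
        "width": width,
        "annotations": [
            {
                "bbox": [0.0, 0.0, 1.0, 1.0],  # tiny placeholder box
                "bbox_mode": 0,  # 0 == BoxMode.XYXY_ABS in Detectron2
                "category_id": 0,  # arbitrary class id, never trained on
            }
        ],
    }

record = make_dummy_record("unlabeled_001.jpg", 1, 480, 640)
print(len(record["annotations"]))  # 1
```

You would register the 380 unlabeled images with records like this, while the 120 labeled images keep their real annotations.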
@vlfom
> If you can, just put your images in a format similar to what you mentioned. If at least one bounding box per image is required, one idea would be to add some random annotations to those images, since they would be removed inside the training loop anyway.
Does this mean that all images (both labeled and unlabeled) need to have annotations?
Hi Yen-Cheng, I am working on a project where, because of some issues, we were able to label only a portion of the dataset. Say, out of 500 images, only 120 were labeled. Is it possible to use all 120 as labeled training data and the rest as unlabeled training data? If so, how do you recommend setting this up? Below is an example of the annotations (of course, I can modify it).