alirezazareian / ovr-cnn

A new framework for open-vocabulary object detection, based on maskrcnn-benchmark
MIT License

questions about splitting coco to coco open-vocabulary #17

Closed xishanhan closed 2 years ago

xishanhan commented 2 years ago

❓ Questions and Help

Hi, thanks for sharing your nice work! I followed 001.ipynb to split COCO into the COCO open-vocabulary splits, and that part ran successfully. However, when I used the dataset, I found that 'instances_val2017_all.json' includes 4836 images, while 'instances_val2017_seen.json' includes 4533 images and 'instances_val2017_unseen.json' includes 2064 images. I expected 'all' to equal 'seen' plus 'unseen', but 4836 ≠ 4533 + 2064. So I'm wondering whether I split the dataset incorrectly. I know this might be a stupid question, but I would still appreciate an answer.

Martin0401 commented 2 years ago

Hi, I can answer your question. 4836 is the size of the union, not the sum: 4836 = |seen ∪ unseen| = 4553 + 2064 - |seen ∩ unseen|. 4553 is the number of images that contain seen classes, and 2064 is the number of images that contain unseen classes. Some images contain both seen and unseen classes, so they are counted twice when you add the two numbers.
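For anyone who wants to check this on their own split, here is a minimal sketch of the counting argument. It assumes each JSON file is in standard COCO format and that its "images" list holds exactly the images belonging to that split; the helper names `load_coco`, `image_ids`, and `split_report` are made up for illustration and are not part of ovr-cnn. The toy data at the bottom stands in for the real files.

```python
import json


def load_coco(path):
    """Load a COCO-format annotation file (hypothetical helper)."""
    with open(path) as f:
        return json.load(f)


def image_ids(coco):
    """Return the set of image ids listed in a COCO-format dict."""
    return {img["id"] for img in coco["images"]}


def split_report(all_coco, seen_coco, unseen_coco):
    """Compare the 'all' split against the union of 'seen' and 'unseen'."""
    all_ids = image_ids(all_coco)
    seen_ids = image_ids(seen_coco)
    unseen_ids = image_ids(unseen_coco)
    return {
        "all": len(all_ids),
        "seen": len(seen_ids),
        "unseen": len(unseen_ids),
        # Images that contain both seen and unseen classes.
        "both": len(seen_ids & unseen_ids),
        "union": len(seen_ids | unseen_ids),
        # True when 'all' is exactly the union of the two splits.
        "consistent": (seen_ids | unseen_ids) == all_ids,
    }


# Toy data: image 3 contains both seen and unseen classes.
all_coco = {"images": [{"id": i} for i in (1, 2, 3, 4)]}
seen_coco = {"images": [{"id": i} for i in (1, 2, 3)]}
unseen_coco = {"images": [{"id": i} for i in (3, 4)]}

report = split_report(all_coco, seen_coco, unseen_coco)
print(report)
```

With the real files, you would pass `load_coco("instances_val2017_all.json")` and so on instead of the toy dicts. If the split is correct, "consistent" should be True and seen + unseen - both should equal all.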

xishanhan commented 2 years ago

Oh, thanks to your reply, I've figured it out. Thank you very much!