facebookresearch / Detic

Code release for "Detecting Twenty-thousand Classes using Image-level Supervision".
Apache License 2.0

instances_train2017_seen_2_oriorder_cat_info #40

Open wusize opened 2 years ago

wusize commented 2 years ago

Would you please explain how instances_train2017_seen_2_oriorder_cat_info.json was generated or provide a link for it?

xingyizhou commented 2 years ago

Hi,

Sorry for missing this in the documentation. The file is generated by running the following command:

python tools/get_lvis_cat_info.py --ann datasets/coco/zero-shot/instances_train2017_seen_2_oriorder.json
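For readers who want a feel for what a category-info file might hold, here is a minimal sketch. It assumes the file stores, for each category, the number of images and the number of instances in which that category appears. The function name `build_cat_info` and the field names `image_count` and `instance_count` are illustrative assumptions, not a copy of tools/get_lvis_cat_info.py, so check the script itself for the exact output.

```python
from collections import defaultdict


def build_cat_info(data):
    """Count images and instances per category in a COCO-style annotation dict.

    Hypothetical helper: the output field names are assumptions, not
    necessarily what tools/get_lvis_cat_info.py writes.
    """
    image_ids = defaultdict(set)      # category id -> ids of images containing it
    instance_count = defaultdict(int) # category id -> number of annotations

    for ann in data["annotations"]:
        image_ids[ann["category_id"]].add(ann["image_id"])
        instance_count[ann["category_id"]] += 1

    cat_info = []
    for cat in data["categories"]:
        entry = dict(cat)  # copy so the input is not modified
        entry["image_count"] = len(image_ids[cat["id"]])
        entry["instance_count"] = instance_count[cat["id"]]
        cat_info.append(entry)
    return cat_info


# Toy COCO-style input, made up for illustration only.
toy = {
    "categories": [{"id": 1, "name": "person"}, {"id": 2, "name": "dog"}],
    "annotations": [
        {"image_id": 10, "category_id": 1},
        {"image_id": 10, "category_id": 1},
        {"image_id": 11, "category_id": 1},
    ],
}
info = build_cat_info(toy)
```

On a real annotation file, one would load the JSON with `json.load`, pass it to the helper, and dump the result next to the input file.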

If you don't mind, let me do a quick user study: what do you think is the main advantage of using the zero-shot COCO benchmark over the open-vocabulary LVIS benchmark (e.g., computation, GPU size, disk space, or literature)? I personally like the LVIS setup much more, and hope to advocate it to the community. Please let me know what I can do to make the LVIS setup more popular. Thank you!

Best, Xingyi

wusize commented 2 years ago

I think the main advantage of the zero-shot COCO benchmark is the computation/GPU cost. Not everyone has dozens of V100s :).

xingyizhou commented 2 years ago

Hi Size,

Got it. Thank you for letting me know!

Actually, our LVIS experiments use resources similar to the COCO setup (see the model zoo: 17h on 8 GPUs for LVIS vs. 12h on 8 GPUs for COCO). If GPU memory is the bottleneck, simply reducing the batch size and scaling the learning rate by the linear rule will do the trick. We'll highlight this in our documentation.
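As a rough illustration of the linear learning rate rule mentioned above, the sketch below scales the learning rate in proportion to the batch size and stretches the iteration count so the number of samples seen stays about the same. The helper name `scale_schedule` and the numbers in the example are made up for illustration and are not the values from the Detic configs.

```python
def scale_schedule(base_lr, base_batch_size, base_max_iter, new_batch_size):
    """Apply the linear scaling rule when changing the batch size.

    Hypothetical helper: the learning rate scales linearly with the batch
    size, and the iteration count scales inversely so that roughly the same
    number of training samples is processed.
    """
    if base_batch_size <= 0 or new_batch_size <= 0:
        raise ValueError("batch sizes must be positive")
    ratio = new_batch_size / base_batch_size
    new_lr = base_lr * ratio
    new_max_iter = int(round(base_max_iter / ratio))
    return new_lr, new_max_iter


# Example with made-up numbers: shrink the batch from 64 to 16 images.
new_lr, new_max_iter = scale_schedule(
    base_lr=0.0002, base_batch_size=64, base_max_iter=90000, new_batch_size=16
)
```

With a 4x smaller batch, the learning rate drops by 4x and the schedule runs 4x as many iterations, so training takes longer in wall-clock time but fits in less GPU memory. Any learning rate decay steps or warmup lengths in the schedule would need the same stretching.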

Best, Xingyi