Closed mjlm closed 2 years ago
Hi,
Thank you for your interest! The results are not in the paper. I attached them below:
| | LVIS box mAP | LVIS box mAP rare | Objects365 box mAP | OpenImages box mAP50 |
|---|---|---|---|---|
| Box-Supervised | 45.0 | 39.2 | 19.1 | 46.2 |
| Detic w. IN-L | 46.7 | 45.1 | 21.2 | 53.0 |
| Detic w. IN-21K | 45.0 | 41.2 | 21.4 | 55.2 |
Detic w. IN-21K is worse than Detic w. IN-L on LVIS, as expected, due to the less-focused vocabulary used in training. It still outperforms the Box-Supervised baseline on rare classes.
Best, Xingyi
Thanks for sharing these results!
Hi,
Thanks for publishing Detic, very interesting work!
As far as I can tell, the LVIS numbers in the paper were all obtained using image-level data that only contains classes overlapping with LVIS (i.e. "IN-L", or CC captions containing LVIS classes).
What is the LVIS performance when image-level data covering all 22,000 ImageNet-21K classes is used for training? Sorry if this is in the paper and I missed it!
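For context, the "IN-L" subset mentioned above is built by keeping only ImageNet-21K classes whose names overlap the LVIS vocabulary. A minimal sketch of that filtering step, with hypothetical variable names (`imagenet_classes`, `lvis_classes`) that are illustrative rather than taken from the Detic codebase:

```python
# Hypothetical sketch of building an "IN-L"-style class subset:
# keep only ImageNet-21K class names that also appear in the LVIS vocabulary.
# Real matching in Detic may use WordNet synsets rather than raw string equality.

def overlapping_classes(imagenet_classes, lvis_classes):
    """Return ImageNet class names that also occur in the LVIS vocabulary."""
    lvis_set = {c.lower() for c in lvis_classes}  # case-insensitive lookup
    return [c for c in imagenet_classes if c.lower() in lvis_set]

# Toy example with made-up class lists
imagenet_classes = ["Banana", "aardvark", "zebra", "screwdriver"]
lvis_classes = ["banana", "zebra", "pencil"]
print(overlapping_classes(imagenet_classes, lvis_classes))  # ['Banana', 'zebra']
```

Training with the full 21K vocabulary instead would simply skip this filtering step, which is what the question asks about.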
Thanks, Matthias