Why is CLIP-Aligned Labeling needed on top of Region Prompting?

tgxs002 / CORA

A DETR-style framework for open-vocabulary detection (OVD). CVPR 2023

Apache License 2.0

166 stars, 14 forks

Why is CLIP-Aligned Labeling needed on top of Region Prompting? #13

Closed. Jasonkks closed this issue 1 year ago.

Jasonkks commented 1 year ago

Dear Authors,

Thanks for the great work.

As I understand it, the proposed Region Prompting already adapts the CLIP model to match the ground-truth classes of base-class objects. Why, then, is it still necessary to relabel the classes with 'CLIP-Aligned Labeling'?

Please correct me if my understanding is wrong. Thank you!