CircleRadon / Osprey

[CVPR2024] The code for "Osprey: Pixel Understanding with Visual Instruction Tuning"

How to evaluate the Open-Vocabulary Segmentation results in Table 2? #5

Glupapa opened this issue 8 months ago

Glupapa commented 8 months ago

Hi,

Thank you for sharing your impressive work!

I'm confused about Table 2: how are the open-vocabulary segmentation metrics calculated? Also, could you please explain how Osprey outputs the masks used to compute these metrics?

Thanks for your help!

CircleRadon commented 8 months ago

Hi @Glupapa. For open-vocabulary segmentation, all approaches take ground-truth boxes/masks as input to assess regional recognition capability. We use semantic similarity as the matching measure to calculate these metrics, and we will release the evaluation code. Note that the current version of Osprey cannot generate output masks.
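
To make the semantic-similarity matching concrete, here is a minimal sketch assuming a SentenceTransformer text encoder. The model name, helper function, and example vocabulary are illustrative only; the authors' released evaluation code may differ.

```python
# Minimal sketch: map Osprey's free-form text for one ground-truth region to
# the closest dataset category via cosine similarity of sentence embeddings.
# Model choice and helper names are assumptions, not the official eval code.
import numpy as np
from sentence_transformers import SentenceTransformer

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # illustrative encoder

def match_to_vocabulary(prediction: str, class_names: list[str]) -> int:
    """Return the index of the vocabulary class most similar to `prediction`."""
    embs = encoder.encode([prediction] + class_names)
    embs = embs / np.linalg.norm(embs, axis=1, keepdims=True)  # unit-normalize
    sims = embs[1:] @ embs[0]          # cosine similarity to each class name
    return int(np.argmax(sims))        # best-matching category index

# Example with a toy Cityscapes-style vocabulary:
classes = ["road", "sidewalk", "building", "car", "person", "sky"]
idx = match_to_vocabulary("a silver sedan parked on the street", classes)
print(classes[idx])  # expected: "car"
```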

Glupapa commented 8 months ago

Thanks for your prompt response! I noticed that the metrics reported on Cityscapes and ADE20K-150 in Table 2 are PQ, AP, and mIoU, so I'm curious how these metrics can be calculated if Osprey cannot output masks. Could you please shed some light on this? Thank you again for your help.

CircleRadon commented 8 months ago

@Glupapa The ground-truth masks are used when calculating these metrics, so only the category assigned to each region comes from Osprey.
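
Under that protocol, a metric like mIoU reduces to per-pixel label agreement, since the region boundaries are ground truth and only the matched class id per region varies. A minimal sketch of that reduction, with illustrative variable names:

```python
# Sketch: mIoU when masks are ground truth and only the class assigned to
# each region (via semantic matching) can be wrong. Names are illustrative.
import numpy as np

def miou_with_gt_masks(gt_seg: np.ndarray, pred_seg: np.ndarray,
                       num_classes: int) -> float:
    """gt_seg / pred_seg: HxW arrays of class ids. pred_seg keeps the GT
    region boundaries but uses the class each region was matched to."""
    ious = []
    for c in range(num_classes):
        gt_c, pred_c = gt_seg == c, pred_seg == c
        union = np.logical_or(gt_c, pred_c).sum()
        if union == 0:
            continue                              # class absent from both
        inter = np.logical_and(gt_c, pred_c).sum()
        ious.append(inter / union)
    return float(np.mean(ious))
```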