SysCV / sam-hq

Segment Anything in High Quality [NeurIPS 2023]
https://arxiv.org/abs/2306.01567
Apache License 2.0

The AP on LVIS measured with the public model is inconsistent with the AP in the paper. #115

Open yaohusama opened 6 months ago

yaohusama commented 6 months ago

When evaluating AP on the LVIS dataset, do you combine the output of HQ-SAM with the output mask of SAM itself? Is SAM's prediction used to select the box, combined with the prediction score of the box? When using ViTDet to get the detection boxes, is the detection head Mask R-CNN or Cascade R-CNN? I use Mask R-CNN as the detection head, and the AP of HQ-SAM-L is 45.289, while the paper reports 43.9. Why is my measurement different from the one in the paper? Thanks.

ymq2017 commented 6 months ago

Hi, we use Cascade R-CNN with this config. For evaluation, we simply use all predicted bounding boxes as prompts, without combining scores or using the output mask as an additional prompt.
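
For anyone trying to reproduce this, here is a rough sketch of that evaluation flow as I read it: every detector box becomes exactly one box prompt, and nothing is re-scored or re-prompted. This is not the authors' evaluation script. The function names (`boxes_to_results`, `fill_box`), the result dict layout, and the choice to carry the detector score over unchanged are all assumptions for illustration. `fill_box` is only a stand-in for the real HQ-SAM call so the snippet runs without a checkpoint.

```python
from typing import Callable, Dict, List, Sequence

Box = Sequence[float]   # [x0, y0, x1, y1] in pixels
Mask = List[List[int]]  # binary mask as nested lists (illustrative only)


def boxes_to_results(
    image_id: int,
    detections: List[Dict],
    segment_with_box: Callable[[Box], Mask],
) -> List[Dict]:
    """Turn every detector box into one mask prediction.

    All boxes are used as prompts (no filtering), each box is segmented
    once, and the predicted mask is never fed back as a second prompt.
    """
    results = []
    for det in detections:
        mask = segment_with_box(det["bbox"])  # single pass, box prompt only
        results.append(
            {
                "image_id": image_id,
                "category_id": det["category_id"],
                # Assumption: keep the detector score as-is, i.e. do not
                # mix in any mask quality score from the segmenter.
                "score": det["score"],
                "segmentation": mask,
            }
        )
    return results


def fill_box(box: Box, height: int = 4, width: int = 4) -> Mask:
    """Hypothetical stand-in for the segmenter: fills the box region."""
    x0, y0, x1, y1 = (int(v) for v in box)
    return [
        [1 if x0 <= x < x1 and y0 <= y < y1 else 0 for x in range(width)]
        for y in range(height)
    ]


detections = [
    {"bbox": [1, 1, 3, 3], "category_id": 7, "score": 0.9},
    {"bbox": [0, 0, 2, 2], "category_id": 3, "score": 0.2},
]
results = boxes_to_results(image_id=42, detections=detections, segment_with_box=fill_box)
```

In a real run, `segment_with_box` would wrap the box-prompted HQ-SAM predictor, and the masks would be encoded in whatever format the LVIS evaluator expects. If your pipeline mixes the mask score into the box score, or feeds the output mask back as a prompt, the AP is not directly comparable to a run done this way.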