The AP on LVIS measured with the public model is inconsistent with the AP in the paper.

SysCV / sam-hq

Segment Anything in High Quality [NeurIPS 2023]

https://arxiv.org/abs/2306.01567

Apache License 2.0

The AP on LVIS measured with the public model is inconsistent with the AP in the paper. #115

Open yaohusama opened 11 months ago

yaohusama commented 11 months ago

When evaluating AP on the LVIS dataset, do you combine the output of HQ-SAM with SAM's own output mask? Is SAM's predicted score combined with the detector's score for each box when selecting boxes? When using ViTDet to get the detection boxes, is the detection head Mask R-CNN or Cascade R-CNN? I used Mask R-CNN as the detection head and got an AP of 45.289 for HQ-SAM-L, while the paper reports 43.9. Why does my measurement differ from the one in the paper? Thanks.

ymq2017 commented 11 months ago

Hi, we use Cascade R-CNN with this config. For evaluation, we simply use all predicted bounding boxes as prompts, without combining scores or using the output mask as an additional prompt.
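
For readers trying to reproduce the number, here is a minimal sketch of the protocol described above. It is not the authors' evaluation script. The function name `boxes_to_mask_results`, the `predict_mask` callable, and the detection dict layout are all hypothetical and chosen only for illustration. In a real run `predict_mask` would wrap the HQ-SAM predictor with a box prompt, and the masks would be encoded in whatever format the LVIS evaluator expects.

```python
from typing import Callable, Dict, List, Sequence

Box = Sequence[float]  # assumed layout: [x0, y0, x1, y1] in pixels


def boxes_to_mask_results(
    detections: List[Dict],
    predict_mask: Callable[[int, Box], object],
) -> List[Dict]:
    """Turn every detector box into one mask prediction.

    `predict_mask` is a hypothetical callable wrapping the segmentation
    model: given an image id and a box prompt, it returns one mask.
    """
    results = []
    for det in detections:  # no filtering: every predicted box is a prompt
        mask = predict_mask(det["image_id"], det["bbox"])
        results.append(
            {
                "image_id": det["image_id"],
                "category_id": det["category_id"],
                "segmentation": mask,
                # The detector score is passed through unchanged; it is not
                # combined with any mask-quality score from the segmenter.
                "score": det["score"],
            }
        )
    return results


if __name__ == "__main__":
    # Toy stand-in for the model: echoes the box prompt back as the "mask".
    def fake_predict_mask(image_id, box):
        return {"box_prompt": list(box)}

    dets = [
        {"image_id": 1, "category_id": 7, "bbox": [0, 0, 10, 10], "score": 0.9},
        {"image_id": 1, "category_id": 3, "bbox": [5, 5, 20, 20], "score": 0.05},
    ]
    for r in boxes_to_mask_results(dets, fake_predict_mask):
        print(r["category_id"], r["score"])
```

Under this protocol both the masks and their ranking scores come from the detector's boxes, so a different detection head (for example Mask R-CNN instead of Cascade R-CNN) feeds different prompts and scores into the evaluation and can be expected to give a different AP.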