Haiyang-W / GiT

[ECCV2024 Oral🔥] Official Implementation of "GiT: Towards Generalist Vision Transformer through Universal Language Interface"
https://arxiv.org/abs/2403.09394
Apache License 2.0

Implementation detail for COCO mAP calculation #13

Closed peiwang062 closed 1 month ago

peiwang062 commented 1 month ago

First, thanks for sharing the code and models. I'm wondering how the COCO mAP metric was calculated in this paper, given that the decoder output comes from next-token prediction. Traditionally, computing mAP requires a confidence score for each predicted box, so how is this score obtained from a next-token prediction output? Or am I missing something?

nnnth commented 1 month ago

We use the token logits as the confidence score. Code details are here.
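As a rough illustration of the idea (not the repository's actual code), here is a minimal sketch of how a per-box confidence could be derived from token logits. It assumes that each predicted box comes with the logits emitted at its class-token step, and that the softmax probability of the selected class token serves as the score. The helper names `softmax` and `score_box`, the data layout, and the logit values are all hypothetical.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of raw logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def score_box(class_logits):
    """Return (class_id, confidence) from the logits at the class-token step.

    Assumption: the confidence is the softmax probability of the
    highest-scoring class token. The real implementation may differ.
    """
    probs = softmax(class_logits)
    class_id = max(range(len(probs)), key=probs.__getitem__)
    return class_id, probs[class_id]

# Hypothetical decoder outputs: one box plus class-token logits per detection.
detections = [
    {"box": (30, 40, 90, 100), "class_logits": [0.1, 0.2, 0.3]},
    {"box": (10, 20, 110, 220), "class_logits": [2.0, 0.5, -1.0]},
]

scored = []
for det in detections:
    class_id, conf = score_box(det["class_logits"])
    scored.append({"box": det["box"], "category": class_id, "score": conf})

# mAP evaluation ranks detections by score, so sort in descending order.
scored.sort(key=lambda d: d["score"], reverse=True)

for d in scored:
    print(d["category"], round(d["score"], 3), d["box"])
```

Each entry in `scored` then carries the box, category, and score that a COCO-style evaluator expects to rank detections and trace the precision-recall curve.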

peiwang062 commented 1 month ago

Thanks for the reply. I've closed the issue.