IDEA-Research / GroundingDINO

[ECCV 2024] Official implementation of the paper "Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection"
https://arxiv.org/abs/2303.05499
Apache License 2.0

Question about data leak #202

Open Ksky001 opened 1 year ago

Ksky001 commented 1 year ago

Hi, thanks for your wonderful work!

From the paper (Tables 2 and 5): in some settings the pre-training data includes data from the same dataset that is used for testing. For example, in the final row of Table 2 the pre-training data includes COCO and the test dataset is COCO; in the final three rows of Table 5 the pre-training data includes RefCOCO and the test dataset is RefCOCO.

Q1: Does this pre-training data contain objects or categories that also appear in the test dataset, i.e. is there a data leak?

Q2: What is the difference between zero-shot and fine-tuning in this situation?