Show-han / Zeroshot_REC

Official code for Zero-shot Referring Expression Comprehension via Structural Similarity Between Images and Captions (CVPR 2024)
Apache License 2.0

Incorrect reproduction results #7

Closed: iron-wo closed this issue 3 weeks ago

iron-wo commented 3 weeks ago

I want to reproduce the paper's results for the method that uses CLIP without fine-tuning, as shown in the figure below. [image]

I use the following command:
"""
python eval_refcoco/main.py --input_file /REC/Zeroshot_REC/data/reclip_data/input_file/refcocog_val.jsonl --image_root /REC/Zeroshot_REC/data/train2014 --method matching --clip_model ViT-B/32 --triplets_file /REC/Zeroshot_REC/data/reclip_data/triplets_file/gpt_refcocog_val.jsonl --detector_file /REC/Zeroshot_REC/data/reclip_data/detector_file/refcocog_dets_dict.json
"""
The results I obtain differ from those in the paper. [image]

Is there something wrong with the command I ran?

Show-han commented 3 weeks ago

I think you should add the --rule_filter flag to your command.
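
For reference, the suggested fix could look like the sketch below: the command from the question with `--rule_filter` appended. It assumes the flag is a boolean switch that takes no value, which this thread does not confirm. The `ROOT` and `CMD` variables are only shorthand introduced here; the paths are the ones from the original command. The script prints the assembled command so it can be checked before running it from the repository root.

```shell
# Data root taken from the paths in the original command.
ROOT=/REC/Zeroshot_REC/data

# Original command plus --rule_filter.
# Assumption: --rule_filter is an on/off switch with no argument.
CMD="python eval_refcoco/main.py \
  --input_file $ROOT/reclip_data/input_file/refcocog_val.jsonl \
  --image_root $ROOT/train2014 \
  --method matching --clip_model ViT-B/32 \
  --triplets_file $ROOT/reclip_data/triplets_file/gpt_refcocog_val.jsonl \
  --detector_file $ROOT/reclip_data/detector_file/refcocog_dets_dict.json \
  --rule_filter"

# Dry run: print the command for review instead of executing it.
echo "$CMD"
```

If the printed command looks right, paste it into a shell opened at the repository root.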