billpsomas / rscir

Official PyTorch implementation and benchmark dataset for IGARSS 2024 ORAL paper: "Composed Image Retrieval for Remote Sensing"
https://arxiv.org/abs/2405.15587
Apache License 2.0

Duplicates in the PatternCom Dataset #6

Closed jian-rookie closed 3 weeks ago

jian-rookie commented 3 weeks ago

In quantity.csv, nine images appear twice with different attribute values. For example, wastewaterplant270.jpg is annotated with both "three" and "one" for the Quantity attribute.

The nine images are: wastewaterplant270.jpg, wastewaterplant450.jpg, storagetank020.jpg, wastewaterplant271.jpg, wastewaterplant272.jpg, wastewaterplant327.jpg, storagetank225.jpg, storagetank520.jpg, wastewaterplant269.jpg.
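For anyone who wants to reproduce the check, here is a minimal sketch of how such conflicting rows could be found. It assumes the image name is in the first column and the attribute value in the second; the actual layout of quantity.csv may differ. The in-memory sample is a toy stand-in, not the real file, and `find_conflicting_annotations` and `storagetank001.jpg` are hypothetical names used only for illustration.

```python
import csv
import io
from collections import defaultdict


def find_conflicting_annotations(rows, name_col=0, value_col=1):
    """Map each file name to its distinct values, keeping only names with more than one.

    The column positions are assumptions; adjust them to the real CSV layout.
    """
    values_by_name = defaultdict(set)
    for row in rows:
        # Skip blank or short rows that lack one of the two columns.
        if len(row) <= max(name_col, value_col):
            continue
        values_by_name[row[name_col].strip()].add(row[value_col].strip())
    return {
        name: sorted(values)
        for name, values in values_by_name.items()
        if len(values) > 1
    }


# Toy stand-in for quantity.csv (assumed "name,value" layout, no header row).
sample_csv = io.StringIO(
    "wastewaterplant270.jpg,three\n"
    "wastewaterplant270.jpg,one\n"
    "storagetank001.jpg,two\n"  # hypothetical file name
)
conflicts = find_conflicting_annotations(csv.reader(sample_csv))
print(conflicts)  # {'wastewaterplant270.jpg': ['one', 'three']}
```

To run it on the real file, pass `csv.reader(open("quantity.csv", newline=""))` instead of the sample, and skip the header row first if the file has one.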

billpsomas commented 3 weeks ago

Hello again @jian-rookie,

Thanks for raising this issue. These duplicates are highly ambiguous images, so annotating them with two values was a design choice. In most of these cases, the whole object is not depicted in the image, which makes the quantity ambiguous. If you spot any other annotation issue, don't hesitate to tell me about it.

jian-rookie commented 3 weeks ago

Thanks for your quick reply, I've got it. @billpsomas When any of the above nine images is used as a query, it will also be used as a target image. However, since there are only a few of these images, I think it should not have much impact on the evaluation results.
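To make the query-as-target effect concrete, here is a small sketch of one common way to measure it: compute a hit-at-k score with and without the query image excluded from the ranking. This is not the repository's evaluation code. `hit_at_k`, the ranking, and the file names other than wastewaterplant270.jpg are hypothetical and chosen only to illustrate the point.

```python
def hit_at_k(query_name, ranked_names, target_names, k, exclude_query=False):
    """Return True if any of the top-k ranked images is a target.

    With exclude_query=True, the query image is dropped from both the
    ranking and the target set before scoring.
    """
    targets = set(target_names)
    if exclude_query:
        ranked_names = [name for name in ranked_names if name != query_name]
        targets.discard(query_name)
    return any(name in targets for name in ranked_names[:k])


query = "wastewaterplant270.jpg"
# Hypothetical ranking in which the query image itself comes first.
ranking = [
    "wastewaterplant270.jpg",
    "wastewaterplant111.jpg",
    "wastewaterplant222.jpg",
]
# The query is also a target because it carries two Quantity values.
targets = ["wastewaterplant270.jpg", "wastewaterplant222.jpg"]

print(hit_at_k(query, ranking, targets, k=1))                      # True
print(hit_at_k(query, ranking, targets, k=1, exclude_query=True))  # False
```

Comparing the two settings over all queries would show how much the nine images actually shift the reported numbers.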