billpsomas / rscir

Official PyTorch implementation and benchmark dataset for IGARSS 2024 ORAL paper: "Composed Image Retrieval for Remote Sensing"
https://arxiv.org/abs/2405.15587
Apache License 2.0

Confusion about the differences between the paper and the code #1

Closed · jian-rookie closed this issue 1 month ago

jian-rookie commented 1 month ago

Hi! This is nice work. However, when reading the paper and reviewing the code, I found an inconsistency between the two. In the paper, composed image retrieval for remote sensing is defined as follows: given a reference image (Class A, Attribute $\alpha$) and a modified attribute $\beta$, the task aims to retrieve the target images (Class A, Attribute $\beta$). However, when I reproduce the code, I find that the reference image and the target images may not belong to the same Class A. For example, given the reference image (tenniscourt723.jpg, Class tenniscourt, Attribute blue) and the modified attribute (gray), images of Class nursinghome with Attribute gray are also regarded as target images in the code implementation. This is not aligned with the task definition in the paper.
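To make sure I am reading the definition correctly, here is a minimal sketch of the correctness check I would expect from the paper; the dictionary keys and the helper name are only illustrative and not taken from the repository code:

```python
# Hypothetical sketch of the ground truth implied by the paper's definition.
# `reference` and `candidate` are illustrative dicts, not the repository's data structures.

def is_correct_target(reference, candidate, modified_attribute):
    # A target must keep the reference class AND carry the modified attribute.
    return (candidate["class"] == reference["class"]
            and candidate["attribute"] == modified_attribute)

# Under this check, (Class nursinghome, Attribute gray) would NOT be a valid target
# for a query whose reference image is (Class tenniscourt, Attribute blue) + 'gray'.
```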

Looking forward to the reply.

billpsomas commented 1 month ago

Hello @jian-rookie :)

Thanks a lot for the issue. You're actually right. Initially, the dataset did not have the same attribute value (e.g. 'gray') for two different classes, so the evaluation was correct. However, after the extension we made, you will see in the attached table that the attributes 'color' and 'quantity' have values that are repeated across different classes. For example, as you said, 'gray' appears in both nursinghome and tenniscourt. In that case, only the results that share the class of the query should be counted as correct.
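For clarity, here is a rough sketch of what the corrected relevance labels should look like; the variable names are illustrative and not the exact ones used in the repository:

```python
import numpy as np

def relevance_labels(query_class, modified_attribute, candidate_classes, candidate_attributes):
    # A candidate counts as a correct target only if it matches both the query's
    # class and the modified attribute value; same-attribute images from other
    # classes (e.g. 'gray' nursinghome for a tenniscourt query) are excluded.
    candidate_classes = np.asarray(candidate_classes)
    candidate_attributes = np.asarray(candidate_attributes)
    return (candidate_classes == query_class) & (candidate_attributes == modified_attribute)
```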

I fixed and pushed the code. Don't hesitate to raise any other issue or ask for anything you might need.

Thanks a lot :)

jian-rookie commented 1 month ago

Thanks for the quick response! It's very helpful.