SimonRaviv / VAE-QA

Assessing Image Quality Using a Simple Generative Representation

Data Leakage Issue #1

Closed luke-ck closed 3 months ago

luke-ck commented 3 months ago

Hi guys,

First of all, great job on the model architecture; your approach is interesting.

However, when running the evaluation with your splits, I noticed that you did not restrict LIVE to only its unique reference images (10 out of 29), so images from the train set end up in the test set. For example, you train on i05 from TID (the bikes image in LIVE) and test on the same bikes image with the same distortion. I think a better approach would be to treat the overlap as is done here.

SimonRaviv commented 3 months ago

Hi @luke-ck,

Thanks!

We do make sure that each dataset is split by reference images. Do you mean it would be good to make sure the same image does not appear in the train/test sets of two different datasets?
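
For readers unfamiliar with the term, a split "by reference images" can be sketched roughly as below. This is a minimal illustration, not the repository's actual code: the `(reference_id, distorted_path)` sample layout, the function name `split_by_reference`, and the `test_fraction` and `seed` parameters are all assumptions made for the example.

```python
import random


def split_by_reference(samples, test_fraction=0.2, seed=0):
    """Split samples so that no reference image appears in both train and test.

    `samples` is a list of (reference_id, distorted_path) pairs; this layout
    is hypothetical and chosen only for illustration.
    """
    # Collect the distinct reference ids and shuffle them reproducibly.
    refs = sorted({ref for ref, _ in samples})
    rng = random.Random(seed)
    rng.shuffle(refs)

    # Hold out a fraction of the reference ids (at least one) for testing.
    n_test = max(1, round(len(refs) * test_fraction))
    test_refs = set(refs[:n_test])

    # Every distorted version follows its reference into the same split.
    train = [s for s in samples if s[0] not in test_refs]
    test = [s for s in samples if s[0] in test_refs]
    return train, test
```

Such a split only guards against leakage inside one dataset; it does not see that two datasets may share the same reference content under different names.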

luke-ck commented 3 months ago

Hi,

Yes. Since the reference image content is the same, each overlapping reference image should be assigned to one dataset or the other. You could assign them to TID2013, for example, so that you do not test on reference images seen during training.
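
One way this could look in practice is sketched below, under stated assumptions. Only the bikes/i05 pair comes from this thread; the `LIVE_TO_TID` mapping name, the function name, and the `(reference_id, distorted_path)` sample layout are hypothetical, and a complete mapping of overlapping references would have to be built separately.

```python
# Mapping from a LIVE reference id to the TID2013 reference id showing the
# same content. Only the bikes/i05 pair is taken from the discussion; the
# remaining overlapping pairs are not listed here and would need to be added.
LIVE_TO_TID = {"bikes": "i05"}


def remove_cross_dataset_overlap(tid_train_refs, live_test_samples,
                                 live_to_tid=LIVE_TO_TID):
    """Drop LIVE test samples whose reference content was used in TID2013 training.

    `live_test_samples` is a list of (reference_id, distorted_path) pairs;
    this layout is an assumption made for the example.
    """
    seen = set(tid_train_refs)
    return [
        (ref, path)
        for ref, path in live_test_samples
        # References without a TID2013 counterpart map to None and are kept.
        if live_to_tid.get(ref) not in seen
    ]
```

Filtering the test side rather than the training side is just one option; the same mapping could equally be used to keep the overlapping references out of training instead.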

SimonRaviv commented 3 months ago

I see.

We don't do this kind of separation in cross-dataset evaluation, since the results would no longer be comparable to those in other papers.

luke-ck commented 3 months ago

Alright, I will close this issue then.