filipradenovic / cnnimageretrieval-pytorch

CNN Image Retrieval in PyTorch: Training and evaluating CNNs for Image Retrieval in PyTorch
http://cmp.felk.cvut.cz/cnnimageretrieval
MIT License

Are query images supposed to be in the database also? #48

Open FrederikWarburg opened 5 years ago

FrederikWarburg commented 5 years ago

Hi, when I look at the test code for oxford5k, it seems that the query images are also in the database. The database and query image lists are built as follows:

cfg = configdataset(dataset, os.path.join(get_data_root(), 'test'))
images = [cfg['im_fname'](cfg,i) for i in range(cfg['n'])]
qimages = [cfg['qim_fname'](cfg,i) for i in range(cfg['nq'])]

When I run

for im in qimages:
    if im in images:
        print(im)

it seems that all the query images are also in the database. As far as I understand, they are not supposed to be in both places. I am probably missing something, so a clarification would be appreciated. Thank you!
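
The same check can be written as a set intersection, which also shows how many query images overlap with the database. This is only a minimal sketch: the file paths are hypothetical stand-ins for the `images` and `qimages` lists built from `cfg`, and `query_overlap` is an illustrative helper, not part of the toolbox.

```python
def query_overlap(qimages, images):
    """Return the query paths that also appear in the database list, sorted."""
    return sorted(set(qimages) & set(images))


# Hypothetical paths standing in for the lists built from cfg above.
images = ["data/jpg/a.jpg", "data/jpg/b.jpg", "data/jpg/c.jpg"]
qimages = ["data/jpg/a.jpg", "data/jpg/c.jpg"]

overlap = query_overlap(qimages, images)
print(f"{len(overlap)} of {len(qimages)} query images are also in the database")
```

If the count equals `len(qimages)`, every query image is also a database image.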

filipradenovic commented 5 years ago

The full-size query images are indeed in the database, but at query time they are cropped before the search, so the cropped versions are not exactly in the database. That is how the authors of the original Oxford Buildings dataset paper designed it.
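
To illustrate why the cropped query is not identical to the database image, here is a toy sketch using a nested list as the image. It is not the toolbox's actual cropping code: the `crop` helper, the bounding-box format `(x0, y0, x1, y1)`, and the pixel values are all assumptions made for the example.

```python
def crop(image, bbx):
    """Crop a row-major image (list of rows) to bbx = (x0, y0, x1, y1).

    The box is treated as end-exclusive; this convention is an assumption
    made for the sketch.
    """
    x0, y0, x1, y1 = bbx
    return [row[x0:x1] for row in image[y0:y1]]


# Toy 4x4 "image" whose pixel values encode their position (10 * y + x).
full_image = [[10 * y + x for x in range(4)] for y in range(4)]
bbx = (1, 1, 3, 3)  # hypothetical bounding box around the query object

query = crop(full_image, bbx)
print(query)                # [[11, 12], [21, 22]]
print(query == full_image)  # False: the query is only a region of the database image
```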

Even so, we do recognize that this is a problem, so we proposed the Revisited Oxford and Paris datasets (R-Oxford and R-Paris), in which query images are not part of the database. The datasets also include additional improvements; see the paper for details.

To use these revisited datasets in this toolbox, use 'roxford5k' and 'rparis6k'.