filipradenovic / cnnimageretrieval

CNN Image Retrieval in MatConvNet: Training and evaluating CNNs for Image Retrieval in MatConvNet
http://cmp.felk.cvut.cz/cnnimageretrieval
MIT License
190 stars 54 forks

fix reading of images with dots in the filename #11

Open carandraug opened 2 years ago

carandraug commented 2 years ago

If an image has a dot in the filename, and the filename is listed on the groundtruth imlist without the file extension, then its filepath is incorrectly constructed without the file extension. The issue is in config_imname (and config_qimname):

function fname = config_imname (cfg, i)
%----------------------------------------------------
  [~, ~, ext] = fileparts(cfg.imlist{i});
  if isempty(ext)
    fname = sprintf ('%s/jpg/%s%s', cfg.dir_data, cfg.imlist{i}, cfg.ext);
  else
    fname = sprintf ('%s/jpg/%s', cfg.dir_data, cfg.imlist{i});
  end

If the filename in imlist has a dot and no file extension, for example Henry_Moore_Three_Way_Piece_No._2_0000, then fileparts returns ._2_0000 as the file extension. Because ext is not empty, the code takes the else branch and constructs the filepath without appending cfg.ext.
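The behaviour described above can be reproduced with a short script. This is a minimal sketch: the value of dir_data is a hypothetical placeholder, and the sprintf call only mirrors the else branch of the function quoted earlier.

```matlab
% Basename taken from the example above: it has a dot but no file extension.
name = 'Henry_Moore_Three_Way_Piece_No._2_0000';

% fileparts splits at the last dot, so everything after "No" is treated
% as the extension (the returned extension includes the leading dot).
[~, base, ext] = fileparts(name);

% isempty(ext) is false here, so the original code takes the else branch.
has_ext = ~isempty(ext);

% Hypothetical data directory, for illustration only.
dir_data = 'data';

% Mirrors the else branch: the path is built without the real extension.
fname = sprintf('%s/jpg/%s', dir_data, name);
disp(fname);
```

The resulting path ends in _2_0000 rather than in the image's real extension, so reading the image fails.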

I don't think fileparts can be used to decide whether a string has a file extension. It is useful for splitting a filepath into its components, but one needs to know a priori whether the string is a basename or a full filename.

There are two ways to fix this: either we expect imlist to contain filenames (with file extension) or basenames (without file extension). Since the groundtruth files provided by the project for oxford5k and paris6k do not include the file extension in imlist, I'm assuming that is preferred. This commit removes the logic that handles imlist and qimlist entries with file extensions, so groundtruth files are now required to list basenames only.
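As a rough sketch of the simplified logic (not the actual diff of this commit), the path can be built by always appending cfg.ext. It is written here as an anonymous function so that the snippet runs as a standalone script. The cfg values are hypothetical, and it is assumed that cfg.ext includes the leading dot, as the format string in the original code suggests.

```matlab
% Sketch only: always append cfg.ext, with no fileparts-based guess.
config_imname = @(cfg, i) sprintf('%s/jpg/%s%s', cfg.dir_data, cfg.imlist{i}, cfg.ext);

% Hypothetical configuration for illustration.
cfg = struct();
cfg.dir_data = 'data';
cfg.ext = '.jpg';   % assumed to include the leading dot
cfg.imlist = {'Henry_Moore_Three_Way_Piece_No._2_0000'};

fname = config_imname(cfg, 1);
disp(fname);
```

With this approach a dot inside the basename no longer matters, at the cost of requiring every imlist entry to omit the extension.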