mahmoudnafifi / C5

Reference code for the paper "Cross-Camera Convolutional Color Constancy" (ICCV 2021)
Apache License 2.0

Testing results #14

Open RafiqueA03 opened 10 months ago

RafiqueA03 commented 10 months ago

Hi Mahmoud, for my work, one of my tasks is to reproduce the results reported in your paper. To this end, I am testing the provided pre-trained model on the Intel-TAU dataset (7022 images) with m=7. Since the images are already black-level subtracted, as mentioned on the dataset website, I am directly passing resized PNG images (384×256) to the model along with the corresponding .json files (illuminant information). I am also using cross-validation, but without the G multiplier for testing. My obtained results are: Mean: 2.61, Median: 1.77, Best25: 0.57, Worst25: 1.44, Worst05: 2.16, Tri: 1.95, and Max: 28.39.
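For reference, this is a minimal sketch of how I aggregate these statistics (assuming `errors` holds the per-image angular errors in degrees; the function names are my own, not the repo's evaluation API):

```python
import numpy as np

def angular_error(gt, est):
    """Angular error in degrees between ground-truth and estimated illuminants."""
    gt = gt / np.linalg.norm(gt, axis=-1, keepdims=True)
    est = est / np.linalg.norm(est, axis=-1, keepdims=True)
    cos = np.clip(np.sum(gt * est, axis=-1), -1.0, 1.0)
    return np.degrees(np.arccos(cos))

def summarize(errors):
    """Standard color-constancy statistics over per-image angular errors."""
    e = np.sort(np.asarray(errors, dtype=np.float64))
    n = len(e)
    q1, q2, q3 = np.percentile(e, [25, 50, 75])
    return {
        'Mean': e.mean(),
        'Median': q2,
        'Best25': e[: n // 4].mean(),                    # mean of the best 25%
        'Worst25': e[-(n // 4):].mean(),                 # mean of the worst 25%
        'Worst05': e[-max(1, round(0.05 * n)):].mean(),  # mean of the worst 5%
        'Tri': (q1 + 2 * q2 + q3) / 4,                   # trimean
        'Max': e.max(),
    }
```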

Most of these are close to the reported results, but Worst25 deviates a lot. As per my understanding, one reason could be the random sample selection inherent in cross-validation. Is that so, or is there another important step I am missing?

One more thing: during testing I did not mask out the color checker present in the scenes, which you mention doing in the paper. Could you please provide details on how you did that? I assume that masking requires knowing the coordinates of the color checker in each scene.

mahmoudnafifi commented 10 months ago

Hi, there was a mistake in the evaluation script provided in this repo. It does not affect the reported results, as the script used for evaluation in the paper did not have this bug. I have fixed it in the public evaluation code here. Please try again with the updated evaluation script, and let me know if you still encounter any issues with Worst25.

Regarding the second point: yes, we masked out the color-checker pixels in datasets that include color charts in the scenes. In those datasets, the chart coordinates are provided by the dataset authors. In the Intel-TAU case, there are no color charts present in the images, so this step is unnecessary.
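For datasets that do provide the chart coordinates, the masking itself is straightforward. A minimal sketch, using OpenCV's fillPoly with a hypothetical `chart_polygon` array of (x, y) corners (this is not the exact code used for the paper):

```python
import numpy as np
import cv2

def mask_color_checker(image, chart_polygon, fill_value=0):
    """Zero out color-chart pixels given the chart's corner coordinates.

    `chart_polygon` is an (N, 2) array of (x, y) corners, as provided
    with datasets that include color charts in the scenes.
    """
    mask = np.zeros(image.shape[:2], dtype=np.uint8)
    cv2.fillPoly(mask, [np.asarray(chart_polygon, dtype=np.int32)], 1)
    out = image.copy()
    out[mask.astype(bool)] = fill_value
    return out
```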

RafiqueA03 commented 10 months ago

Hi, thanks for the update. Now the obtained results are almost the same as the reported ones.

However, I have another question. The provided pre-trained model is trained with data_num (m) = 7. Does this mean that for testing, data_num must also be exactly 7? If I set data_num to anything less or greater than 7, I get the error "RuntimeError: Error(s) in loading state_dict for network:".

Could you please clarify: should the model be trained and tested with the same value of data_num? For example, in the training section of the paper, for m=9 it is mentioned that you "then randomly select eight additional input images for each query image from the training set for use as additional input images".

Also, how are the encoder blocks generated? From the visualization in the paper, it appears that for m input images there will be m encoder blocks.

mahmoudnafifi commented 10 months ago

Yes, testing should be performed with the same m, which refers to the number of randomly selected additional histograms plus the input histogram; m=7 means we use 6 additional histograms. The number of encoders is equal to m.
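This is also why you see the state_dict error: since there is one encoder branch per histogram, the network's parameter set depends on m, so a checkpoint trained with one m cannot be loaded into a model built with another. A toy illustration of the mismatch (this is not the actual C5 architecture, just a sketch of the mechanism):

```python
import torch.nn as nn

class ToyModel(nn.Module):
    """Toy network with one encoder per input histogram, so the
    parameter names/shapes in its state_dict depend on m."""
    def __init__(self, m):
        super().__init__()
        self.encoders = nn.ModuleList(
            [nn.Conv2d(2, 8, kernel_size=3, padding=1) for _ in range(m)]
        )

    def forward(self, hists):  # hists: list of m histogram tensors
        return sum(enc(h) for enc, h in zip(self.encoders, hists))

ckpt = ToyModel(m=7).state_dict()    # checkpoint saved with m=7
ToyModel(m=5).load_state_dict(ckpt)  # RuntimeError: Error(s) in loading state_dict
```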