yuanming-hu / fc4

Code and resources for "FC4 : Fully Convolutional Color Constancy with Confidence-weighted Pooling" (CVPR 2017)
MIT License

Re-implementation in Keras #18

Open hienpham15 opened 6 years ago

hienpham15 commented 6 years ago

Hi, since I'm trying to re-implement your code in Keras (Python 3.6), I'm opening this thread for some questions and advice.

  1. You define your input images as 512x512x3, but your SqueezeNet takes an input of 224x224x3. I'm confused; can you clarify this?

  2. Since you use the Adam optimizer, this part of the code (the training step named `train_step_sgd`) is unnecessary, right?

    # Two Adam optimizers: a reduced learning rate for the pre-trained
    # backbone (var_list1) and the full rate for the new layers (var_list2).
    opt1 = tf.train.AdamOptimizer(self.learning_rate * FINE_TUNE_LR_RATIO)
    opt2 = tf.train.AdamOptimizer(self.learning_rate)
    # Compute all gradients in one pass, then split them per variable group.
    grads = tf.gradients(self.total_loss, var_list1 + var_list2)
    grads1 = grads[:len(var_list1)]
    grads2 = grads[len(var_list1):]
    train_op1 = opt1.apply_gradients(zip(grads1, var_list1))
    train_op2 = opt2.apply_gradients(zip(grads2, var_list2))
    # Despite the name, this op applies Adam updates, not SGD.
    self.train_step_sgd = tf.group(train_op1, train_op2)
  3. Did you train your SqueezeNet from scratch or use the weights from the pretrained SqueezeNet model?

  4. When you perform the data augmentation:

Bonus question: Is it necessary to mask out the MCCs? I see no reason for this either.
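
Regarding question 2, the point of keeping two optimizers is only to give the pre-trained variables a smaller learning rate than the new layers. A minimal numpy sketch of that idea (toy loss, plain gradient steps; the variable groups and the gradient here are illustrative, not taken from the repo):

```python
import numpy as np

# Hypothetical toy setup mirroring FINE_TUNE_LR_RATIO: the pre-trained
# "backbone" group gets a reduced learning rate, the new "head" the full rate.
FINE_TUNE_LR_RATIO = 0.1
learning_rate = 1e-2

backbone = np.array([1.0, 2.0])  # stands in for var_list1 (pre-trained)
head = np.array([0.5])           # stands in for var_list2 (new layers)

# Toy loss = sum of squared weights, so its gradient w.r.t. each weight is 2*w.
grad_backbone = 2 * backbone
grad_head = 2 * head

# One plain gradient step with per-group learning rates.
backbone -= learning_rate * FINE_TUNE_LR_RATIO * grad_backbone
head -= learning_rate * grad_head
```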

yuanming-hu commented 6 years ago

Thanks for the good questions. Just some quick answers:

  1. We use SqueezeNet as a fully convolutional network, so there is no constraint on the input resolution.
  2. Yes, we use Adam. We tried SGD, but it didn't give better results.
  3. We use the pre-trained model.
  4. We actually found these seemingly unrelated augmentations very helpful. One explanation is that our system largely benefits from semantic understanding and rotation helps here. When relighting the image using color_aug, we "relight" the ground truth illumination as well.
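
The relighting in point 4 can be sketched in a few lines of numpy: the same random per-channel gains are applied to the image and to its ground-truth illuminant, so the label stays consistent with the augmented image (the gain range below is an illustrative assumption, not the exact values from the repo):

```python
import numpy as np

def relight(image, illuminant, rng=np.random.default_rng(0)):
    """Apply identical random RGB gains to the image and its label."""
    gains = rng.uniform(0.6, 1.4, size=3)  # illustrative range
    return image * gains, illuminant * gains

img = np.ones((4, 4, 3))
gt = np.array([0.5, 0.8, 0.6])  # ground-truth illuminant (toy values)
aug_img, aug_gt = relight(img, gt)
# The per-channel ratio between augmented and original is the same for
# the image pixels and for the ground-truth illuminant.
```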

Bonus answer: Yes. I haven't tried not masking the MCCs, but if you keep them you will probably get a bunch of "color checker" detectors, which clearly don't generalize to cases where there are no MCCs.
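
To make answer 1 concrete: a fully convolutional network emits a per-location illuminant estimate at whatever spatial resolution the input produces, and those estimates are merged by confidence-weighted pooling. A minimal numpy sketch (shapes and names are illustrative), using the fact that the norm of each unnormalized estimate acts as its confidence, so summing unnormalized vectors equals confidence-weighting their directions:

```python
import numpy as np

def confidence_weighted_pool(semi_dense):
    """semi_dense: (H, W, 3) unnormalized per-location estimates.

    Summing the unnormalized vectors weights each direction by its L2
    norm (its confidence); the result is renormalized to unit length.
    """
    pooled = semi_dense.reshape(-1, 3).sum(axis=0)
    return pooled / np.linalg.norm(pooled)

# Works for any spatial size -- no fixed 224x224 input constraint.
est_small = confidence_weighted_pool(np.ones((7, 7, 3)))
est_large = confidence_weighted_pool(np.ones((15, 20, 3)))
```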

Please let me know if you have more questions.

hienpham15 commented 6 years ago

I have finished implementing your model in the Keras framework, though I made some adjustments, such as using VGG16 instead of SqueezeNet and dividing the images into patches and training on all of those patches... After training for 20 epochs on about 2000 patches (from 200 images) and testing on 160 images, I get the following results:

average_angular_error ~ 1.8, median_angular_error ~ 1.81

It surprises me that the median is higher than the one in your paper. Also, I noticed that your model (or at least your ideas in my Keras implementation) performs better on indoor scenes (compared with the CNN from Bianco and the Deep Specialized Net from Wu Shi). Here is my implementation; would you mind taking a look and commenting on whether I did it right? Thank you in advance.
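
For anyone comparing numbers: the angular error between an estimate and the ground-truth illuminant is the angle between the two RGB vectors, in degrees. A small numpy helper (the function name and test vectors are illustrative):

```python
import numpy as np

def angular_error(estimate, ground_truth):
    """Angle in degrees between two illuminant RGB vectors."""
    e = estimate / np.linalg.norm(estimate)
    g = ground_truth / np.linalg.norm(ground_truth)
    # Clip to guard against floating-point values just outside [-1, 1].
    return np.degrees(np.arccos(np.clip(np.dot(e, g), -1.0, 1.0)))

errors = [
    angular_error(np.array([1.0, 1.0, 1.0]), np.array([1.0, 1.0, 1.0])),
    angular_error(np.array([1.0, 0.0, 0.0]), np.array([0.0, 1.0, 0.0])),
]
# Mean/median over a test set would then be np.mean(errors), np.median(errors).
```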

yuanming-hu commented 6 years ago

Hello, and thanks for the implementation! The adjustments sound reasonable to me, and the average angular error you achieved is comparable to our implementation using AlexNet.

However, I'm also surprised that the median error is even higher than the mean error.

It's interesting that our approach performs better on indoor scenes. To be honest, I didn't draw this conclusion when doing this project; thanks for letting me know. One explanation is that indoor scenes contain more noise (textureless walls, light bulbs, varying illumination, etc.), which our approach deals with better.

Your implementation looks good (though I'm not very experienced with Keras). Again, the surprising thing is the high median angular error. One thing we can do is visualize the norm of the estimates to see whether the confidence values are reasonable.
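
The visualization suggested here amounts to splitting each unnormalized per-location estimate into a confidence (its L2 norm) and a direction (the unit-length chromaticity). A hedged numpy sketch, with illustrative shapes:

```python
import numpy as np

def split_confidence(semi_dense):
    """semi_dense: (H, W, 3) unnormalized estimates.

    Returns the confidence map (H, W) and the unit-length direction
    map (H, W, 3); the confidence map is what one would visualize.
    """
    conf = np.linalg.norm(semi_dense, axis=-1)
    # Guard against division by zero at fully unconfident locations.
    direction = semi_dense / np.maximum(conf[..., None], 1e-12)
    return conf, direction

conf, direction = split_confidence(np.full((2, 2, 3), 2.0))
```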

hienpham15 commented 6 years ago

After reading your supplementary material (for the FC4 paper) and the function get_visualization() in your code, I am quite confused about the size of the confidence map as well as the size of the semi-dense feature map.

As I understand,

yuanming-hu commented 6 years ago

Thanks for the questions.

yuanming-hu commented 6 years ago

Btw, 2041x1359 is too large for FC4. I think my code downsamples it by a factor of two, which actually results in a larger (and semantically more useful) receptive field.
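
A hedged sketch of that downsampling (a simple 2x2 box filter here; the repo may use a different resampler). Halving the resolution means each unit of receptive field in feature space covers twice as many input pixels per axis:

```python
import numpy as np

def downsample_2x(image):
    """Average non-overlapping 2x2 blocks; crops odd trailing rows/cols."""
    h, w = image.shape[0] // 2 * 2, image.shape[1] // 2 * 2
    img = image[:h, :w]
    return img.reshape(h // 2, 2, w // 2, 2, -1).mean(axis=(1, 3))

# A 2041x1359 input (as discussed above) becomes roughly 1020x679.
small = downsample_2x(np.zeros((1359, 2041, 3)))
```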

pdaniol commented 5 years ago

Hi! I've just started learning Keras, and I'm really interested in what this re-implementation looks like. @hienpham15, do you still have your source code? I would be really grateful if you could share it.