hienpham15 opened this issue 6 years ago
Thanks for the good questions. Just some quick answers:
In `color_aug`, we "relight" the ground truth illumination as well.

Bonus answer: Yes. I haven't tried not masking the MCCs, but if you keep them you will probably end up with a bunch of "color checker" detectors, which clearly don't generalize to cases where there are no MCCs.
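As a rough illustration (a sketch, not the actual FC4 `color_aug` code; the function name, gain range, and unit-norm convention are assumptions), relighting the image and the ground-truth illuminant with the same random per-channel gains might look like this:

```python
import numpy as np

def color_aug(image, illum, max_gain=1.2, rng=None):
    """Sketch of a relighting augmentation: scale the RGB channels by
    random gains and apply the same gains to the ground-truth illuminant,
    so the label stays consistent with the augmented image."""
    rng = np.random.default_rng() if rng is None else rng
    gains = rng.uniform(1.0 / max_gain, max_gain, size=3)
    aug_image = np.clip(image * gains, 0.0, 1.0)
    aug_illum = illum * gains
    aug_illum /= np.linalg.norm(aug_illum)  # keep the label unit-norm
    return aug_image, aug_illum
```

The key point is only that the label is transformed together with the pixels; the exact gain distribution is a free choice.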
Please let me know if you have more questions.
I have finished implementing your model in the Keras framework, though I made some adjustments, such as using VGG16 instead of SqueezeNet and dividing the images into patches and training on all of those patches. After training for 20 epochs with about 2000 patches (from 200 images) and testing on 160 images, I have the following results:

average_angular_error ~ 1.8, median_angular_error ~ 1.81

It surprises me that the median is higher than the one in your paper. Also, I noticed that your model (or at least your ideas in my Keras implementation) performs better on indoor scenes (compared with the CNN from Bianco and the Deep Specialized Net from Wu Shi). Here is my implementation; would you mind taking a look and giving me some comments on whether I did it right or not? Thank you in advance.
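For reference, the angular-error metric quoted above is conventionally the angle between the predicted and ground-truth illuminant vectors; a generic sketch (not tied to either implementation) is:

```python
import numpy as np

def angular_error_deg(pred, gt):
    """Angle in degrees between a predicted and a ground-truth
    illuminant vector -- the standard color-constancy error metric."""
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    cos = np.dot(pred, gt) / (np.linalg.norm(pred) * np.linalg.norm(gt))
    # clip guards against tiny floating-point overshoot outside [-1, 1]
    return np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))
```

The mean and median reported in the thread would then just be `np.mean` and `np.median` over the per-image errors.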
Hello and thanks for the implementation! The adjustments sound reasonable to me, and the achieved average angular error is comparable with our implementation using AlexNet.
However, I'm also surprised that the median error is even higher than the mean error.
It's interesting that our approach performs better on indoor scenes. To be honest, I didn't draw this conclusion when doing this project, so thanks for letting me know. One explanation is that indoor scenes contain more noise (textureless walls, light bulbs, varying illumination, etc.), which our approach deals with better.
Your implementation looks good (though I'm not very experienced with Keras). Again, the surprising thing is the high median angular error. One thing we can do is visualize the norm of the estimates to see whether the confidence values are reasonable.
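Concretely, that visualization could look like the sketch below. Here `semi_dense` stands for the (H, W, 3) array of unnormalized per-location RGB estimates; the function names are assumptions, but the norm-as-confidence reading follows the FC4 confidence-weighted pooling idea:

```python
import numpy as np

def confidence_map(semi_dense):
    """Per-location confidence = L2 norm of the unnormalized RGB
    estimate at that location."""
    return np.linalg.norm(semi_dense, axis=-1)

def pooled_illuminant(semi_dense):
    """Global estimate: sum the per-location vectors (so high-norm,
    i.e. high-confidence, locations dominate), then renormalize."""
    total = semi_dense.reshape(-1, 3).sum(axis=0)
    return total / np.linalg.norm(total)
```

Plotting `confidence_map(...)` as a heatmap next to the input image makes it easy to spot whether the network is trusting sensible regions.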
After reading your Supplementary Material (for the FC4 paper) and the function get_visualization()
in your code, I am quite confused about the size of the confidence map as well as the size of the semi-dense feature map.
As I understand it, the 'fc2' layer is also the semi-dense feature map, isn't it?
Also, if the above speculation is true, the output size of the 'fc2' layer is (w/32)x(h/32)x(-1), which is relatively small even for large input images (2041x1359); hence, the feature map is nowhere near the target_shape
(512x512). Have I been missing something, or is the size really that small?
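For concreteness, a total stride of 32 (five 2x downsampling stages) does give only a coarse grid of local estimates, even for large inputs. A hypothetical helper (ignoring padding and per-stage rounding details):

```python
def semi_dense_size(w, h, total_stride=32):
    """Approximate spatial size of the semi-dense feature map for a
    backbone with the given total stride (padding/rounding ignored)."""
    return w // total_stride, h // total_stride
```

For a 2041x1359 input this gives roughly a 63x42 grid, so the map is indeed far smaller than 512x512.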
And where did you get the values for `color_thresh = [250, 500, 750, 1000]`?
Thanks for the questions.
`fc2` should be the so-called semi-dense feature map.

Btw, 2041x1359 is too large for FC4. I think in my code I downsample it by a factor of two. This actually results in a larger (and semantically more useful) receptive field.
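The 2x downsampling mentioned here could be as simple as a box filter; the sketch below is an assumption for illustration, and the actual code may use a different resize method:

```python
import numpy as np

def downsample_2x(image):
    """Naive 2x box-filter downsample: average each 2x2 pixel block.
    Odd trailing rows/columns are dropped for simplicity."""
    h, w = image.shape[:2]
    h2, w2 = h - h % 2, w - w % 2
    blocks = image[:h2, :w2].reshape(h2 // 2, 2, w2 // 2, 2, -1)
    return blocks.mean(axis=(1, 3))
```

Halving each side quadruples the effective receptive field in image coordinates, which is the point being made above.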
Hi! I've just started to learn Keras and I am really interested in how this re-implementation looks. @hienpham15 do you still have your source code? I would be really grateful if you could share it.
Hi, since I'm trying to re-implement your code in Keras (Python 3.6), I'm opening this thread for some questions and advice.
You defined your input images as 512x512x3, but your SqueezeNet takes an input of 224x224x3. I'm confused; can you clarify this?
Since you used the Adam optimizer, the part of the code that uses SGD as the training optimizer is unnecessary, right?
Did you train your SqueezeNet from scratch or use the weights from a pretrained SqueezeNet model?
When you perform the data augmentation:
This will change the original ground-truth illumination without retaining the von Kries ratio, so my question is: why? What's your intuition behind this?
Bonus question: Is it necessary to mask out the MCCs? I also see no reason behind this.