tpfister / caffe-heatmap

Caffe with heatmap regression & spatial fusion layers. Useful for any CNN image position regression task.
http://www.robots.ox.ac.uk/~vgg/software/cnn_heatmap

Why do final heatmaps have values ~1, in contrast to gt gaussians (0 to 1.2) ? #14

Closed sunsibar closed 7 years ago

sunsibar commented 8 years ago

A question:

When using the pretrained model and the original heatmap net, the predicted heatmaps contain values that are all very close to 1. (Scaling the range 0.95 to 1.05 to the full "imshowable" range gives nice heatmaps that look just as you'd expect when shown with cv::imshow.) But if you look at the "data heatmap" layer, the generated label Gaussian blobs have values between 0 and 1: ~1 at the peak, decreasing to zero elsewhere, as you would expect of a Gaussian.

What did I miss? If the net is trained to predict these label blobs, the final heatmaps' values should be between 0 and 1, not between 0.95 and 1.05. One reason I could imagine is that the training happened with some sort of normalization or scaling layer between label and loss, could that be?

Thanks for any answer! And thank you for sharing your code in the first place.

gregary commented 8 years ago

@su-si data_heatmap.cpp, lines 656-658 (sigma = 1.5):

    float gaussian = ( 1 / ( sigma * sqrt(2 * M_PI) ) ) * exp( -0.5 * ( pow(i - y, 2.0) + pow(j - x, 2.0) ) * pow(1 / sigma, 2.0) );
    gaussian = 4 * gaussian;
    top_label[label_idx] = gaussian;

I found that this implementation of the 2D Gaussian differs from the standard formula.

And the max value of the label heatmap should be 1.0638.
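To check the 1.0638 figure, here is a minimal sketch that reproduces the arithmetic of the quoted snippet. The function names (`label_peak`, `label_value`, `normalized_2d_peak`) are made up for illustration and are not part of caffe-heatmap; the last one assumes that "the standard formula" means the normalised isotropic 2D Gaussian density.

```cpp
#include <cmath>

// Peak of the label Gaussian as written in the quoted snippet:
// a 1D normalisation constant 1 / (sigma * sqrt(2*pi)), multiplied by 4.
inline double label_peak(double sigma) {
    const double kPi = 3.14159265358979323846;
    return 4.0 / (sigma * std::sqrt(2.0 * kPi));
}

// Label value at offset (dx, dy) from the joint position, same formula.
inline double label_value(double dx, double dy, double sigma) {
    return label_peak(sigma) *
           std::exp(-0.5 * (dx * dx + dy * dy) / (sigma * sigma));
}

// Peak of a normalised isotropic 2D Gaussian density, for comparison only
// (assumption: this is the "standard formula" being referred to).
inline double normalized_2d_peak(double sigma) {
    const double kPi = 3.14159265358979323846;
    return 1.0 / (2.0 * kPi * sigma * sigma);
}
```

With sigma = 1.5, `label_peak` gives about 1.0638, while `normalized_2d_peak` gives about 0.0707. Either way the label decays towards 0 away from the peak, so the scaling factor alone only changes the maximum.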

bazilas commented 7 years ago

"gaussian = 4 * gaussian"

You magnify the Gaussian there. It's a necessary tweak to accelerate convergence.

sunsibar commented 7 years ago

Thank you for your answer, and sorry for the long wait. But this doesn't resolve the problem, does it? Why close the topic?

So the implementation has a different scaling factor. But regardless of scaling, one would expect values between 0 and some other value, right? ("the max value of label heatmap should be 1.0638" - so, between 0 and ~1). But there seems to be a bias also, compared to a gaussian.
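One way to make the "bias" observation concrete is to measure the value range of a predicted heatmap and compare it with the range of the label blobs. The sketch below assumes the heatmap has already been copied into a flat, non-empty `std::vector<float>`; `Range`, `value_range` and `rescale` are hypothetical helper names, not part of caffe-heatmap.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

struct Range {
    float lo;
    float hi;
};

// Smallest and largest value in a flattened heatmap (assumed non-empty).
inline Range value_range(const std::vector<float>& heatmap) {
    const auto mm = std::minmax_element(heatmap.begin(), heatmap.end());
    return Range{*mm.first, *mm.second};
}

// Linearly map the window [lo, hi] to [0, 1] for display,
// clamping anything that falls outside the window (assumes hi > lo).
inline std::vector<float> rescale(const std::vector<float>& heatmap,
                                  float lo, float hi) {
    std::vector<float> out(heatmap.size());
    for (std::size_t i = 0; i < heatmap.size(); ++i) {
        const float t = (heatmap[i] - lo) / (hi - lo);
        out[i] = std::min(1.0f, std::max(0.0f, t));
    }
    return out;
}
```

If `value_range` on a prediction returns a minimum near 0.95 while the label blobs start at 0, the difference is an additive offset rather than a scaling factor, which is the part of the question that the `4 * gaussian` explanation does not cover.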