Open machengcheng2016 opened 6 years ago
More specifically: if the pixel values in the label images are [0,1,2,3,4,5], the images look very dark, since the maximum value is 5, far from the bright 255. But when Caffe loads the label images, the pixel values are exactly 0, 1, 2, 3, 4, 5, which is correct for training! If the pixel values in the label images are [0,51,102,153,204,255], the images are easy for human eyes to distinguish. And perhaps Caffe's internal mechanism scales all [0,255] int images to [0,1] double images?
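If the labels were saved in the human-friendly [0,51,...,255] form, one option is to map them back to class indices before building the LMDB. The following is only a minimal sketch of that idea in plain Python, assuming 6 evenly spaced gray levels; the function name to_class_indices and the flat list of pixel values are hypothetical and not part of Caffe or of this repository.

```python
def to_class_indices(pixels, num_classes):
    """Map evenly spaced display values (0..255) back to class indices 0..N-1.

    Hypothetical helper: assumes the labels were stretched so that
    class k was stored as k * (255 // (num_classes - 1)).
    """
    if num_classes < 2:
        raise ValueError("need at least two classes")
    step = 255 // (num_classes - 1)  # 51 when num_classes == 6
    indices = []
    for value in pixels:
        index, remainder = divmod(value, step)
        # Reject anything that is not an exact multiple of the step.
        if remainder != 0 or not 0 <= index < num_classes:
            raise ValueError(f"unexpected label pixel value: {value}")
        indices.append(index)
    return indices


display_pixels = [0, 51, 102, 153, 204, 255]
class_indices = to_class_indices(display_pixels, num_classes=6)
print(class_indices)  # [0, 1, 2, 3, 4, 5]
```

The same mapping run in reverse (multiplying by the step) gives a copy of the labels that is easy to inspect by eye, while the [0,1,2,3,4,5] version is the one to use for training.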
If you have N labels, label_value should be 0,1,2,...,N-1. In my opinion, this is not about the transfer mechanism. Perhaps the reason is that the loop used when calculating accuracy looks like this: for (int i = 0; i < N; ++i). Maybe you can check accuracy_layer.cpp in Caffe for more details.
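To illustrate why the labels need to be 0..N-1, here is a rough sketch of a top-1 accuracy count in plain Python. It is not the code from accuracy_layer.cpp, only an assumption-laden illustration: the function name top1_accuracy and the list-of-lists score layout are made up for this example. The point is that the predicted class is always an index in 0..N-1, so a label such as 51 can never match it.

```python
def top1_accuracy(scores, labels):
    """Fraction of samples whose highest-scoring class equals the label.

    scores: list of per-sample score lists, each of length N (hypothetical layout).
    labels: list of integer class indices, expected to lie in 0..N-1.
    """
    num_classes = len(scores[0])
    correct = 0
    for score, label in zip(scores, labels):
        if not 0 <= label < num_classes:
            # A value such as 51 or 255 cannot be a valid index into N scores.
            raise ValueError(f"label {label} is outside 0..{num_classes - 1}")
        # The prediction is the index i in 0..N-1 with the largest score.
        predicted = max(range(num_classes), key=lambda i: score[i])
        if predicted == label:
            correct += 1
    return correct / len(labels)


scores = [[0.9, 0.1, 0.0], [0.1, 0.8, 0.1], [0.2, 0.2, 0.6]]
print(top1_accuracy(scores, [0, 1, 2]))  # 1.0
```

Passing labels like [0, 51, 102] to this sketch raises a ValueError, which mirrors the reason the stretched label images are unsuitable for training.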
Excuse me, it's me again. How should I prepare the labels for the training process? In train_ern.prototxt, I see the "label" blob is loaded from "/media/lh/D/Data/Part1_DB/train/tag" and the layer type is "Data", which means you have already converted the "label.png" images into LMDB format using a tool like "create_imagenet.sh". Right? So my only question is: what pixel values do the label images contain? [0,1,2,3,4,5] or [0,51,102,153,204,255]? Thank you!