DrSleep / tensorflow-deeplab-resnet

DeepLab-ResNet rebuilt in TensorFlow
MIT License

How can I get IMG_MEAN of a custom dataset? #146

Closed licaizi closed 6 years ago

licaizi commented 6 years ago

Dear @DrSleep, for a custom dataset, how can I get IMG_MEAN? Thank you.

siinem commented 6 years ago

@CaiziLee According to #106, it is computed from the training images.

siinem commented 6 years ago

@DrSleep I just wonder: is the mean vector in RGB or BGR order?

licaizi commented 6 years ago

@siinem So I compute the sum of each channel over the training images and then divide by the count? And is it necessary to recompute IMG_MEAN when processing the test samples?

DrSleep commented 6 years ago

You can calculate the values for each channel on the training set and use them for both training and inference. Whether it is RGB or BGR depends on the application; here, it is BGR, if I recall correctly.
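A minimal sketch of that calculation in NumPy, assuming the training images are already loaded as H x W x 3 arrays in RGB order. The function name `channel_mean_bgr` is hypothetical and is not part of this repository. It weights every pixel equally, which differs from averaging per-image means when the images have different sizes, and it returns the result in BGR order.

```python
import numpy as np

def channel_mean_bgr(images):
    """Per-channel mean over an iterable of HxWx3 RGB arrays, returned in BGR order.

    Hypothetical helper for illustration; every pixel carries equal weight.
    """
    channel_sum = np.zeros(3, dtype=np.float64)
    pixel_count = 0
    for img in images:
        img = np.asarray(img, dtype=np.float64)
        # Flatten the spatial dimensions and add up each channel.
        channel_sum += img.reshape(-1, 3).sum(axis=0)
        pixel_count += img.shape[0] * img.shape[1]
    if pixel_count == 0:
        raise ValueError("no pixels to average")
    mean_rgb = channel_sum / pixel_count
    # Reverse RGB -> BGR.
    return mean_rgb[::-1].astype(np.float32)

# Tiny synthetic "dataset": two constant-colour images of different sizes.
demo_images = [
    np.full((2, 3, 3), (10, 20, 30), dtype=np.uint8),
    np.full((4, 4, 3), (50, 60, 70), dtype=np.uint8),
]
demo_mean_bgr = channel_mean_bgr(demo_images)
print(demo_mean_bgr)
```

The same vector would then be used unchanged for training and for inference.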

siinem commented 6 years ago

@DrSleep @CaiziLee @arslan-chaudhry I have just computed the mean of the PASCAL VOC 2012 training images in MATLAB. The mean of the 10582 images listed in train.txt is [116.6234 111.5128 103.1481] in RGB order.

The mean value used in the published code, which I expected to be computed from the PASCAL VOC 2012 training images, is: IMG_MEAN = np.array((104.00698793,116.66876762,122.67891434), dtype=np.float32)

This is how I compute the mean values in MATLAB:

% Image list and dataset root.
path_imnames = '.\train.txt';
path_images = '.\VOCdevkit\VOC2012';

fid = fopen(path_imnames);
tline = fgetl(fid);

i=0;
while ischar(tline)
    i = i+1;
    % Keep the first space-separated field of the line (the image path).
    im_name = split(tline," ");
    im_name = im_name{1};

    im = imread(fullfile(path_images,im_name));
    ch1 = im(:,:,1);
    ch2 = im(:,:,2);
    ch3 = im(:,:,3);
    % Per-image mean of each channel (R, G, B).
    MEAN_IMG(i,:) = mean([ch1(:), ch2(:), ch3(:)],1);
    tline = fgetl(fid);
end
fclose(fid);
% Average the per-image means over all images.
MEAN_IMG_ = mean(MEAN_IMG,1)

So I really wonder how the values of IMG_MEAN in the code were computed.

siinem commented 6 years ago

After searching the internet, I found that IMG_MEAN = np.array((104.00698793,116.66876762,122.67891434), dtype=np.float32) was not computed from the PASCAL VOC training images; these are the mean values computed for the ImageNet dataset in BGR order (link). For fine-tuning, it may be better not to change these values.
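To compare the two vectors quoted in this thread, they first need to be in the same channel order. A small sketch of that comparison, using only the numbers given above; the variable names other than IMG_MEAN are made up for illustration.

```python
import numpy as np

# Value from the repository code, in BGR order.
IMG_MEAN = np.array((104.00698793, 116.66876762, 122.67891434), dtype=np.float32)

# PASCAL VOC 2012 mean reported above from MATLAB, in RGB order.
voc_mean_rgb = np.array((116.6234, 111.5128, 103.1481), dtype=np.float32)

# Reverse BGR -> RGB so both vectors use the same channel order.
img_mean_rgb = IMG_MEAN[::-1]

# Per-channel gap between the two sets of statistics (R, G, B).
diff_rgb = img_mean_rgb - voc_mean_rgb
print("IMG_MEAN as RGB:", img_mean_rgb)
print("difference (R, G, B):", diff_rgb)
```

Once reordered, the two vectors differ by a few intensity levels per channel, so the mismatch is not only a matter of channel order.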

dimitriGallos commented 6 years ago

I found empirically that it's better to leave IMG_MEAN as it is in the code and not change it if you have a small dataset (mine is ~1000 images).

DrSleep commented 6 years ago

Usually, segmentation networks are pre-trained for image classification, hence the normalisation parameters from image classification tend to be used for segmentation as well. If you train from scratch, it is better to use statistics from your current training set.
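For reference, a rough NumPy sketch of how such a mean vector is typically applied, assuming the image is loaded as RGB and the mean is stored in BGR order as discussed above. The function `subtract_mean` is a hypothetical helper, not the repository's actual input pipeline.

```python
import numpy as np

# Mean used in the repository code (BGR order, per the discussion above).
IMG_MEAN = np.array((104.00698793, 116.66876762, 122.67891434), dtype=np.float32)

def subtract_mean(image_rgb, mean_bgr=IMG_MEAN):
    """Convert an HxWx3 RGB image to BGR float32 and subtract the per-channel mean.

    Hypothetical helper for illustration only.
    """
    image_bgr = np.asarray(image_rgb, dtype=np.float32)[..., ::-1]
    # Broadcasting subtracts the 3-vector from every pixel.
    return image_bgr - mean_bgr

# Pure red 2x2 test image in RGB.
demo = np.zeros((2, 2, 3), dtype=np.uint8)
demo[..., 0] = 255
demo_out = subtract_mean(demo)
print(demo_out[0, 0])
```

When fine-tuning, `mean_bgr` would stay at the pre-training value; when training from scratch, it could be replaced by statistics computed on your own training set.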