BVLC / caffe

Caffe: a fast open framework for deep learning.
http://caffe.berkeleyvision.org/

Why is the "scale" within transform_param of the data layer used in the MNIST example not applied in the ImageNet example? #5589

Open hgffly opened 7 years ago

hgffly commented 7 years ago

I am trying to train AlexNet on ImageNet. I read its train_val.prototxt; its transform_param is as follows:

transform_param {
  mirror: true
  crop_size: 227
  mean_file: "data/ilsvrc12/imagenet_mean.binaryproto"
}

But in the MNIST tutorial example, there is a scale parameter that normalizes the pixel values from [0, 255] to [0, 1], listed as below:

transform_param {
  scale: 0.00390625
}

When I tried adding "scale: 0.00390625" to the transform_param in my case, the accuracy became very poor.

Is the lmdb data converted by "build/tools/convert_imageset" already normalized? I traced the code in "tools/convert_imageset" and found that the image is loaded by the function "ReadImageToDatum", which in turn calls "ReadFileToDatum". Even in ReadFileToDatum, I still don't see any code normalizing the pixel values.

Why is scale: 0.00390625 used only in the MNIST example and not in the other examples? If the lmdb data is normalized, when and where was the normalization done? Thanks.
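For reference, Caffe's DataTransformer applies mean subtraction before scaling, i.e. transformed = (pixel - mean) * scale. A minimal NumPy sketch of that per-pixel arithmetic (the function name here is illustrative, not part of the Caffe API):

```python
import numpy as np

def transform(pixels, mean=0.0, scale=1.0):
    """Caffe-style per-pixel transform: subtract the mean, then scale.
    Illustrative sketch only; Caffe does this inside its DataTransformer."""
    return (np.asarray(pixels, dtype=np.float32) - mean) * scale

# MNIST-style: no mean file, scale = 1/256 maps [0, 255] into [0, ~0.996]
mnist = transform([0, 128, 255], scale=0.00390625)

# ImageNet-style: per-pixel mean subtraction, scale left at its default of 1
imagenet = transform([0, 128, 255], mean=120.0)
```

So with the ImageNet prototxt above, the network sees roughly zero-centered values still on the [0, 255] scale, whereas the MNIST network sees values in [0, 1).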

liangshuang1993 commented 7 years ago

Hi, have you found out why? Thanks.

noirmist commented 7 years ago

Hi, when I checked convert_imageset.cpp, I could not find any normalization step, so I guess this tool does not normalize the data. Also, I remember that the scale 0.00390625 is the same as 1/256. Hope this clue helps you.
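Indeed, 0.00390625 is exactly 1/256, so the scale maps raw uint8 pixels from [0, 255] into [0, 255/256]; a quick check:

```python
# Verify that the MNIST scale factor is exactly 1/256.
scale = 0.00390625
assert scale == 1 / 256
# The largest uint8 pixel value, 255, maps to 255/256 after scaling.
assert 255 * scale == 0.99609375
```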

liangshuang1993 commented 7 years ago

http://ufldl.stanford.edu/tutorial/unsupervised/PCAWhitening/ says: "In detail, in order for PCA to work well, informally we require that (i) The features have approximately zero mean, and (ii) The different features have similar variances to each other. With natural images, (ii) is already satisfied even without variance normalization, and so we won't perform any variance normalization."

So I think natural images don't need normalization, and using normalization will make the gradients smaller, so we may get a worse result.
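A toy illustration of the gradient-shrinking point (a sketch, not a verification of the claim about accuracy): for a linear unit the weight gradient is dL/dw = dL/dy * x, so shrinking the input by a factor of 256 shrinks that gradient by the same factor when the backpropagated error dL/dy is held fixed. A learning rate tuned for one input scale may therefore be poorly matched to the other.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(1, 255, size=10)      # raw pixel-scale inputs
dLdy = 1.0                            # fixed upstream error signal
grad_raw = dLdy * x                   # weight gradient with unscaled inputs
grad_scaled = dLdy * (x / 256.0)      # weight gradient with inputs scaled by 1/256
ratio = grad_raw / grad_scaled        # every component shrinks by exactly 256
```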

amitfishy commented 6 years ago

@hgffly The example seems to be an implementation of the AlexNet paper, which does state that only mean subtraction is performed on the raw RGB pixels. So the question should really be why they do it that way in the AlexNet paper. I find it a bit strange that they do not do any scale normalization, seeing as Hinton emphasizes the importance of scale normalization in some of his video lectures. It could be one of two possibilities:

  1. Scale normalization would actually give faster convergence.
  2. Something else is present in the network architecture that actually gives a similar effect to scale normalization (like perhaps the norm1 and norm2 layers).
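On possibility 2: norm1 and norm2 in the AlexNet prototxt are local response normalization (LRN) layers, which divide each activation by a function of its squared neighbors across channels. A NumPy sketch of the across-channel LRN formula from the AlexNet paper, b[c] = a[c] / (k + alpha * sum over window of a[c']^2)^beta, using the paper's hyperparameters (k=2, n=5, alpha=1e-4, beta=0.75); note that Caffe's defaults differ slightly (and Caffe divides alpha by local_size), so treat this as illustrative:

```python
import numpy as np

def lrn_across_channels(a, n=5, k=2.0, alpha=1e-4, beta=0.75):
    """Across-channel LRN on activations a of shape (C, H, W).

    Each channel is divided by a power of the summed squares of its
    n nearest channels (window clipped at the channel boundaries).
    """
    C = a.shape[0]
    b = np.empty_like(a, dtype=np.float64)
    for c in range(C):
        lo, hi = max(0, c - n // 2), min(C, c + n // 2 + 1)
        denom = k + alpha * np.sum(a[lo:hi] ** 2, axis=0)
        b[c] = a[c] / denom ** beta
    return b
```

Because the denominator grows with the magnitude of nearby activations, LRN damps uniformly large responses, which is one way it could partially stand in for input scale normalization.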