Open hgffly opened 7 years ago
Hi, have you found out why? Thanks.
Hi, when I checked convert_imageset.cpp, I could not find any normalization code, so I guess this tool does not normalize the data. Also, the scale 0.00390625 is exactly 1/256. Hope this clue helps.
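To make the 1/256 remark concrete, here is a small check in plain Python. The sample pixel values are arbitrary illustrations, not taken from any dataset.

```python
# 0.00390625 is exactly 1/256 (a power of two, so the float comparison is exact).
scale = 1 / 256
is_same = (scale == 0.00390625)

# Multiplying 8-bit pixel values by this scale maps them into [0, 1).
pixels = [0, 128, 255]  # illustrative values only
scaled = [p * scale for p in pixels]

print(is_same)
print(scaled)
```

Note that the largest 8-bit value, 255, maps to 255/256 rather than exactly 1.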
http://ufldl.stanford.edu/tutorial/unsupervised/PCAWhitening/ said "In detail, in order for PCA to work well, informally we require that (i) The features have approximately zero mean, and (ii) The different features have similar variances to each other. With natural images, (ii) is already satisfied even without variance normalization, and so we won’t perform any variance normalization."
So I think natural images do not need normalization, and scaling the inputs down will make the gradients smaller, so we may get a worse result.
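A rough sketch of the gradient argument, assuming a single linear unit y = w * x, where the gradient with respect to w is the upstream gradient times x. The numbers and the helper name `weight_grad` are made up for illustration. This only shows the first-layer gradient magnitude and says nothing about final accuracy.

```python
def weight_grad(x, upstream_grad):
    # For y = w * x, dL/dw = dL/dy * x (chain rule).
    return upstream_grad * x

x_raw = 200.0           # hypothetical raw pixel value
upstream = 0.5          # hypothetical dL/dy
scale = 0.00390625      # 1/256

grad_raw = weight_grad(x_raw, upstream)
grad_scaled = weight_grad(x_raw * scale, upstream)

# Scaling the input by `scale` scales this gradient by the same factor.
ratio = grad_scaled / grad_raw
print(grad_raw, grad_scaled, ratio)
```

With the same learning rate and weight initialization, the first-layer weights would therefore move about 256 times more slowly in this toy setting.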
@hgffly The example seems to be an implementation of the AlexNet paper, which states that it only subtracts the mean from the raw RGB pixels. So the question should really be why they do it that way in the AlexNet paper. I find it a bit strange that they do not do any scale normalization, given that Hinton emphasizes the importance of scale normalization in some of his video lectures. It could be one of 2 possibilities:
I am trying to train AlexNet on ImageNet. In its train_val.prototxt, the transform_param is as follows:
transform_param { mirror: true crop_size: 227 mean_file: "data/ilsvrc12/imagenet_mean.binaryproto" }
But in the MNIST tutorial example, there is a scale parameter that normalizes the pixel values from [0, 255] to [0, 1], as follows:
transform_param { scale: 0.00390625 }
When I added "scale: 0.00390625" to the transform_param in my case, the accuracy became very poor.
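To compare the two settings numerically, here is a minimal sketch. It assumes the data layer subtracts the mean first and then multiplies by scale. The function name `transform` and the per-pixel mean of 120.0 are hypothetical stand-ins, not values read from imagenet_mean.binaryproto.

```python
def transform(pixel, mean=0.0, scale=1.0):
    # Assumed order of operations: subtract the mean, then multiply by scale.
    return (pixel - mean) * scale

MEAN = 120.0            # hypothetical mean pixel value, for illustration only
SCALE = 0.00390625      # 1/256, as in the MNIST example

# AlexNet-style: mean subtraction only, inputs stay on the order of +/-100.
alexnet_range = (transform(0, mean=MEAN), transform(255, mean=MEAN))

# Same, with the extra scale: inputs shrink to roughly +/-0.5.
scaled_range = (transform(0, mean=MEAN, scale=SCALE),
                transform(255, mean=MEAN, scale=SCALE))

# MNIST-style: scale only, inputs land in [0, 1).
mnist_range = (transform(0, scale=SCALE), transform(255, scale=SCALE))

print(alexnet_range, scaled_range, mnist_range)
```

Under these assumptions, adding the scale makes the network inputs about 256 times smaller than the range the rest of the example configuration was presumably tuned for.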
Is the LMDB data converted by "build/tools/convert_imageset" already normalized? I traced the code in "tools/convert_imageset" and found that the image is loaded by the function "ReadImageToDatum", which in turn calls ReadFileToDatum. In ReadFileToDatum, I still don't see any code that normalizes the pixel values.
Why is scale: 0.00390625 used only in the MNIST example and not in the other examples? If the LMDB data is normalized, when and where is the normalization done? Thanks.