BVLC / caffe

Caffe: a fast open framework for deep learning.
http://caffe.berkeleyvision.org/

make grayscale lmdb problem #4966

Open dccho opened 7 years ago

dccho commented 7 years ago

When the original images are color and I want to build a grayscale lmdb, the simple approach is to convert the color images to grayscale first and then call convert_imageset. However, I don't want to convert the images; I just want to set the gray flag to true. In the ReadImageToDatum function in io.cpp, ReadImageToCvMat returns a grayscale image, but the third 'if' condition then evaluates to true, so ReadFileToDatum is called on the original file. That means that even when the gray flag is set, the color image is used to make the lmdb. I think the behavior of the gray flag is unclear.

maheriya commented 7 years ago

This bug still exists! The following condition in ReadImageToDatum should change:

    if (encoding.size()) {
      if ( (cv_img.channels() == 3) == is_color && !height && !width &&
          matchExt(filename, encoding) )

The code above forces 3-channel output if the input image has three channels, and basically ignores the user's is_color = 0 setting.

It should change to the following to fix the bug (I think):

    if (encoding.size()) {
      if ( ( (cv_img.channels() == 3) && is_color) && !height && !width && 
          matchExt(filename, encoding) )

The reason is that when the input image is color and is_color is set to 0, imread converts it to a single channel. At that point, the original condition `(cv_img.channels() == 3) == is_color` still evaluates to true (false == false), so the original file's data is used instead of the grayscale data the user asked for. The fix shown above works for both grayscale and color. The only drawback is that when the input image is already grayscale, the code no longer takes advantage of the original file's encoding; functionally, however, it works correctly.
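
To make the difference between the two conditions easier to check, here is a minimal sketch that models only the boolean logic, without OpenCV or Caffe. The helper names (DecodedChannels, OriginalUsesRawFile, ProposedUsesRawFile) are hypothetical and are not part of io.cpp. It assumes, as described above, that the decoded image has 3 channels when is_color is true and 1 channel otherwise, and that the resize and extension checks pass.

```cpp
// Sketch only: models the branch condition quoted from ReadImageToDatum
// with plain ints/bools. None of these helpers exist in Caffe.

// Assumption: the decoder yields 3 channels when is_color is true and
// 1 channel when it is false, regardless of the file on disk.
constexpr int DecodedChannels(bool is_color) { return is_color ? 3 : 1; }

// Condition as quoted from the current code. Returns true when the raw
// file bytes would be stored instead of the decoded image.
constexpr bool OriginalUsesRawFile(int decoded_channels, bool is_color) {
  return (decoded_channels == 3) == is_color;
}

// Condition proposed in this comment.
constexpr bool ProposedUsesRawFile(int decoded_channels, bool is_color) {
  return (decoded_channels == 3) && is_color;
}

// Gray requested (is_color = false): the original condition is
// (1 == 3) == false, i.e. true, so the raw color file is stored.
static_assert(OriginalUsesRawFile(DecodedChannels(false), false),
              "original condition takes the raw-file path for gray");
// The proposed condition is false here, so the decoded gray image is used.
static_assert(!ProposedUsesRawFile(DecodedChannels(false), false),
              "proposed condition skips the raw-file path for gray");

// Color requested (is_color = true): both conditions agree.
static_assert(OriginalUsesRawFile(DecodedChannels(true), true),
              "original condition takes the raw-file path for color");
static_assert(ProposedUsesRawFile(DecodedChannels(true), true),
              "proposed condition takes the raw-file path for color");
```

Under these assumptions the original condition is true for both values of is_color, which matches the behavior reported in this issue, while the proposed condition takes the raw-file path only when color output is requested.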