Closed rragundez closed 5 years ago
CI should pass once #243 is merged and this branch is rebased.
The array_to_img function now returns a 32-bit signed integer ("I" mode in PIL) if the array has a single channel and the maximum value of any pixel exceeds 255.
in general, i think that type cast (e.g., from uint32 to uint8) should NOT be used when converting images. for what i understood from your PR, 32 bit images are automatically converted to an 8 bit ones if the source channel is one and the maximum value is within 0 and 255; given that i don't know the source bit depth, this will alter any normalization factor that i would use for some kind of preprocessing (e.g., floating-point division by 2^8 vs 2^16 vs 2^32, you can think about depth sensors that "sense" a very near surface that will be wrongly reported).
my suggestion is to leave the source integer data type as the closest destination data type possibile, that is, uint8, uint16, uint32.
One thing to note is that if array_to_img is executed with the argument scale=True then even if the array is in 32-bit format, the array will be normalized to 8-bit and return an 8-bit PIL image (mode "L").
i came from the C++ background and scale=True seems a bit ambiguous to me (change type? normalize to what?), so i won't comment on this.
Thanks for thee feedback @dibenedetto. I think I agree with you about not casting will be the best but there are some things to take into account. First some clarifications by array we mean a numpy array and by image a PIL image. When discussing makes it confusing to call an array an image even if the intent of the array is to represent an image.
The functionality in the utils
module is mainly meant to support the different kind of iterators that give the functionality to ImageDataGenerator
and flow_from_
methods. As such in order to integrate seemingly with models and matrix multiplications in the backennd (tensorflow for example) the arrays need to be in the type specified by the backend, where the default is float32
. This brings us to a predicament, arrays need to be handle internally as float32
, so when internally calling array_to_img
, a decision needs to be made on which format to convert to an image, since we do not know what the initial intent was (int8, int16, etc.) as it gets lost when having to cast to float32
. Then the decision was made to only support 8-bit channels (making this PR an exception).
Your point makes total sense though if the function array_to_img
gets call directly by the user, the best indeed would be to make the user responsible for having the array in the correct type.
scale=True
just means normalizing values to be between [0-255], which makes sense if everything is handled in 8-bit channels.
You made good points and I'll think a bit more about it, I also think this PR is not in the best shape to address all points yet.
@Dref360 do you have any comments on this?
I think we can use the dtype
argument not only to change the array type but also as a proxy to determine the PIL image mode.
I think that array_to_img is only used when saving the image and when we apply brightness augmentation.
Could we infer the correct dtype from the original dtype of the array before we cast it to float32?
It might be, but then I would argue why the need of changing the dtype
at all in this line
Why not determine the PIL image mode directly from the array dtype without having that intermediate casting to dtype
(default float32
). This idea is basically what @dibenedetto proposes. I like it.
I'm not sure if this will break something though, I don't think so, we just need to make sure that array_to_img
and img_to_array
are congruent, which is what the unit tests do.
I investigated using dtype
to assign the correct img mode but this won't work for people using the ImageDataGenerator
. Let's say for example the batch of images for training a model is in an array of type float32
, then the user decides to save augmented images to disk, PIL does not handle well float32
types, it will brake or it distorts the images quite a lot when using RGB
mode for example. The casting of the array to uint8
avoids this distortion.
An example:
Then about grayscale images, if the user has an array of type int16 or int32, it will still work, yes the image will occupy more space if it was originally in int16 but the image will not be distorted. And we cannot infer the mode from the array type when images are coming from ImageDataGenerator
as they arrive to this function already in float32
(or whatever backend option was set). The best I could think of is checking the pixel values to determine if to keep 8bit or use 32bit.
So I think this PR addresses the 32-bit problem given the circumstances of the intended use of the functions.
@Dref360 could you have a proper review?
I added the mentioned tests plus added tests in utils_tests.py
and direcdtory_iterator_test.py
for 16bit grayscale images. @Dref360
I am working with Keras '2.4.3' and I can not load float32 grayscale images with color_mode='rgb' and also color_mode='grayscale' using Image Data Generator. Can you help how I can resolve it?
@sam6485 could you create an issue with the error and some steps to reproduce? I'll take a look
@Dref360 I created issue #340 few days ago. would you please have a look?
Summary
This PR adds the functionality to read 32-bit images (32-bit signed integer pixel values) without losing information. This is specifically important for medical images as a great deal of information is being lost on the conversion to 8-bit.
The
load_img
function only affects and leaves 32-bit signed integer format ("I" for PIL modes) untouched.The
array_to_img
function now returns a 32-bit signed integer ("I" mode in PIL) if the array has a single channel and the maximum value of any pixel exceeds 255.One thing to note is that if
array_to_img
is executed with the argumentscale=True
then even if the array is in 32-bit format, the array will be normalized to 8-bit and return an 8-bit PIL image (mode "L").This PR has tests for 32-bit but I also added for grayscale 8-bit specifically.
Related Issues
Closes #239
PR Overview