keras-team / keras-preprocessing

Utilities for working with image data, text data, and sequence data.
Other
1.02k stars 444 forks source link

Add functionality to read 32-bit images #242

Closed rragundez closed 5 years ago

rragundez commented 5 years ago

Summary

This PR adds the functionality to read 32-bit images (32-bit signed integer pixel values) without losing information. This is specifically important for medical images as a great deal of information is being lost on the conversion to 8-bit.

The load_img function only affects and leaves 32-bit signed integer format ("I" for PIL modes) untouched.

The array_to_img function now returns a 32-bit signed integer ("I" mode in PIL) if the array has a single channel and the maximum value of any pixel exceeds 255.

One thing to note is that if array_to_img is executed with the argument scale=True then even if the array is in 32-bit format, the array will be normalized to 8-bit and return an 8-bit PIL image (mode "L").

This PR has tests for 32-bit but I also added for grayscale 8-bit specifically.

Related Issues

Closes #239

PR Overview

rragundez commented 5 years ago

CI should pass once #243 is merged and this branch is rebased.

dibenedetto commented 5 years ago

The array_to_img function now returns a 32-bit signed integer ("I" mode in PIL) if the array has a single channel and the maximum value of any pixel exceeds 255.

in general, i think that type cast (e.g., from uint32 to uint8) should NOT be used when converting images. for what i understood from your PR, 32 bit images are automatically converted to an 8 bit ones if the source channel is one and the maximum value is within 0 and 255; given that i don't know the source bit depth, this will alter any normalization factor that i would use for some kind of preprocessing (e.g., floating-point division by 2^8 vs 2^16 vs 2^32, you can think about depth sensors that "sense" a very near surface that will be wrongly reported).

my suggestion is to leave the source integer data type as the closest destination data type possibile, that is, uint8, uint16, uint32.

One thing to note is that if array_to_img is executed with the argument scale=True then even if the array is in 32-bit format, the array will be normalized to 8-bit and return an 8-bit PIL image (mode "L").

i came from the C++ background and scale=True seems a bit ambiguous to me (change type? normalize to what?), so i won't comment on this.

rragundez commented 5 years ago

Thanks for thee feedback @dibenedetto. I think I agree with you about not casting will be the best but there are some things to take into account. First some clarifications by array we mean a numpy array and by image a PIL image. When discussing makes it confusing to call an array an image even if the intent of the array is to represent an image.

The functionality in the utils module is mainly meant to support the different kind of iterators that give the functionality to ImageDataGenerator and flow_from_ methods. As such in order to integrate seemingly with models and matrix multiplications in the backennd (tensorflow for example) the arrays need to be in the type specified by the backend, where the default is float32. This brings us to a predicament, arrays need to be handle internally as float32, so when internally calling array_to_img, a decision needs to be made on which format to convert to an image, since we do not know what the initial intent was (int8, int16, etc.) as it gets lost when having to cast to float32. Then the decision was made to only support 8-bit channels (making this PR an exception).

Your point makes total sense though if the function array_to_img gets call directly by the user, the best indeed would be to make the user responsible for having the array in the correct type.

scale=True just means normalizing values to be between [0-255], which makes sense if everything is handled in 8-bit channels.

You made good points and I'll think a bit more about it, I also think this PR is not in the best shape to address all points yet.

@Dref360 do you have any comments on this?

rragundez commented 5 years ago

I think we can use the dtype argument not only to change the array type but also as a proxy to determine the PIL image mode.

Dref360 commented 5 years ago

I think that array_to_img is only used when saving the image and when we apply brightness augmentation.

Could we infer the correct dtype from the original dtype of the array before we cast it to float32?

rragundez commented 5 years ago

It might be, but then I would argue why the need of changing the dtype at all in this line

Why not determine the PIL image mode directly from the array dtype without having that intermediate casting to dtype (default float32). This idea is basically what @dibenedetto proposes. I like it. I'm not sure if this will break something though, I don't think so, we just need to make sure that array_to_img and img_to_array are congruent, which is what the unit tests do.

rragundez commented 5 years ago

I investigated using dtype to assign the correct img mode but this won't work for people using the ImageDataGenerator. Let's say for example the batch of images for training a model is in an array of type float32, then the user decides to save augmented images to disk, PIL does not handle well float32 types, it will brake or it distorts the images quite a lot when using RGB mode for example. The casting of the array to uint8 avoids this distortion.

An example: Selection_030

Then about grayscale images, if the user has an array of type int16 or int32, it will still work, yes the image will occupy more space if it was originally in int16 but the image will not be distorted. And we cannot infer the mode from the array type when images are coming from ImageDataGenerator as they arrive to this function already in float32 (or whatever backend option was set). The best I could think of is checking the pixel values to determine if to keep 8bit or use 32bit.

So I think this PR addresses the 32-bit problem given the circumstances of the intended use of the functions.

@Dref360 could you have a proper review?

rragundez commented 5 years ago

I added the mentioned tests plus added tests in utils_tests.py and direcdtory_iterator_test.py for 16bit grayscale images. @Dref360

sam6485 commented 3 years ago

I am working with Keras '2.4.3' and I can not load float32 grayscale images with color_mode='rgb' and also color_mode='grayscale' using Image Data Generator. Can you help how I can resolve it?

Dref360 commented 3 years ago

@sam6485 could you create an issue with the error and some steps to reproduce? I'll take a look

sam6485 commented 3 years ago

@Dref360 I created issue #340 few days ago. would you please have a look?