Add support for high bit depth multichannel images

wiredfool commented 8 years ago

Pillow (and PIL) is currently able to open 8 bit per channel multi-channel images (such as RGB) but is only able to open higher bit depth images (e.g. I16, I32, or Float32 images) if they are single channel (e.g., grayscale).

Previous References

This has been requested many times: #1828, #1885, #1839, #1602, and farther back.

Requirements

We should be able to support common GIS formats as well as high bit depth RGB(A) images.
At least 4 channels, but potentially more (see #1839)
Different pixel formats, including I16, I32, and Float.
There should be definitions for the array interface to exchange images with numpy/scipy
There should be enough support to read and write TIFFs and raw image data.
Support for resize, crop, and convert operations at the very least.
Background Reference Info

The rough sequence for image loading is:

Image file is opened
Each of the ImagePlugin _accept functions has a chance to look at the first few bytes to determine whether it should attempt to open the file
The *ImagePlugin._open method is called, giving the image plugin a chance to read more of the image and determine whether it still wants to consider it a valid image of its particular type. If it does, it passes back a tile definition which includes a decoder and an image size.
If there is a successful _open call, at some point later *ImagePlugin._load may be called on the image, which runs the decoder, producing a set of bytes in a raw mode. This is where things like compression are handled, but the output of the decoder is not necessarily what we're storing in our internal structures.
The image is unpacked (Unpack.c) from the raw mode (e.g. I;16BS) into a storage (Storage.c) mode (I).
It's now possible to operate on the image (e.g. crop, pixel access, etc)
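As a rough illustration of the unpack step in the sequence above, here is a pure-Python sketch that widens big-endian signed 16-bit raw samples into plain integers, standing in for 32-bit storage. The function name unpack_i16bs_to_i32 is made up for this example; the real work happens in C in Unpack.c and is not shown here.

```python
import struct


def unpack_i16bs_to_i32(raw: bytes) -> list:
    """Widen big-endian signed 16-bit samples to Python ints.

    Hypothetical stand-in for an unpacker; the ints play the role of
    32-bit storage values.
    """
    if len(raw) % 2:
        raise ValueError("raw buffer must hold whole 16-bit samples")
    count = len(raw) // 2
    # ">" = big-endian, "h" = signed 16-bit
    return list(struct.unpack(f">{count}h", raw))


row = unpack_i16bs_to_i32(b"\x00\x01\xff\xff\x7f\xff")
print(row)  # [1, -1, 32767]
```

The point is only that the raw mode describes the bytes coming out of the decoder, while the storage mode describes what is kept in memory afterwards.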

There are 3 (or 4) image data pointers, as defined in Imaging.h:

struct ImagingMemoryInstance {

    /* Format */
    char mode[IMAGING_MODE_LENGTH]; /* Band names ("1", "L", "P", "RGB", "RGBA", "CMYK", "YCbCr", "BGR;xy") */
    int type;       /* Data type (IMAGING_TYPE_*) */
    int depth;      /* Depth (ignored in this version) */
    int bands;      /* Number of bands (1, 2, 3, or 4) */
    int xsize;      /* Image dimension. */
    int ysize;

    /* Colour palette (for "P" images only) */
    ImagingPalette palette;

    /* Data pointers */
    UINT8 **image8; /* Set for 8-bit images (pixelsize=1). */
    INT32 **image32;    /* Set for 32-bit images (pixelsize=4). */

    /* Internals */
    char **image;   /* Actual raster data. */
    char *block;    /* Set if data is allocated in a single block. */

    int pixelsize;  /* Size of a pixel, in bytes (1, 2 or 4) */
    int linesize;   /* Size of a line, in bytes (xsize * pixelsize) */

    /* Virtual methods */
    void (*destroy)(Imaging im);
};

The only one that is guaranteed to be set is **image, which is an array of pointers to row data.
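To make the relationship between bands, pixelsize, linesize and the row-pointer array concrete, here is a small Python mirror of a few of those fields. Everything in it is an assumption for illustration: ImagingSketch, bytes_per_sample and the "RGBA;16" mode name do not exist in Pillow, and real storage may pad pixels or rows differently.

```python
from dataclasses import dataclass, field


@dataclass
class ImagingSketch:
    """Hypothetical Python mirror of a few ImagingMemoryInstance fields."""

    mode: str
    bands: int
    bytes_per_sample: int  # assumption: every band has the same width
    xsize: int
    ysize: int
    image: list = field(default_factory=list)  # one bytearray per row, like char **image

    @property
    def pixelsize(self) -> int:
        # Size of one pixel in bytes, with no padding assumed.
        return self.bands * self.bytes_per_sample

    @property
    def linesize(self) -> int:
        # Size of one row in bytes (xsize * pixelsize).
        return self.xsize * self.pixelsize

    def allocate(self) -> None:
        self.image = [bytearray(self.linesize) for _ in range(self.ysize)]


im = ImagingSketch(mode="RGBA;16", bands=4, bytes_per_sample=2, xsize=7, ysize=2)
im.allocate()
print(im.pixelsize, im.linesize, len(im.image))  # 8 56 2
```

A 4-band, 16-bit image needs a pixelsize of 8, which is outside the 1, 2 or 4 listed in the struct comment; that is one of the places the existing code would need to change.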

Changes Required

Definitions for all of the modes that we're planning, and potentially a [format];MB[#bands] style generic mode.
Core Imaging Structure
The imaging structure has the fields required to add the additional channels. (type, bands, pixelsize, linesize)
The **image pointer can be used for any width of pixel.
We may or may not want to set the **image32 pointer.
Currently type of IMAGING_TYPE_INT32 and IMAGING_TYPE_FLOAT32 imply 1 band. This will change.
Consider promoting int16 to IMAGING_TYPE_INT16
Storage
Updates to Storage.c, Unpack.c, Pack.c, Access.c, PyAccess.py, and Convert.c
Ways to Help

We need a better definition of the format requirements. What are the various types of images that are used in GIS, Medical, or other fields that we'd want to interpret? We need small, redistributable versions of images that we can test against.

[in progress]

terramars commented 8 years ago

I'm having the same problem with 16 bit single-channel paletted TIFFs, created by GDAL. It would be "really" nice if Pillow could play nicely with GIS and scientific image formats, as GDAL is a pain in the ass and I'd rather not use it.

tiffinfo as follows:

TIFFReadDirectory: Warning, Unknown field with tag 33550 (0x830e) encountered.
TIFFReadDirectory: Warning, Unknown field with tag 33922 (0x8482) encountered.
TIFFReadDirectory: Warning, Unknown field with tag 34735 (0x87af) encountered.
TIFFReadDirectory: Warning, Unknown field with tag 34737 (0x87b1) encountered.
TIFFReadDirectory: Warning, Unknown field with tag 42113 (0xa481) encountered.
TIFF Directory at offset 0x34293c6 (54694854)
  Image Width: 10774 Image Length: 12577
  Bits/Sample: 16
  Sample Format: unsigned integer
  Compression Scheme: LZW
  Photometric Interpretation: palette color (RGB from colormap)
  Samples/Pixel: 1
  Rows/Strip: 1
  Planar Configuration: single image plane
  Color Map: (present)
  Tag 33550: 4.999617,4.999789,0.000000
  Tag 33922: 0.000000,0.000000,0.000000,679006.067110,9955209.915048,0.000000
  Tag 34735: 1,1,0,7,1024,0,1,1,1025,0,1,1,1026,34737,22,0,2049,34737,7,22,2054,0,1,9102,3072,0,1,32736,3076,0,1,9001
  Tag 34737: WGS 84 / UTM zone 36S|WGS 84|
  Tag 42113: 0
  Predictor: horizontal differencing 2 (0x2)

bodokaiser commented 7 years ago

Any updates on this?

wiredfool commented 7 years ago

Unfortunately, no.

vfdev-5 commented 6 years ago

@wiredfool what do you think about adding support for multichannel images as a sequence of Image objects? For example, a 4-channel uint16 image would be represented (more or less equivalently) by ['<PIL.Image.Image image mode=I;16 size=... >', '<PIL.Image.Image image mode=I;16 size=...>', ..., '<PIL.Image.Image image mode=I;16 size=...>']. By that I mean, maybe, providing a class that inherits from Image and tuple and overrides all methods to work on a tuple of images... Sure, it looks like a hack, but it could unlock more features (and create issues :) ), at least when working with Image.fromarray.

wiredfool commented 6 years ago

To do anything useful with it, we'd have to have support in the C layer, so it would have to be at the core imaging layer, and especially Unpack/Pack.

vfdev-5 commented 6 years ago

@wiredfool following your "Ways to help",

We need a better definition of the format requirements. What are the various types of images that are used in GIS, Medical, or other fields that we'd want to interpret?

For GIS, as there is a huge number of different formats (for example, the gdal format list), this can be left to GIS libraries such as gdal, rasterio, etc. However, support in Image.fromarray for multi-channel (3, 4, 5, ...) input arrays of dtype np.uint16 or np.float32 would be, imho, essential.

We need small, redistributable versions of images that we can test against.

For GIS imagery, this can be easily created manually with gdal, rasterio.

I would like to give a hand on this, so, feel free to ask me.

edowson commented 6 years ago

PIL cannot handle processing multi-channel images. They get truncated to 3-ch images if you perform any transformation using PIL. #3160

bjtho08 commented 5 years ago

What is the status of this issue? It has been almost three years since the first proposal. I am unfortunately unable to provide any help since I have zero experience with coding in C, but I am among the people who are awaiting support for e.g. multi-channel floating-point images (with the possibility of negative pixel values). This is especially useful in deep learning, where it is preferable to have all values normalized with zero mean. PIL has some really awesome ImageOps, which is one of the reasons for wanting this support.

hugovk commented 5 years ago

@bjtho08 No updates.

https://github.com/python-pillow/Pillow/issues/2485 links to a multipage RGB TIFF containing float64 values.

omaghsoudi commented 5 years ago

Please fix the issue with multi-channel 16 bit images. Thank you!

rbavery commented 4 years ago

I'm closing my other issue since I realize it is a duplicate of this one. Here is an example multichannel tiff dataset to work with for testing once folks get around to tackling this. It's publicly available data from NASA and the USGS: https://ucsb.box.com/s/taz9fb3rcur1d24bt6s7g6cw2ynkw747

Link to my issue with details: https://github.com/python-pillow/Pillow/issues/3984

Thanks for tracking this and all the hard work, I appreciate it!

icml-compbio commented 4 years ago

I can't even open 3 channel tif with PIL Image.open....

radarhere commented 4 years ago

@icml-compbio please open a new issue with more details about your problem, including the image that is failing for you

Conchylicultor commented 3 years ago

Any update on this? It seems to be quite a common issue. Currently multi-channel np.uint16 arrays are not supported.

KeygenLLC commented 2 years ago

I'm also surprised that float32 RGB or RGBA files are not supported. These are standard for VFX and post production when using a linear workflow, have been for over a decade, and should be supported, whether it's with TIFF files or OpenEXR. They need to be supported and not clamp values to 0-1 unless we choose for them to be. uint8 and even 16 does not suffice as they have hard limits and less precision. Currently Pillow sees their shapes as (1,1,3) regardless of dimensions.

I would also encourage you to make it possible to easily save layered TIFF and EXR files that can be read by Adobe apps, like Photoshop and After Effects, and other industry standard tools used for compositing. This is where these file formats excel and why EXR was created. There is no library available that makes this easy as far as I've been able to find and the libraries that do it require you to manually setup the tags, which is not trivial unless you have knowledge of how this low-level stuff works. Seems a bit much to have to work out byte code to save discrete file layers in 2021 that Photoshop can read.

I've turned to imageio and tifffile, and other libraries made for geo data, and while they support float RGB without clamping, they don't spit out layered files, only multipage, which Adobe and other host apps do not support. I'm still banging away trying to get tags to work.

It's honestly very strange to me that Pillow does not support these things since it's the primary image library everyone uses and it has a nice short syntax and a lot of features that make it great.

And yes, I blame Adobe for using bizarre tagging, but it's what many of us have to work with to get the job done and/or stay employed, and we need layers and float32 RGB.

meson800 commented 1 year ago

Thanks for the work so far on this issue. Here's another datapoint on possible weird bit-depth formats that comes up in some of the microscopy data we handle.

The instrument we use outputs false-colored "grayscale" images that have asymmetric channel bitdepths (trimmed imagemagick output):

Format: TIFF (Tagged Image File Format)
  Mime type: image/tiff
  Geometry: 1920x1440+0+0
  Colorspace: sRGB
  Type: TrueColor
  Endianness: LSB
  Depth: 16-bit
  Channel depth:
    Red: 1-bit
    Green: 16-bit
    Blue: 1-bit

This is detected by Pillow/TiffImagePlugin as having symmetric bitdepths:

(II, 2, (1,), 1, (16, 16, 16), ()): ("RGB", "RGB;16L"),

I can upload one of these images if it is helpful.

This file gets truncated to 8-bit as above with our Python pipelines and other tools like CellProfiler which use Pillow. We've avoided the issue in our pipeline by having an ImageMagick preprocessing step prior to Pillow-dependent steps.

I unsuccessfully tried tracking down why the TiffImagePlugin is loading it symmetrically, but it's kind of moot until full loading of high bit depth multichannel images is there anyway. Technically, I think these could be loaded by having a special rawtype that loaded the 16-bit channel into the 32-bit buffer, but that would require assumptions like dropping the (in this case, uniformly-zero) 1-bit channels.
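For what it's worth, the "special rawtype" idea can be sketched in pure Python as pulling one 16-bit channel out of interleaved little-endian RGB data and discarding the other two. The function name green_channel_from_rgb16le is invented for this example and nothing like it exists in Pillow; a real implementation would be an unpacker in C.

```python
import struct


def green_channel_from_rgb16le(raw: bytes) -> list:
    """Keep only the green samples of interleaved little-endian 16-bit RGB.

    Hypothetical helper: the red and blue samples are simply dropped,
    which is the assumption the comment above warns about.
    """
    # "<3H" = three little-endian unsigned 16-bit values per pixel
    return [g for _r, g, _b in struct.iter_unpack("<3H", raw)]


# Two pixels: (0, 0x1234, 0) and (0, 0xFFFF, 0).
raw = struct.pack("<6H", 0, 0x1234, 0, 0, 0xFFFF, 0)
print(green_channel_from_rgb16le(raw))  # [4660, 65535]
```

The resulting values fit comfortably in a 32-bit integer buffer, so no precision is lost for the channel that is kept.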

cgohlke commented 1 year ago

I can upload one of these images if it is helpful.

Please do.

meson800 commented 1 year ago

I can upload one of these images if it is helpful.

Please do.

.tif isn't an allowable upload so here's a .zip of a .tif asymmetric_bit_depth.zip

cgohlke commented 1 year ago

The ImageMagick output is a little confusing. The first image in the file is a simple 3-sample RGB image with 16 bits per sample, i.e. the BitsPerSample tag value is (16, 16, 16), not (1, 16, 1). The fact that two channels only contain zero values shouldn't concern the TIFF reader.

99991 commented 1 year ago

Here is a test case for a 16 bit PNG image generated with GIMP.

import io
import base64
from PIL import Image

# Create a PIL.Image from a base64-encoded string
image = Image.open(io.BytesIO(base64.b64decode("""
iVBORw0KGgoAAAANSUhEUgAAAAcAAAACEAYAAADEDxojAAAAQ0lEQVQI10WMWw0AIBRCz91MYBcD
WMI89LGCFcyEH1cnG48PABvA/gQ7PVOCC0mS7PLLETakQq2tjQFzrrX3u9Cd9zg0Ai9H03VKQwAA
AABJRU5ErkJggg==""")))

expected_image_data = [
    # First row
    (0xffff, 0x0000, 0x0000, 0xffff), # R
    (0x0000, 0xffff, 0x0000, 0xffff), # G
    (0x0000, 0x0000, 0xffff, 0xffff), # B
    (0x0000, 0x0000, 0x0000, 0xffff), # Black
    (0xffff, 0xffff, 0xffff, 0xffff), # White
    (0x0000, 0x0000, 0x0000, 0x0000), # Transparent
    (0x8080, 0x8080, 0x8080, 0xffff), # Gray
    # Second row
    (0xffff, 0xffff, 0x0000, 0xffff), # Yellow
    (0xffff, 0x0000, 0xffff, 0xffff), # Fuchsia
    (0x0000, 0xffff, 0xffff, 0xffff), # Cyan
    (0x1212, 0x3434, 0x5656, 0xffff), # Darkish blue
    (0xaaaa, 0xbbbb, 0xcccc, 0xffff), # Grayish blue
    (0xffff, 0xffff, 0xffff, 0x8000), # White 50 % transparency
    (0xffff, 0xffff, 0xffff, 0x4000), # White 25 % transparency
]

assert image.mode == "RGBA"
assert image.size == (7, 2)
assert list(image.getdata()) == expected_image_data

The test currently passes only if the image data is truncated to 8 bits:

# Truncate to 8 bits as long as Pillow does not support 16 bit PNGs
expected_image_data = [tuple(x >> 8 for x in px) for px in expected_image_data]

OpenCV and PyPNG can load this file and decode it to the expected image data.

You can write the image file to disk with the following bash command:

echo 'iVBORw0KGgoAAAANSUhEUgAAAAcAAAACEAYAAADEDxojAAAAQ0lEQVQI10WMWw0AIBRCz91MYBcDWMI89LGCFcyEH1cnG48PABvA/gQ7PVOCC0mS7PLLETakQq2tjQFzrrX3u9Cd9zg0Ai9H03VKQwAAAABJRU5ErkJggg==' | base64 -d > '16_bit_rgba.png'
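As a cross-check that does not depend on any image library, the PNG header of the test image can be read with the standard library alone. The sketch below uses only the first 44 base64 characters of the string above (the 8-byte signature plus the IHDR chunk); the helper name read_ihdr is made up for this example.

```python
import base64
import struct

# First 44 base64 characters of the test image: PNG signature + IHDR chunk.
PNG_PREFIX_B64 = "iVBORw0KGgoAAAANSUhEUgAAAAcAAAACEAYAAADEDxoj"


def read_ihdr(data: bytes) -> dict:
    """Parse width, height, bit depth and colour type from a PNG header."""
    if data[:8] != b"\x89PNG\r\n\x1a\n":
        raise ValueError("not a PNG file")
    length, chunk_type = struct.unpack(">I4s", data[8:16])
    if chunk_type != b"IHDR" or length != 13:
        raise ValueError("first chunk is not a valid IHDR")
    width, height, bit_depth, color_type = struct.unpack(">IIBB", data[16:26])
    return {
        "width": width,
        "height": height,
        "bit_depth": bit_depth,
        "color_type": color_type,  # 6 means truecolour with alpha (RGBA)
    }


header = read_ihdr(base64.b64decode(PNG_PREFIX_B64))
print(header)  # {'width': 7, 'height': 2, 'bit_depth': 16, 'color_type': 6}
```

So the file itself declares 16 bits per sample in RGBA; the 8-bit values come from how it is loaded, not from the file.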

Yay295 commented 1 year ago

I think the ImagingMemoryInstance struct needs to store the size of each band. I think that's what depth is for, but it's not currently being used. Every band would have to be the same size to work with our current code, so we would probably want to scale everything to the largest band in the image. I don't see a way to mix integer and floating point bands in the same image, so we would have to pick one and convert the rest.

Another thing that would be useful is to store the index of the alpha band, if there is one. Some of the code needs to know which band this is, and currently they figure this out based on the image mode. Having it as a property of the image would simplify that.
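A minimal sketch of what that bookkeeping might look like, written in Python for brevity. BandLayout, band_depths, alpha_index and scale_sample are all hypothetical names for illustration; the actual change would be to the C struct, and the rounding rule used here is just one reasonable choice.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class BandLayout:
    """Hypothetical per-image record of band depths and the alpha band."""

    band_depths: Tuple[int, ...]  # bits per band as found in the file
    alpha_index: Optional[int] = None  # index of the alpha band, if any

    @property
    def storage_depth(self) -> int:
        # Every band is stored at the depth of the largest band.
        return max(self.band_depths)


def scale_sample(value: int, from_bits: int, to_bits: int) -> int:
    """Rescale an unsigned sample so that full scale maps to full scale."""
    from_max = (1 << from_bits) - 1
    to_max = (1 << to_bits) - 1
    return (value * to_max + from_max // 2) // from_max  # rounded division


layout = BandLayout(band_depths=(1, 16, 1))
print(layout.storage_depth)  # 16
print(scale_sample(1, 1, 16))  # 65535
print(hex(scale_sample(128, 8, 16)))  # 0x8080
```

With alpha_index stored on the image, code that needs the alpha band would no longer have to infer it from the mode string.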

herronelou commented 1 year ago

Just here to echo @KeygenLLC 's comment above, RGB/RGBA images in 16 or 32-bit float is nearly mandatory for VFX, and we've been removing any usage of PIL we can find in tools we bring in, in favor of dealing directly with np.arrays and cv2 functions for manipulating the data as images. It's not as convenient as what PIL offers but 8bit is a deal breaker.

(We're also working with the EXR file format)

fxthomas commented 4 months ago

Focusing specifically on the mode definition:

char mode[IMAGING_MODE_LENGTH]; /* Band names ("1", "L", "P", "RGB", "RGBA", "CMYK", "YCbCr", "BGR;xy") */

Definitions for all of the modes that we're planning, and potentially a [format];MB[#bands] style generic mode.

Do we have a complete list of all supported mode strings (essentially the grammar) that would support all use cases? Something like saying (this may be wrong, this is just how I'm interpreting the earlier discussions):

[format] can be U8, I16, U16, F32 etc (pixel format of a single channel).
[bands] is optional and specifies the channel names/interpretation (RGB, YCbCr etc). If it's not specified I would assume it's just grayscale.
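To make the question concrete, here is one possible reading of the "[format];MB[#bands]" idea as a tiny parser. This grammar is purely an assumption for discussion: the format names, the SAMPLE_BYTES table, ParsedMode and parse_mode are not Pillow APIs, and named band layouts such as RGB or YCbCr are deliberately left out.

```python
import re
from typing import NamedTuple

# Assumed sample formats and their size in bytes.
SAMPLE_BYTES = {"U8": 1, "I16": 2, "U16": 2, "I32": 4, "F32": 4}

# FORMAT, optionally followed by ";MB<n>" for n generic bands.
_MODE_RE = re.compile(r"(?P<fmt>[A-Z]\d+)(?:;MB(?P<bands>[1-9]\d*))?")


class ParsedMode(NamedTuple):
    sample_format: str
    bands: int
    pixelsize: int  # bytes per pixel, assuming no padding


def parse_mode(mode: str) -> ParsedMode:
    """Parse a hypothetical generic mode string such as 'U16;MB13'."""
    match = _MODE_RE.fullmatch(mode)
    if match is None or match["fmt"] not in SAMPLE_BYTES:
        raise ValueError(f"unrecognised mode string: {mode!r}")
    bands = int(match["bands"] or 1)  # no suffix means single band (grayscale)
    return ParsedMode(match["fmt"], bands, bands * SAMPLE_BYTES[match["fmt"]])


print(parse_mode("U16;MB13"))  # ParsedMode(sample_format='U16', bands=13, pixelsize=26)
print(parse_mode("F32"))  # ParsedMode(sample_format='F32', bands=1, pixelsize=4)
```

Even this toy version shows where decisions are needed: whether band names belong in the string, and whether pixelsize may be padded.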

As for prior art, ffmpeg (which I'm more familiar with) has AVPixFmtDescriptor, which handles the memory layout for all their use cases; the equivalent of "modes" are then defined as the av_pix_fmt_descriptors static array. Is this sort of mechanism something that would be useful to reuse? Do we need to support packed/planar formats or half-resolution chroma planes?

What about extra embedded images that may have different dimensions, e.g. thumbnails or auxiliary depth/gain map/matte images? Should they be supported at all, and if so how?

Yay295 commented 4 months ago

the equivalent of "modes" are then defined as the av_pix_fmt_descriptors static array

The closest thing to that in Pillow I think would be this:

https://github.com/python-pillow/Pillow/blob/274924e64f8b53f46d04b122fe5d959f848a99b0/src/libImaging/Storage.c#L44-L225

aclark4life commented 3 months ago

Maybe can make some progress on this in 2024, pending acceptance of https://github.com/AcademySoftwareFoundation/tac/issues/631

aclark4life commented 2 months ago

in favor of dealing directly with np.arrays and cv2 functions for manipulating the data as images. It's not as convenient as what PIL offers but 8bit is a deal breaker.

@herronelou Can you (or anyone?) say any more about the convenience of PIL and how meaningful > 8 bit multichannel support in PIL would be? Would you switch back to PIL if this feature were added and would you expect an uptick in usage from VFX studios in general? I got interested in VFX recently so I'm especially curious about this issue now.

terramars commented 2 months ago

I can just say that for GIS, if you want to deal with tiffs that aren't extremely simple you're stuck going into gdal internals to do anything, even just read them into an array. I'm still sad 7 years later that I had to waste time learning that tool and couldn't just do Image.open on them. Maybe someone else has implemented it by now, but I doubt it.


herronelou commented 2 months ago

@aclark4life For the most part, VFX studios tend to work with EXR file formats. Internally most of our software processes in 32bit float, although saving the resulting images in 16bit float is usually enough, except for a small number of specific data passes that we tend to store in other channels.

I've not been doing much personally recently that could have used PIL, the main cases I've run into when I posted were external tools we brought into our pipeline that used PIL for their image reading, and we had to strip it away so we could run our 16bit float images through without the loss caused by going through 8bit, so yes, absolutely, if PIL supported those natively we wouldn't need to go out of our way to strip PIL away when somebody uses it, which would be great.

rbavery commented 2 months ago

PIL is used in many ML frameworks for reading images, like FastAI and detectron2 and countless ML projects. When someone tries to use these frameworks or projects as examples with their high bit depth multichannel images, often the first thing to cause grief is this issue. On multiple occasions I've had to rewrite image data loaders for ML because Pillow does not support multichannel float32 tifs. This imagery is really common in geospatial analysis, most satellite imagery comes in high bit depth.

aclark4life commented 2 months ago

@cgohlke Does any of your code here potentially help us by way of example to implement high bit depth multichannel in Pillow? https://github.com/cgohlke/tifffile/blob/master/tifffile/_imagecodecs.py

Thanks for any info

aclark4life commented 1 month ago

Via @wiredfool , thanks!

I think that there's a good argument for planar image storage, i.e. r/g/b in separate arrays. Any single band calculation would just work, and the more complicated modes (e.g., channels with different bit depth) would be trivial to add, as they would essentially just be part of a list of planes. It would complicate the shufflers, and especially those image formats that currently just splat into an array without using the packer/unpacker. It's also less useful for luminance style calculations, though it's possible. There's definitely a tension in image formats on the interleaved vs planar approach, and I suspect it comes down to "one is easier for basic images, and one is more general."
I think there's a super strong argument for being able to have our storage be directly compatible with the arrow memory layout. I'm unclear if we could have arbitrary structs there, if we'd just want a linear array of one datatype, or if we'd want to do a tensor layout, or what the mechanics are for a dataframe style interop. Arrow + the evolution of the array interface would give us 0 copy interaction with polars/pandas2 and anything else in the new data space.
I think that interleaved storage with anything more than 1|3|4 channel x [list of pixel storage modes] is going to be a pain.
GIS is going to be a pain. I'd still recommend using gdal backed (e.g. rasterio) readers/writers for that, as we've got 0 support for pyramids, spatial metadata, and tiled tiffs. It's a huge field, and we're not even at square 1 for it.

So looking at that, I think there's two definite possibilities for progress.

1) Planar Image Storage, in parallel with the current interleaved image storage. There's probably a couple of core bits here that would need to be in C, but most could probably be done at the Image.py layer.
2) Arrow as a core storage interface. This is going to be all C, with a very small shim for the dataframe interface.
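To illustrate the interleaved vs planar distinction behind the first option, here is a small standard-library sketch that converts between the two layouts. The function names are invented for this example and say nothing about how Pillow would actually implement planar storage.

```python
from array import array


def interleaved_to_planar(samples: array, bands: int) -> list:
    """Split [r0, g0, b0, r1, g1, b1, ...] into one array per band."""
    if len(samples) % bands:
        raise ValueError("sample count is not a multiple of the band count")
    # A strided slice picks every `bands`-th sample starting at each band offset.
    return [samples[band::bands] for band in range(bands)]


def planar_to_interleaved(planes: list) -> array:
    """Merge per-band arrays back into a single interleaved array."""
    out = array(planes[0].typecode)
    for pixel in zip(*planes):
        out.extend(pixel)
    return out


pixels = array("H", [1, 2, 3, 10, 20, 30])  # two RGB pixels, unsigned 16-bit
planes = interleaved_to_planar(pixels, bands=3)
print([list(p) for p in planes])  # [[1, 10], [2, 20], [3, 30]]
print(planar_to_interleaved(planes) == pixels)  # True
```

In the planar form each band is an independent array, so a per-band operation or a band with a different sample type does not disturb the others; the cost is the shuffling step whenever a file format or consumer wants interleaved pixels.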

aclark4life commented 1 month ago

Also possibly of interest: https://github.com/girder/large_image

wiredfool commented 1 month ago

FWIW, some references on Arrow.

https://arrow.apache.org/docs/format/Columnar.html The columnar format
https://arrow.apache.org/docs/format/Other.html A reference to the tensor arrangement.
https://arrow.apache.org/docs/python/interchange_protocol.html Dataframe interchange protocol.
https://github.com/apache/arrow-nanoarrow Nano-arrow, a very small implementation of the arrow layout
https://arrow.apache.org/docs/python/index.html PyArrow. I'm not sure we'd want to pull this in as a dependency, but it's a full set of bindings against the C++ Arrow interface.

aclark4life commented 1 month ago

Can anyone suggest some test data we can use to develop this feature? This event is happening tomorrow and would be nice to have a success target in mind e.g. "If we can read/write this type of data …" https://www.meetup.com/dcpython/events/301086016/

rbavery commented 1 month ago

I think that interleaved storage with anything more than 1|3|4 channel x [list of pixel storage modes] is going to be a pain.

In case it isn't too much pain to work with more than 4 bands, we host this example subset of Eurosat, here is an example image s3://wherobots-examples/data/eurosat_small/Highway/Highway_1.tif.

Each image is 13 bands, uint16, with a contiguous (interleaved) sample layout according to its PlanarConfiguration tag:

>>> import tifffile
>>> tiff_image = tifffile.TiffFile("Highway_1.tif")
>>> print(tiff_image.pages[0].tags['PlanarConfiguration'].value)
PLANARCONFIG.CONTIG

aclark4life commented 1 month ago

@wiredfool If we use Arrow that implies adding a dependency on pyarrow, ideally optionally via extras like pip install pillow[arrow], correct?

wiredfool commented 1 month ago

@aclark4life Maybe. There's definitely a C-only implementation (nanoarrow) that might be what we want, since all of our image allocations are in the C layer now. PyArrow might be easier for integration/interop at the high level, but my sense here is that it wouldn't necessarily be giving us a whole lot that we'd not already have with a C arrow implementation + our usual set of accessors.

aclark4life commented 2 weeks ago

Folks interested in this issue, please test #8157 and give feedback, thanks all
