drj11 / pypng

Pure Python library for PNG image encoding/decoding
MIT License

Question on how to best handle sBIT=10 data #113

Open tvercaut opened 3 years ago

tvercaut commented 3 years ago

Hi,

Thanks for the nice library. I am trying to use it for 10-bit images and am wondering what the recommended procedure is. If I read an image with read, the data I get back is np.uint16 and rescaled, which seems to be the intended default behaviour (http://www.libpng.org/pub/png/book/chapter11.html#png.ch11.div.7). I am interested in getting the raw values though. If I read it with asDirect, the values seem to be the original ones but the type is apparently int64. What am I missing?

Below is a code snippet demonstrating this behaviour:

import png
import numpy as np
import imageio

# Create simple image
bitdepth = 10
im = np.zeros((5,5),dtype=np.uint16)
im[2,2] = (1<<bitdepth)-1
#print(im)

# Write png
metadata = { "bitdepth": bitdepth, }
filename = 'gray10-test.png'
png.from_array(im,'L',metadata).save(filename)

# Write png as 16bits
filename_as16 = 'gray10-test-as16.png'
png.from_array(im,'L').save(filename_as16)

# Read it back
r = png.Reader(filename)
w, h, pixels, metadata = r.read()
print('read',w, h, metadata)
im = np.vstack(list(map(np.asarray, pixels)))
print('read',np.min(im),np.max(im),im.shape,im.dtype)

# Read it back using asDirect
r = png.Reader(filename)
w, h, pixels, metadata = r.asDirect()
print('asDirect',w, h, metadata)
im = np.vstack(list(map(np.asarray, pixels)))
print('asDirect',np.min(im),np.max(im),im.shape,im.dtype)

# Check with imageio
im_io = imageio.imread(filename)
print('imageio',np.min(im_io),np.max(im_io),im_io.shape,im_io.dtype)

im_io16 = imageio.imread(filename_as16)
print('imageio with as16',np.min(im_io16),np.max(im_io16),im_io16.shape,im_io16.dtype)

im_io_scaled = im_io >> (16 - bitdepth)
print('scaled from imageio',np.min(im_io_scaled),np.max(im_io_scaled),im_io_scaled.shape,im_io_scaled.dtype)

with the corresponding output:

read 5 5 {'greyscale': True, 'alpha': False, 'planes': 1, 'bitdepth': 16, 'interlace': 0, 'size': (5, 5)}
read 0 65535 (5, 5) uint16
asDirect 5 5 {'greyscale': True, 'alpha': False, 'planes': 1, 'bitdepth': 10, 'interlace': 0, 'size': (5, 5)}
asDirect 0 1023 (5, 5) int64
imageio 0 65535 (5, 5) uint16
imageio with as16 0 1023 (5, 5) uint16
scaled from imageio 0 1023 (5, 5) uint16

On a related note, as a first-time user of the library, I looked at the code examples here but stumbled onto a deprecation warning, which I silenced by introducing a call to list in between vstack and map when creating the numpy array. I am not sure if this is the best way, but it might be good to have a fix for this warning in the documentation.

Best wishes, Tom

drj11 commented 3 years ago

One point worth knowing is that in PNG format, the "raw values", as you say, are either 8-bit or 16-bit (or, in the greyscale case, 1-, 2-, or 4-bit). I guess that is raw from the perspective of the PNG format, not from the perspective of whatever the originating imaging device is. So when there is 10-bit data, it's actually 16-bit data with an sBIT chunk that says to interpret it as 10-bit.
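For what it's worth, the stored significant-bits values can be read straight off the chunk stream, without decoding any pixel data, via Reader.chunks(). A minimal sketch, assuming a recent PyPNG where chunk types come back as bytes, and using the gray10-test.png file from your example above:

import png

# Walk the raw chunk stream; for a greyscale image the sBIT chunk
# content is a single byte giving the number of significant bits.
r = png.Reader('gray10-test.png')
for chunk_type, content in r.chunks():
    if chunk_type == b'sBIT':
        print('significant bits per channel:', list(content))  # [10]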

From what you've said it looks like everything is working as documented on PyPNG's side, although i agree it may be less than helpful for people using numpy.

It is worth bearing in mind that PyPNG has nothing to do with numpy, and as the numpy module offends my sensibilities, it is the result of a delicate compromise that the documentation mentions numpy at all. So we should clear up one misconception: read() is not returning numpy.uint16 values, it is returning ... well, a type i would rather not discuss. But see below.

I've now run your code example, very useful. What is happening:

In the first case numpy "cleverly" notices that the values are a Python array.array instance and therefore translates those array values to numpy 16-bit values. In the second case numpy only sees a list of plain ints; it therefore cannot know what size the actual numeric values are (Python ints have unbounded range and precision), so it conservatively chooses its default int type, which is a 64-bit type.
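A tiny demonstration of that inference, assuming a 64-bit platform (the exact default integer type varies):

import array
import numpy as np

# array.array carries a typecode ('H' is unsigned 16-bit), so numpy
# can pick the matching fixed-width dtype.
print(np.asarray(array.array('H', [0, 1023])).dtype)  # uint16

# A plain list of Python ints carries no width information, so numpy
# falls back to its default integer type.
print(np.asarray([0, 1023]).dtype)  # int64 on most 64-bit platforms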

PyPNG (me, really) is under no obligation to make any guarantees about the exact type of the rows it returns, other than that each row will be a Python sequence of integer values. Therefore if it chooses to return an array.array in some cases and a plain Python list in other cases, then that is fine.

You should not rely on any particular behaviour. So, moving on to how we should deal with this in numpy.

Probably the best thing to do is fix the type on the numpy side. That looks like:

im = np.vstack([np.asarray(row, dtype=np.uint16) for row in pixels])

A couple of further observations:

On memory: it seems foolish of numpy to deprecate using vstack on iterables; it would have been the natural way to efficiently stream values into a numpy array. Converting the rows into a list, as numpy will soon require, means that the entire image will be stored as a list of lists of Python ints, even if only temporarily. This will be seriously memory intensive (roughly 2 64-bit words per pixel on a 64-bit platform).

On speed (and memory): it will be faster and, as it happens, more memory efficient to do the scaling in numpy.

row >> 6

will rescale a row.
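Putting that together, a sketch of the read-then-shift route; the shift of 6 assumes we already know (from elsewhere, for example the sBIT chunk) that the underlying data is 10-bit:

import png
import numpy as np

# read() returns rows rescaled to the full 16-bit range; shifting
# right by 16 - 10 = 6 bits takes them back to the 10-bit range.
r = png.Reader('gray10-test.png')
w, h, pixels, metadata = r.read()
im = np.vstack([np.asarray(row, dtype=np.uint16) for row in pixels])
im10 = im >> 6
print(im10.min(), im10.max())  # 0 1023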

Thanks for the heads-up on numpy.vstack; why numpy keeps deprecating perfectly good clear code i do not know, but it is a minor source of annoyance every couple of years or so.

tvercaut commented 3 years ago

Thanks for the detailed and clear answer.

Is there a simple way to use the faster (as far as I understood your response) read-plus-scaling-in-numpy approach but still know the underlying bitdepth? I would need to know the original bitdepth to work out what the scaling factor is, yet using read gives a bitdepth of 16 in the returned metadata. Using read for the data and asDirect for the metadata seems a bit counter-intuitive to me.

Anyway, it looks like, if speed is not really a concern, using asDirect and providing the expected dtype to np.asarray is a suitable approach.
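For the record, a minimal sketch of that approach:

import png
import numpy as np

r = png.Reader('gray10-test.png')
w, h, pixels, metadata = r.asDirect()
# asDirect reports the sBIT depth (10 here) rather than the 16-bit
# storage depth, so the metadata carries the original precision.
im = np.vstack([np.asarray(row, dtype=np.uint16) for row in pixels])
print(metadata['bitdepth'], im.dtype, im.min(), im.max())  # 10 uint16 0 1023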