PNG currently supports bit depths of 1, 2, 4, 8, and 16 per channel. This is specified in the IHDR chunk: https://www.w3.org/TR/PNG/#11IHDR
HDR commonly makes use of 10 bits per channel. Should we consider specifying a 10-bit/channel addition?
Thought dump of my initial thoughts:
Maybe we could treat it like a standard 8-bit/channel PNG for backwards compatibility.
For non-palette images, a new chunk could say "Actually, we're 10-bit/channel. Here are the extra 2 low-order bits in a new IDAT-like chunk."
Because they would be low-order, they only add detail. So an older decoder would get as close as it can.
One drawback I foresee: being the low-order bits, where the fine details exist, I imagine this data won't compress well. The filter (which normally benefits from local similarity) may not be very helpful either.
For palette images, there could be a new PLTE-like chunk that uses the same indices and adds the extra bits.
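A minimal sketch of the reconstruction this idea implies; the chunk layout and the helper are hypothetical, nothing here is in the spec:

```c
#include <stdint.h>

/* Hypothetical sketch: the ordinary IDAT stream carries the high 8 bits
 * of each sample, and a new IDAT-like chunk carries the extra low 2 bits.
 * A legacy decoder uses high8 alone and gets as close as it can; a new
 * decoder appends the extra bits to recover the full 10-bit value. */
static uint16_t reconstruct_10bit(uint8_t high8, uint8_t low2)
{
    return (uint16_t)(((uint16_t)high8 << 2) | (low2 & 0x3u));
}
```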
Rather than trying to change PNG in a way that is clearly not backwards-compatible, why not use this as an opportunity to move the web community to modern raster image formats that do support 10 bits per channel, such as AVIF and JPEG-XL?
Stating how many bits are significant (when samples are padded up to the next-largest power of two) is the job of the sBIT chunk.
The question is not how big a raw image is at 10-bit rather than 16, but how big the compressed image is once a) filtered and b) zlib-compressed.
I imagine (but have not verified experimentally, which would be an interesting result to see) that applying the existing filters and compression to packed 10-bit data would not give better results than applying them to 16-bit data with 6 zeroed low-order bits.
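For concreteness, a minimal sketch (my own, not from the spec text) of that 10-in-16 packing; with sBIT = 10, the low 6 bits are just padding:

```c
#include <stdint.h>

/* 10 significant bits stored in a 16-bit PNG sample, low 6 bits zeroed.
 * sBIT = 10 tells an aware decoder which bits actually carry data. */
static uint16_t pack10(uint16_t v10)   { return (uint16_t)(v10 << 6); }
static uint16_t unpack10(uint16_t v16) { return (uint16_t)(v16 >> 6); }
```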
Leonard, I definitely don't want to break backwards compatibility. If that had to happen, I would consider nudging toward other formats. But I think we can continue without breaking backwards compatibility.
Chris, I think you're right that sBIT is the way to handle this. And that 16bit's extra zeroes likely compress quite well.
Would it be worthwhile to have a special callout to implementers of encoders/decoders? I had actually read the sBIT chunk doc you linked while considering this, but I still skipped over it, having not fully understood its purpose. For example, as a decoder that wants to use a 10-bit texture format, I need to check both the IHDR and the sBIT. I would be willing to bet money that most implementers look only at the IHDR and stick with a format that matches it.
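For illustration, a decoder-side sketch with libpng (assuming a read context whose header has already been parsed):

```c
#include <png.h>

/* Returns the bit depth a texture should actually use: the IHDR depth
 * unless an sBIT chunk narrows it (e.g. IHDR says 16 but sBIT says 10).
 * Assumes equal significant depths across the color channels. */
static int effective_bit_depth(png_structp png, png_infop info)
{
    png_color_8p sig_bit;
    if (png_get_sBIT(png, info, &sig_bit))
        return sig_bit->red;
    return png_get_bit_depth(png, info);
}
```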
Thinking out pros and cons:
Using the sBIT chunk, we scale colors to be lighter: whites remain white, but blacks become gray. Decoders that understand the sBIT chunk will correct for this; decoders that don't will display the awkwardly brighter image.
Using a new, extra chunk that says "You already have 8 bits; here are 2 more" will be handled better by decoders that don't understand the chunk: they will display black blacks and white whites. The difference would literally be the extra bits of precision, i.e. 0/255 red vs. 1/1023 red. It'll be as close as it can get to the actual image.
So is it worth it? I'm not sure. Adding new chunks (especially these) would be awkward. I'm not sure how many decoders already understand the sBIT chunk, and I don't know how many apps map a 10-bit sBIT to a 10-bit texture.
Right now, I think the sBIT option is better. What would change my mind is learning that many decoders don't understand the sBIT chunk while a lot of 10-bit images end up in use. If we anticipate that to be the case, I think the awkward new chunks are the better option.
2021-08-02: Add a note/warning that the sBIT chunk is expected to be used, because 10-bit color channels are regularly used with HDR imagery. The note should also cover 12-bit color channels. Add a recommendation on how the encoder should fill the unused bits.
Both 10- and 16-bit PNGs compress very inefficiently. Every second byte holds the least significant bits and every other the most significant bits. These have very different probability distributions, and thus different entropy codings; unfortunately, PNG is not able to context-model this.
A decent solution would be to use WebP lossless twice, once for the high 8 bits and once more for the low bits, or to just use JPEG XL's lossless coding. Either of these will be 2-3x fewer bytes than using PNG for HDR.
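A sketch of the two-plane idea being described (illustrative only; neither PNG nor WebP has such a mode today):

```c
#include <stddef.h>
#include <stdint.h>

/* Split interleaved big-endian 16-bit samples into a most-significant-
 * byte plane and a least-significant-byte plane, so the two very
 * different probability distributions can be entropy coded separately. */
static void split_planes(const uint8_t *bytes, size_t n_samples,
                         uint8_t *msb, uint8_t *lsb)
{
    for (size_t i = 0; i < n_samples; i++) {
        msb[i] = bytes[2 * i];
        lsb[i] = bytes[2 * i + 1];
    }
}
```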
I didn't fully understand what you are saying.
Does WebP have one compression stream of the high bytes and another compression stream for the low bytes?
Have you considered storing 10 bits ab cdef ghij in a 16-bit container as abcd efgh ij00 0000? Another option is abcd efgh ij01 1111, which may be more compatible with 16-bit workflows that aren't enabled for, or aware that, the data is actually 10-bit. I've seen a similar process work well for digital cinema DCDM 12-bit image data stored in 16 bits of TIFF as abcd efgh ijkl 0111, as described in SMPTE RP 428-5 Section 4.3 [1].
[1] https://ieeexplore.ieee.org/document/7291227
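In code, the fill patterns above look something like this (my sketch; the 12-bit variant follows the RP 428-5 pad-with-0111 convention cited above):

```c
#include <stdint.h>

/* v10 holds abcdefghij in its low 10 bits; v12 holds abcdefghijkl. */
static uint16_t pad10_zeros(uint16_t v10) { return (uint16_t)(v10 << 6); }          /* abcd efgh ij00 0000 */
static uint16_t pad10_ones (uint16_t v10) { return (uint16_t)((v10 << 6) | 0x1f); } /* abcd efgh ij01 1111 */
static uint16_t pad12_smpte(uint16_t v12) { return (uint16_t)((v12 << 4) | 0x07); } /* abcd efgh ijkl 0111 */
```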
@svgeesus's first comment about using the sBIT chunk is exactly "Storing the 10 bits in a 16 bit container". It lets you use a 16 bit PNG but specify that you're only really using 10 of those 16 bits.
Filling the spare bits with the 0111... pattern is clever :D
Often we use abcd efgh ijab cdef -- that way we can reach both 0000 0000 0000 0000 and 1111 1111 1111 1111. Being able to express black and white the same regardless of the bit depth interpretation helps in interoperability and simplifies getting things right.
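A one-liner sketch of that replication, assuming the 10-bit value sits in the low bits:

```c
#include <stdint.h>

/* abcdefghij -> abcdefghij abcdef: 0 stays 0x0000 and 1023 becomes
 * 0xFFFF, so black and white are exact under either interpretation. */
static uint16_t replicate10(uint16_t v10)
{
    return (uint16_t)((v10 << 6) | (v10 >> 4));
}
```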
I didn't fully understand what you are saying.
Does WebP have one compression stream of the high bytes and another compression stream for the low bytes?
Not yet. We would need to add that to the spec and to the decoder. It would be technically very simple to add to both. The only minor difficulty is integration with HDR, but that problem exists with any solution.
JPEG XL already has this support in the spec, as well as a working demo using the ColorWeb-CG ideas: https://eustas.github.io/jxl-demo/index.html?colorSpace=rec2100-hlg&img=2 is a WASM demo, and using it requires Chrome's experimental HDR canvas flag to be enabled.
As detailed above, PNG today supports packing 10-bit words into 16-bit words (using the sBIT chunk).
More efficient coding of 10-bit (and 16-bit) words requires a different coding technique, but that is a totally different (and more complex) issue IMHO.
"More efficient coding of 10-bit (and 16-bit) words requires a different coding technique, but that is a totally different (and more complex) issue IMHO."
When I stumbled onto this thread, my first thought was the existing DPX format of three 10-bit channels packed into four bytes, but I don't think that fits well into PNG.
But if there is a desire to conserve bandwidth, it occurred to me that the color type 6, 8-bit RGBA PNG might easily be modified so that the two LSBs of each 10-bit channel are mapped onto the 6 LSBs of the alpha channel, and the 2 MSBs of the alpha could still be used for a one- or two-bit alpha.
A two-bit alpha could be combined with the tRNS chunk to have 4 indexed transparency values (though that may cause compatibility issues); otherwise: 0b00 = 0% opaque, 0b01 = 33%, 0b10 = 66%, 0b11 = 100% opaque.
The next question is: is there a configuration where a decoder/viewer that is not capable of handling this segmented10bit format would just discard the bits in the alpha as if fully opaque? In this case the LSBs would be truncated, and while truncation is a poor way to handle downsampling and does have artifacts, the image would still be reasonably viewable.
The sBIT chunk provides the way to make this happen: to mask the LSBs in the alpha and also show the 2 MSBs of transparency, so a 10-bit image could display with a reasonable fallback in a naïve/legacy viewer. Setting sBIT to 8 8 8 2, current decoders/viewers should display the truncated-to-8 image okay, and with the alpha sBIT at 2 bits, only the two MSBs would be used, with the 6 LSB RGB bits being hidden.
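An encoder-side sketch of that sBIT setting with libpng (illustrative only; the proposal itself is not part of any spec):

```c
#include <png.h>

/* Mark only 8 bits of each color channel and 2 bits of alpha as
 * significant, so legacy viewers fall back to a truncated 8-bit image. */
static void set_segmented10_sbit(png_structp png, png_infop info)
{
    png_color_8 sig_bit;
    sig_bit.red = sig_bit.green = sig_bit.blue = 8;
    sig_bit.gray = 0;
    sig_bit.alpha = 2;
    png_set_sBIT(png, info, &sig_bit);
}
```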
As per the graphic below, the IHDR chunk would indicate 8-bit and color type 6, for fallback compatibility. So how do we tell the decoder we're a segmented 10-bit image? We use a tEXt chunk with one string that says segmented10bit to signal the format, and this maintains backwards compatibility. A sketch of the packing follows.
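A per-pixel sketch of the proposed packing (the ordering of the two-bit fields within the alpha byte is my assumption):

```c
#include <stdint.h>

typedef struct { uint8_t r, g, b, a; } rgba8;

/* Pack 10-bit R/G/B plus a 2-bit alpha into an ordinary 8-bit RGBA
 * pixel: the high 8 bits of each channel go where a legacy viewer
 * expects them; the alpha byte carries the 2-bit alpha in its MSBs
 * and the three channels' 2 LSBs below it. */
static rgba8 pack_segmented10(uint16_t r10, uint16_t g10,
                              uint16_t b10, uint8_t a2)
{
    rgba8 p;
    p.r = (uint8_t)(r10 >> 2);
    p.g = (uint8_t)(g10 >> 2);
    p.b = (uint8_t)(b10 >> 2);
    p.a = (uint8_t)(((a2  & 0x3) << 6) |
                    ((r10 & 0x3) << 4) |
                    ((g10 & 0x3) << 2) |
                     (b10 & 0x3));
    return p;
}
```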
To sum up:
1) A 10-bit PNG format with all the advantages of PNG, and a bit depth to match many current video and HDR formats.
2) It should compress similarly to an 8-bit RGBA PNG. The first three bytes are not unlike an 8-bit image; the LSB/alpha byte might not compress as well as a typical alpha, but overall this scheme should be substantially more efficient than 10-in-16.
3) So long as the decoder supports sBIT, this should be backwards compatible, with the caveat that images with truncated LSBs may have artifacts.
4) Finally, though tests need to be run, this seems like an efficient way to handle 10-bit images as far as compression and total data size are concerned.
Curious to hear your thoughts.
Thank you for reading.
@ProgramMax What about moving this thread to https://github.com/w3c/PNG-spec, now that there is a dedicated WG for PNG maintenance?