w3c / png

Maintenance of the PNG specification
https://w3c.github.io/png/

PNG Restart Marker specification #60

Open · randy408 opened this issue 2 years ago

randy408 commented 2 years ago

This extension introduces restart markers for PNG images via a new chunk and a backward-compatible encoding method, enabling parallel encoding and decoding of image data.

The IDAT stream is split into segments, with each segment corresponding to a subset of the image. For compression method 0 (DEFLATE), each segment ends with a full flush marker, allowing the image data segments to be decompressed in parallel.
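For illustration, a minimal zlib sketch of how an encoder might terminate a segment at a full flush point. This assumes an already-initialized deflate stream and a large enough output buffer; a real encoder would loop while `avail_out` runs out, and the final segment would end with `Z_FINISH` instead:

```c
#include <stddef.h>
#include <zlib.h>

/* Sketch: compress one segment of filtered scanlines, ending it with a
 * full flush point so the next segment can be inflated independently.
 * Assumes strm was set up with deflateInit(). */
int deflate_segment(z_stream *strm, const unsigned char *rows, size_t len,
                    unsigned char *out, size_t out_cap, size_t *out_len)
{
    strm->next_in   = (Bytef *)rows;
    strm->avail_in  = (uInt)len;
    strm->next_out  = out;
    strm->avail_out = (uInt)out_cap;

    /* Z_FULL_FLUSH aligns the output to a byte boundary and resets the
     * LZ77 window, so inflate() can restart cleanly right after it. */
    if (deflate(strm, Z_FULL_FLUSH) != Z_OK)
        return -1;

    *out_len = out_cap - strm->avail_out;
    return 0;
}
```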

To eliminate data dependencies between segments, the filter type of each segment's first scanline is restricted to filters that do not reference the previous scanline, allowing each segment to be encoded and decoded in parallel.
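Concretely, of the five filters defined for filter method 0, only None (0) and Sub (1) avoid the previous scanline, so a conformance check for a segment's first row could be as simple as:

```c
/* Filters Up (2), Average (3) and Paeth (4) all reference the prior
 * scanline, which would reintroduce a dependency on the previous
 * segment; only None (0) and Sub (1) are safe for a segment's first row. */
int first_row_filter_ok(unsigned char filter_type)
{
    return filter_type == 0 || filter_type == 1;
}
```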

It defines one segmentation method (which splits the image horizontally into N strips) and two segmentation types, one with offsets and one without. I believe there should be only one, but for now this will allow us to implement both and weigh the pros and cons.
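A sketch of the strip split; the rounding policy here (spreading the remainder rows over the first strips) is an assumption for illustration, not taken from the draft:

```c
#include <stdint.h>

typedef struct { uint32_t first_row, row_count; } strip;

/* Split `height` rows into `n` horizontal strips; any remainder rows
 * are distributed one each to the first strips. */
void split_rows(uint32_t height, uint32_t n, strip *out)
{
    uint32_t base = height / n, rem = height % n, row = 0;
    for (uint32_t i = 0; i < n; i++) {
        out[i].first_row = row;
        out[i].row_count = base + (i < rem ? 1 : 0);
        row += out[i].row_count;
    }
}
```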

https://github.com/libspng/png-restart-marker

The repository includes a RATIONALE.md covering some of the design decisions. I also have an implementation, but it is for an older revision of this spec; I will have that ready soon.

EDIT: as for those pesky interlaced images, I can't think of a sane way to split them into more than 2 segments, so I just excluded them from the spec entirely.

randy408 commented 2 years ago

I wonder what should be required in terms of error recovery. Others have already brought up specially crafted images where decoders have to guard against extra trailing data in each subset of the image data; otherwise extra rows can be inserted between segments that would only show up with a non-parallel decoder.

The worst part is that this is technically standard-conformant, because trailing data at the very end is allowed due to underspecification. From the perspective of the PNG standard, the extra rows are simply the ones at the end of the deflate stream.

This also means a parallel decoder can enter a state where the majority of the decoded rows are invalidated, because the extra data caused them to be decompressed from the wrong offset; at that point the mapping from image data to row numbers is undefined.
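The check implied here is that each segment must inflate to exactly its expected byte count while consuming exactly its compressed span; anything left over is smuggled data. A rough zlib sketch, assuming the segment starts with the zlib header (i.e. the first segment; later segments begin at a raw full-flush boundary and would need `inflateInit2()` with negative windowBits):

```c
#include <stddef.h>
#include <zlib.h>

/* Sketch: inflate one segment and verify it yields exactly expect_out
 * bytes of filtered data while consuming all in_len compressed bytes.
 * Leftover input, or a short output, indicates extra hidden rows. */
int inflate_segment_checked(const unsigned char *in, size_t in_len,
                            unsigned char *out, size_t expect_out)
{
    z_stream strm = {0};
    if (inflateInit(&strm) != Z_OK)
        return -1;

    strm.next_in   = (Bytef *)in;
    strm.avail_in  = (uInt)in_len;
    strm.next_out  = out;
    strm.avail_out = (uInt)expect_out;

    int ret = inflate(&strm, Z_SYNC_FLUSH);
    int ok  = ret == Z_OK && strm.avail_in == 0 && strm.avail_out == 0;

    inflateEnd(&strm);
    return ok ? 0 : -1;
}
```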

Assuming all of this is checked for, treating it as a hard error wouldn't be good enough: it would mean an optional chunk could cause an otherwise standard-conformant PNG stream to fail to decode in some decoders but not in others, which can lead to vulnerabilities.

The PNG standard requires conformant streams to be decoded without issues. The logical conclusion is that hard errors are not allowed during parallel decoding and that falling back to single-threaded decoding is always required.

If linear decoding does not fail, the decoder cannot report an error; that's how I interpret it.

The question is how that should be implemented; technically it's the library's job not to fail. One sensible solution would be to reset to the first IDAT chunk and start over on the caller's thread, but that is impractical with progressive decoding or when threading is handled by the application.
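One way to express that fallback policy; `decode_parallel()` and `decode_linear()` are hypothetical helpers, not from any existing library:

```c
struct png_stream;  /* opaque decoder state; hypothetical */

int decode_parallel(const struct png_stream *png, unsigned char *out);
int decode_linear(const struct png_stream *png, unsigned char *out);

/* Sketch of the fallback rule: a parallel decode failure is never
 * surfaced to the caller; the stream is re-decoded linearly, since the
 * restart-marker chunk is optional and a conformant stream must decode. */
int decode_image(const struct png_stream *png, unsigned char *out)
{
    if (decode_parallel(png, out) == 0)
        return 0;
    return decode_linear(png, out);
}
```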

svgeesus commented 1 year ago

Since this proposal has no backwards-compatibility issues, I feel we as a group should discuss it at some point and at least reach agreement (or not) on the overall direction. It could then be added to the PNG Extensions spec if we want more experience with it before adding it to the main PNG spec.

LegionMammal978 commented 2 months ago

One concern for decoders: if segmentation type 0 is used, then a decoder must ensure that the IDAT chunk at the start of each segment doesn't overlap with a chunk at the end of the previous segment, i.e., every offset must correspond to a chunk boundary. Otherwise, it could process something that looks like an IDAT chunk but is actually just data inside another chunk.
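A sketch of that check, assuming the decoder records the data offset of every IDAT chunk during its initial chunk scan (whether the spec's offsets are file-relative or relative to the concatenated IDAT payload, the same idea applies):

```c
#include <stddef.h>
#include <stdint.h>

/* Sketch: a segment offset is acceptable only if it coincides with the
 * start of some IDAT chunk's data; otherwise it may point into the
 * middle of another chunk that merely resembles an IDAT chunk. */
int offset_is_chunk_boundary(uint64_t offset,
                             const uint64_t *idat_offsets, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (idat_offsets[i] == offset)
            return 1;
    return 0;
}
```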

Also, if segmentation type 0 allows each segment to contain more than one IDAT chunk, it might be worthwhile to note that explicitly, since it means a parallel decoder couldn't just ignore the offsets and look only at the IDAT chunks.