phoboslab / qoi

The “Quite OK Image Format” for fast, lossless image compression
MIT License
6.83k stars 328 forks source link

Spec clarifications #253

Open andrews05 opened 1 year ago

andrews05 commented 1 year ago

Hi, I think this is a quite okay image format and I've been working on a Swift implementation. I have a couple of questions regarding edges cases:

Truncated data In the reference decoder, if there isn't enough data to cover all the pixels it will continue to fill the remainder of the image with the current pixel. So a file with no pixel data, just the header + padding, will be entirely black. The spec doesn't discuss this though, so my question is:

  1. Should I match the behaviour of the reference decoder here?
  2. Is it safe for the encoder to deliberately end early when the remainder of the image is the same? I ask because the format is relatively poor for long runs and I can save a significant amount in my use case just by ending early. (Ideally, I would replace the DIFF op code with an extended 14-bit run 😄)

Initial colour and the index The reference decoder stores pixel results from all op-codes in the index but the encoder does not store pixels for either runs or indexes into the index. For index op-codes this is irrelevant but there's a minor edge case for runs: If the image starts with black then it will encode a run, as the initial prev pixel is black. If the next pixel is completely different it may then do an RGB/RGBA. If we then have more black, we find the initial black never got put into the index so we have to do another RGB/RGBA. The spec doesn't explicitly cover this but I can see a couple of possible improvements in the code that could be made here, depending on what's considered valid:

  1. The encoder could store runs into the index (or just the initial black) for a very minor size optimisation in this edge case. The current decoder will handle this correctly.
  2. Alternatively, the decoder could assume the initial black never gets indexed and thus safely omit storing both index and run op-codes in the index for a performance boost.
  3. Both of the above, with the decoder starting by storing the initial black in the index to cover the edge case.

In any case, some clarification in the spec about this would be helpful 🙂

amstan commented 1 year ago

If we then have more black, we find the initial black never got put into the index

Yeah, the spec should clarify that the index starts set to all 0s (but this includes an alpha channel setting of 0).

phoboslab commented 1 year ago

So a file with no pixel data, just the header + padding, will be entirely black. The spec doesn't discuss this though

You're right. The spec doesn't mention how to handle broken/invalid QOI files. You may handle it on a "best effort" basis as the reference decoder does or just return an error. For what it's worth, the reference decoder also doesn't check if the padding at the end is valid. It just assumes it's there and it's 8 bytes wide.

The spec however says that your given example is indeed invalid:

An image is complete when all pixels specified by width * height have been covered.

That is: a valid QOI file touches all pixels. It should not be truncated.

If the next pixel is completely different it may then do an RGB/RGBA. If we then have more black, we find the initial black never got put into the index

Good catch! That's a shortcoming of the reference encoder. The spec is actually quite clear on this:

Each pixel that is seen by the encoder and decoder is put into [the index] (...)

Doing it either way (storing the run in the index or not) will produce a valid QOI file, so the encoder is still spec compliant even if it may not produce the smallest file possible. What performance/compression tradeoffs you make in the encoder is totally up to you, as long as it produces a valid QOI file. Your decoder however must handle this case - i.e. it must put the black pixel in the index.

Alternatively, the decoder could assume the initial black never gets indexed and thus safely omit storing both index and run op-codes in the index for a performance boost.

Both of these cases should be implicit: indexed colors are naturally already in the index and runs just repeat the previous color (which was put into the index by any other op). So if, in your decoder, you put the initial black into the index before decoding, you could safely omit storing to the index for all INDEX and RUN ops to get your performance boost. This should give you the same behavior as if you'd store colors for all ops in the index (please correct me if I'm wrong - I haven't thought about this thoroughly).

the spec should clarify that the index starts set to all 0s (but this includes an alpha channel setting of 0).

It does :)

A running array[64] (zero-initialized) of previously seen pixel values is maintained by the encoder and decoder.

amstan commented 1 year ago

It does :)

Sorry, I guess I was looking at your one pager and considering that as the "spec".

phoboslab commented 1 year ago

The one pager PDF is the spec. It is mirrored in the qoi.h.

andrews05 commented 1 year ago

You're right. The spec doesn't mention how to handle broken/invalid QOI files. You may handle it on a "best effort" basis as the reference decoder does or just return an error. For what it's worth, the reference decoder also doesn't check if the padding at the end is valid. It just assumes it's there and it's 8 bytes wide.

The spec however says that your given example is indeed invalid:

An image is complete when all pixels specified by width * height have been covered.

That is: a valid QOI file touches all pixels. It should not be truncated.

Cool, thanks for the clarification. Given that the reference decoder actually has well-defined behaviour in this case, it might be good to explicitly state in the spec that such an image is invalid and behaviour should be considered undefined. The words "An image is complete..." seems a little vague (I didn't read this as suggesting an incomplete image is invalid).

Both of these cases should be implicit: indexed colors are naturally already in the index and runs just repeat the previous color (which was put into the index by any other op). So if, in your decoder, you put the initial black into the index before decoding, you could safely omit storing to the index for all INDEX and RUN ops to get your performance boost. This should give you the same behavior as if you'd store colors for all ops in the index (please correct me if I'm wrong - I haven't thought about this thoroughly).

Right, you are correct here. Do you think it would be good for the reference decoder to do this? I measured a 4-10% improvement on qoibench when I tried it. Happy to submit a PR if you like. Note there is a slight danger here: It was very tempting to do the same for encoder (putting the initial black into the index to cover the edge case), but then I realised it would technically be spec-breaking if the image didn't start with black, even though it would work perfectly with the modified decoder. By having the reference decoder correctly handle such non-spec images, it could lead other encoder implementations to fall into the same temptation. [edit] Never mind the above, the issue is trivially avoided - rather than always storing the initial black, store the pixel for a run only if it's the first pixel.

homoclient commented 1 year ago

So if, in your decoder, you put the initial black into the index before decoding, you could safely omit storing to the index for all INDEX and RUN ops to get your performance boost. This should give you the same behavior as if you'd store colors for all ops in the index

I think that there is a very unlikely case where this method causes uncertainty. If the decoder chose to always store the initial opaque black pixel in the index before decoding, and later during the decoding process, every other 63 positions in the index had been touched while index[53] (the position for initial black) still untouched, and then an OP_INDEX 53 appeared, the decoder would possibly produce wrong result. Because the encoder producing this image might be using OP_INDEX 53 here as a way to put a transparent black, however the decoder assume the untouched index[53] is always initially opaque black.

If the encoder guarantees never referring to an untouched index color, this method will be perfectly fine, but it means more checking in the encoding process, and the encoder implementation will lose the opportunity of referring to an untouched index to get a transparent black. If the encoder can refer to any pixel in the index as it wish, even if that position hasn't got any color stored into, then I guess

[edit] Never mind the above, the issue is trivially avoided - rather than always storing the initial black, store the pixel for a run only if it's the first pixel.

would be the only choice. I have just realized this case when I am starting to write the encoder. I know this is ridiculously unlikely but since it doesn't violate the spec, I think it's worth considering.

KasumiArai commented 3 months ago

Apologies if this has been asked elsewhere: Given an 8x1 image, is a run of say 40 officially valid or invalid? That is, a run with a count of pixels that would go through the end of the image if the decoder did not stop. The spec makes me think they're valid.

An image is complete when all pixels specified by width * height have been covered.

To me, that reads as if the decoder should stop in the middle of an incomplete run if all pixels have been touched. I don't intend to encode such files, but I want to know if it's okay to reject them on an official basis. If they're valid, I'll have to re-encode them for my use case

phoboslab commented 3 months ago

Imho it's quite clear that an "8x1 image with a run of 40" is not a well formed QOI image; it breaks logical assumptions. So, yes, you may reject it.

To quote myself:

The spec doesn't mention how to handle broken/invalid QOI files. You may handle it on a "best effort" basis as the reference decoder does or just return an error

oscardssmith commented 3 months ago

I think it probably would be worth specifying that invalid files must error. other formats have had security issues due to different implementations making different decisions about what valid files are.

amstan commented 3 months ago

I think it probably would be worth specifying that invalid files must error.

When one implementing a decoder/encoder, one should probably follow https://deviq.com/laws/postels-law

I wouldn't add too many checks/error triggers if the result of decoding still yields something.

KasumiArai commented 3 months ago

Thank you! Certainly it's not up to the spec to tell how to handle an invalid file, I just needed to know if such a file was considered invalid. If it was not, I'd have been rejecting valid files. I'm happy to reject now. I may re-encode later.

(If anyone was curious, I'm making a real-time decoder on an embedded device with a 67mhz CPU. Not doing error checking made my decoder twice as fast. I do verify the files on load before they are passed to the unsafe decoder at runtime, so I could have re-encoded the "questionable" ones at that time, but it would have been extra work.)

amstan commented 3 months ago

(If anyone was curious, I'm making a real-time decoder on an embedded device with a 67mhz CPU.

Nice!

Not doing error checking made my decoder twice as fast.

Omg, don't even think about it then, why would you have superflous checks??

Design your decoder so it doesn't need checks, but it just decodes in a safe manner to begin with. It's probably a better design.

Note how qoi_decode only returns NULL when it's super easy to tell something's wrong. It certainly doesn't have ifs inside the tight loop that will reject the file by returning NULL. The only safety critical thing it does is making sure p doesn't grow bigger than size