image-rs / jpeg-decoder

JPEG decoder written in Rust
Apache License 2.0

Progressive images now render incomplete coefficients #177

Closed · quilan1 closed this 3 years ago

quilan1 commented 3 years ago

Progressive JPEGs have multiple passes over their spectral coefficients. This library currently waits for all coefficients to have been read in from the various scan segments before it sends the coefficient data to the DCT engine.

However, some images have (perhaps incorrectly) incomplete passes over the coefficient data, where e.g. a component might only have spectral coefficients 0 through 5 sent, likely due to an improperly written encoder. In these scenarios the library yields the error 'not all components have data' instead of taking a more forgiving approach.

This change adds a final pass over the components when decoding a progressive frame. The pass checks all components to determine whether any have missing coefficients (i.e. haven't been fully processed yet); if so, it sends those components to the DCT engine to be processed and then collected. This way, the library is more forgiving of incomplete spectral information in certain progressive files.
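For context, the shape of that final pass is roughly as in the sketch below. The names here (Component, finalize_progressive, run_idct) are hypothetical stand-ins, not this crate's actual internals:

```rust
/// Hypothetical per-component state for this sketch.
struct Component {
    /// DCT coefficients accumulated from the scans seen so far.
    coefficients: Vec<i16>,
    /// Whether every expected progressive pass for this component was seen.
    finished: bool,
}

/// After the last scan of a progressive frame, push any component whose
/// coefficient data was never completed through the IDCT anyway, rather
/// than failing with a "not all components have data" style error.
fn finalize_progressive(components: &mut [Component]) {
    for component in components.iter_mut().filter(|c| !c.finished) {
        // Coefficients that never arrived simply stay at zero; the blocks
        // are reconstructed from whatever spectral data was received.
        run_idct(&component.coefficients);
        component.finished = true;
    }
}

fn run_idct(_coefficients: &[i16]) {
    // Placeholder: the real code hands the block data to the DCT engine
    // and collects the output.
}
```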

This addresses cases like #96 (although the original image is long since lost to history; the image from image-rs/image#738 was used for testing, as well as multiple constructed examples).

quilan1 commented 3 years ago

I can close this if the library intends to be a stricter interpreter of the standard, but I feel this change brings the experience closer to what most consumers expect: a best-effort rendering. Also, the code was mostly taken, with minor modification, from the decode_scan function; if any refactoring and/or abstraction is required, that can be done. I figured the code was short enough that it probably didn't need it.

HeroicKatora commented 3 years ago

/maintainer hat on: having the code to implement the fallback is incredibly useful.

/maintainer hat off: Personally, 'best-effort' sounds very nice. However, I do believe it should not be encouraged by default, as it can ossify non-standard behavior, preventing future changes as well as presenting a barrier to entry for interoperability with new requirements that cannot be found in a specification. There are a few mechanisms to achieve this, for example:

quilan1 commented 3 years ago

Completely understood. I'm new to the process here, so if that is the case, what would the next steps be, and how could I help facilitate this?

That said, in this specific case, after reviewing the specification I don't necessarily think this particular best-effort approach is incorrect nor, more importantly, against spec. However, I don't have access to T.83 to check against the compliance documentation.

From T.81 though, I've been considering the language of "4.5 Modes of operation":

There are two procedures by which the quantized coefficients in the buffer may be partially encoded within a scan. First, only a specified band of coefficients from the zig-zag sequence need be encoded. This procedure is called spectral selection, because each band typically contains coefficients which occupy a lower or higher part of the frequency spectrum for that 8x8 block. Secondly, the coefficients within the current band need not be encoded to their full (quantized) accuracy within each scan. Upon a coefficient’s first encoding, a specified number of most significant bits is encoded first. In subsequent scans, the less significant bits are then encoded. This procedure is called successive approximation. Either procedure may be used separately, or they may be mixed in flexible combinations.

and "K.7.1 Progressive coding of the DCT"

The simplest progressive coding technique is spectral selection; indeed, because of this simplicity, some applications may choose – despite the limited progression that can be achieved – to use only spectral selection. Note, however, that the absence of high frequency bands typically leads – for a given bit rate – to a significantly lower image quality in the intermediate stages than can be achieved with the more general progressions. The net coding efficiency at the completion of the final stage is typically comparable to or slightly less than that achieved with the sequential DCT.

To me, this suggests that the high-order successive approximation and/or the specific bands spectrally selected are optional and quite flexible for an encoder, but I'm still new to the spec and there seems to be some leeway in interpretation here. I defer to anyone with more experience with the spec.
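To make the quoted terminology concrete, each scan in a progressive frame carries four progression parameters in its SOS header. A rough model (field comments follow the spec's Ss/Se/Ah/Al symbols; the names are hypothetical and not this crate's internals) might be:

```rust
/// Rough model of the per-scan progression parameters from a SOS header
/// (T.81, B.2.3); purely illustrative, not this crate's representation.
struct ScanProgression {
    spectral_start: u8, // Ss: first coefficient of the band, in zig-zag order
    spectral_end: u8,   // Se: last coefficient of the band (at most 63)
    approx_high: u8,    // Ah: 0 on a coefficient's first scan
    approx_low: u8,     // Al: bit position delivered by this scan
}

fn main() {
    // An encoder might emit an AC scan covering only coefficients 1..=5 and
    // then never follow up with the remaining bands; a forgiving decoder
    // treats the coefficients that were never sent as zero.
    let partial_band = ScanProgression {
        spectral_start: 1,
        spectral_end: 5,
        approx_high: 0,
        approx_low: 0,
    };
    assert!(partial_band.spectral_end <= 63);
}
```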

Additionally, a tangential question -- when submitting changes, if I must make edits to the code (due to review or other reason), is it preferred by the community to amend a single commit and force-push it to my own branch (such that the final merge is a single commit), or add additional commits?

HeroicKatora commented 3 years ago

Hm, interesting. Not that I'm an expert on jpeg, and I did not find the available specification clear on this either. When it allows omission of bands of coefficients, would that include—even though not very sensible—the possibility of only specifying higher bands? In the case of ambiguity in the spec I'd definitely go with recovering from imaginable uncommon inputs, and we don't need a flag.

> Additionally, a tangential question -- when submitting changes, if I must make edits to the code (due to review or other reason), is it preferred by the community to amend a single commit and force-push it to my own branch (such that the final merge is a single commit), or add additional commits?

Depends on the part of the community you refer to. We don't have a policy requiring a single commit in image-rs. Having a history without reverts makes it easier to rebase in case someone else picks up the work, but as long as the commits are tidy and the end result is accepted, that's good as well. Also, rebasing and force-pushing are preferred to merging. In the unlikely case some change absolutely must be a single commit, GitHub offers a 'Squash' option on its own merge button.

quilan1 commented 3 years ago

I suppose one pathological case that may occur is a frame (SOF) with no corresponding scan (SOS). As it stands, this PR would panic in the newly introduced code. The fix is trivial, and I'll be pushing it shortly; I think the correct behavior is to yield the existing 'no data' error.
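A minimal sketch of that guard, using a hypothetical function name and a plain String error in place of the decoder's real error type:

```rust
// Hypothetical guard; the real check belongs in the decoder's frame
// finalization path and would return its own error type.
fn finalize_frame(scans_seen: usize) -> Result<(), String> {
    if scans_seen == 0 {
        // A frame header (SOF) with no scan (SOS) carries no coefficient
        // data at all, so report the existing 'no data' error instead of
        // panicking inside the new completion pass.
        return Err("no data found".to_string());
    }
    Ok(())
}
```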

quilan1 commented 3 years ago

The earlier-mentioned panic has been resolved.

There are now three new test images:

Regarding missing AC/DC coefficients, I wasn't able to find any software that reported an error on a file omitting various progressive passes (aside from the missing-SOS file, but that's a special case). I'd imagine the industry norm is to tolerate this and render what's available.

HeroicKatora commented 3 years ago

Good job catching that :tada: And I do agree with your analysis on handling.

quilan1 commented 3 years ago

Ran the PR through the Wallhaven dataset mentioned in one of the linked issues because, well... a large dataset is a dataset, regardless of content, I suppose. Three errors were encountered in the dataset: