anforowicz commented 2 months ago

Disclaimer

This issue is based on my current, evolving understanding of the requirements that Chromium and Skia architectures impose upon an underlying codec implementation. I am opening this issue early to facilitate a transparent discussion, but before accepting the new APIs we should definitely require that I first finish prototyping the APNG-focused part of the Chromium/Blink/Skia integration (as a proof-of-concept that the new APIs actually work and solve my use case).

Problem 1a: No way to get to next image data

When Reader.next_interlaced_row reaches the end of a frame, then it gets "stuck" - subsequent calls will continue returning None (which correctly indicates that no more rows in the current frame) and there is no way to proceed to the next frame.

Problem 1b: No way to read the next `fcTL` chunk

To successfully decode a frame, one needs to have its FrameControl data, but Info.frame_control won't be updated until the fcTL chunk is encountered and parsed. There is no way to ask Reader to get the next FrameControl metadata.

Problem 2: No way to go back and decode an earlier frame

Skia and Blink abstractions allow asking a codec to (again) decode an earlier frame:

Skia allows that via Options.fFrameIndex parameter of SkCodec::onStartIncrementalDecode (and onGetPixels) which can specify an earlier frame.
- Example implementation can be found in the SkWuffsCodec here
- Note that there is no requirement to go to a later frame - the codec may choose to not report frames for which no metadata has been seen yet. For example, SkWuffsCodec populates its fFrames field incrementally - see here.
Chromium allows that via index parameter of blink::ImageDecoder::DecodeFrameBufferAtIndex and Decode
- Example implementation can be found in the libpng-backed PNGImageReader here

Early discussion

Initially I tried solving problem 1 by publicly exposing Reader.read_until_image_data.

One solution to problem 2 would be to (brainstorming quality, please bear with me):

Add Reader.seek_to_frame(&mut self, frame_index: usize) -> Result<&FrameControl, DecodingError> method that is hidden behind where R: Seek
- As proposed above, this would be only for APNG frames. If IDAT is not part of the animation, then there is no way to seek back to IDAT.
- Implementation would seek to the start of compressed data stream in an IDAT or fdAT chunk and would need to restore the related state in StreamingDecoder.
  - Option 1: restoring to State::ImageData(...) or State::ApngSequenceNumber (this requires remembering and restoring current_chunk - type, crc, remaining
  - Option 2: restoring to State::U32 { kind: U32ValueKind::Length } (no need to remember chunk contents; requires seeking to a slightly earlier offset)
Modify Reader.read_until_image_data to store frame offsets if possible
- I think that we can find a way to store offset = None when R: !Seek (maybe by introducing a private trait that is implemented for Read + !Seek and for Read + Seek)
- In addition to offsets, for each frame we would have to store the data needed to restore decoding state (as described earlier):
  - FrameControl
  - fdAT / fcTL sequence number (can add a crate-internal accessor to StreamingDecoder? or maybe we Decoded::ChunkBegin should be replaced with Decoded::DataChunkBegin(opaque_state_to_restore)?)

I think that if we have seek_to_frame API, then we would still need to make read_until_image_data public (to support moving to the next fcTL when the input stream doesn't implement Seek).

I am not sure if there are other, better ways to solve problem 2.

I note that libpng and StreamingDecoder use a push model (slices with additional data is pushed into their APIs) and Reader uses a pull model (calling Reader APIs triggers Read.read from the underlying data stream). In a push model the client is responsible for managing seeking/offsets. But a push model may still have a state that needs to be restored (I am not sure how exactly that works in libpng).

anforowicz commented 2 months ago

Let me also be transparent that I am not yet 100% sure about the motivation here.

The immediate motivation is that the AnimatedPNGTests.ParseAndDecodeByteByByte test requires seeking to an earlier frame (at least as currently written), because it calls DecodeFrameBufferAtIndex in two separate loops: here and here. But fixing this test doesn't necessarily require seeking to an arbitrary earlier frame:

Restarting from the beginning of the PNG file is sufficient to fix the test. For example, one way forward may be to have SkPngRustCodec::startDecoding handle 1) decoding the current/latest frame, or 2) restarting for options.fFrameIndex == 0, (rather than supporting arbitrary frame indices).
Alternatively, the test itself could be rewritten to avoid the 2nd loop. (During the first loop the image is decoded byte-by-byte, testing the codec's ability to resume after UnexpectedEof - this seems what this test helper wants to focus on. Re-decoding and checking image hashes in the 2nd loop tests a slightly different thing.)

I am working with the Skia team to better understand the motivation for supporting seeking to an earlier frame. One hypothesis is that this is needed when decoding the current frame depends on first restoring one of previous frames (e.g. because one or more earlier frames use DisposeOp.Previous). Discarding previous frames seems to be handled by SkiaImageDecoderBase::CanReusePreviousFrameBuffer, but maybe previous frames can still be discarded in some scenarios (just guessing: maybe when working under memory pressure?).

fintelia commented 1 month ago

Seeking back to the first frame makes sense to me, but I'd be curious what use seeking to a different frame would be? I'd expect that you wouldn't be able to reconstruct it without potentially needing to know all previous frames.

Another question I have looking at the current API is that we don't seem to distinguish between the static image and the first frame of the animation, which may or may not be the same. Might be worth separating them like image-webp does.

As far as API changes, it might be worth investigating whether to unconditionally require Seek in the next major release. That would also enable the decoder to save the locations of metadata chunks and decode them lazily when requested, rather than always saving them without knowing whether they'll be needed

HeroicKatora commented 1 month ago

The comparison to image-webp and tiff is interesting, there's lots of overlap in the required seek functionality. In particular, I can well see the use-case in building a list of all fcTL chunks which includes the necessary information to compute any presentation dependency (i.e. walk backwards until a full dispose, or opaque blending with the full area) . With that graph an efficient lookup and rebuild for individual frames is also possible, similar to keyframes. I think all of this can algorithmically and in types live separate from the encoder itself. That introduces a slight imprecision where the depencency graph of an image might be used with a mismatched decoder, but I think that's more than acceptable instead of introducing a lot of complexity within the decoder.

As an aside: with regards to Seek as bound, these exact issues are good reasons to investigate Seek as a runtime property instead a compile time. You're now aware of the problem space, please consider this solution prototype. Significant unsafe and an optional unstable attribute, so escalate these internally to wherever appropriate.

anforowicz commented 1 month ago

For now I've implemented seeking to an earlier frames by 1) rewinding the input stream to the beginning and recreating png::Reader and 2) reading/moving frames one-by-one until the desired frame. This lets me make further progress on passing additional Chromium/Blink tests, and set the efficient-seeking problem as something that we may reconsider later - for now I've opened https://crbug.com/371060427 to track this on my side.

we don't seem to distinguish between the static image and the first frame of the animation, which may or may not be the same.

Yes. FWIW this is not blocking - the client can recognize the situation (animation_control present, but frame_control missing in the image Info) and handle it as needed (skipping the first frame somehow - either via next_frame, or by a new API that wraps read_until_image_data)

Seeking back to the first frame makes sense to me, but I'd be curious what use seeking to a different frame would be? I'd expect that you wouldn't be able to reconstruct it without potentially needing to know all previous frames.

I can well see the use-case in building a list of all fcTL chunks which includes the necessary information to compute any presentation dependency (i.e. walk backwards until a full dispose, or opaque blending with the full area) . With that graph an efficient lookup and rebuild for individual frames is also possible, similar to keyframes. I think all of this can algorithmically and in types live separate from the encoder itself.

Right - see how:

SkiaImageDecoderBase::Decode calculates which previous frames need to be decoded
SkFrameHolder::setAlphaAndRequiredFrame calculates whether some previous frame is needed (based on dispose and blend modes) to decode a given frame (the "requires" relationship is for a full image buffer, not just for the previous frame's subject).

(Note that the code above is codec-independent.)

But, I am still not sure in what scenario a previous/required animation frame would be unavailable. If required frames may be discarded (out-of-viewport? low-on-memory?) then maybe it should be okay to restart the animation from the first frame? I dunno...

anforowicz commented 3 weeks ago

I think implementing seek-to-frame-N as rewind-and-skip-N-1-frames can be made a bit faster if we won't try to decompress the image data as we skip over the frames. FWW I've prototyped this at https://github.com/image-rs/image-png/compare/a268b031276c0df43452a852ad97e6b63e4dfbc1...anforowicz:image-png:image-data-sink-fn.

kornelski commented 3 weeks ago

@anforowicz I've implemented a similar compression skipping for GIF. Maybe use similar approach in PNG if possible?

https://github.com/image-rs/image-gif/blob/51ea9b0bf334b918bd24751812ea49c3bf1152f8/src/reader/decoder.rs#L755

anforowicz commented 3 weeks ago

Thanks for the pointer @kornelski. It's interesting how gif follows a similar StreamingDecoder / Decoder pattern - I didn't realize this before. IIUC gif::StreamingDecoder doesn't provide a way to mutate its skip_frame_decoding field, but in the png case we could support crate-internal mutation (via, say StreamingDecoder.set_skip_frame_decoding(new_value: bool)), and then ReadDecoder::finish_decoding_image_data could call self.decoder.set_skip_frame_decoding(true) (and ReadDecoder::read_until_image_data could set skipping to false; I think other methods wouldn't touch this flag/option, including ReadDecoder::decode_image_data). This is indeed a bit simpler: https://github.com/image-rs/image-png/compare/6016c9bee0fbd6244ebe49ed19a25094898c8001...anforowicz:image-png:skip-frame-decoding-flag-in-streaming-decoder. I'll see if I can polish it up and submit as a PR next week.

image-rs / image-png

Need public API to seek to another APNG frame #510

Disclaimer

Problem 1a: No way to get to next image data

Problem 1b: No way to read the next `fcTL` chunk

Problem 2: No way to go back and decode an earlier frame

Early discussion

image-rs / image-png

Need public API to seek to another APNG frame #510

Disclaimer

Problem 1a: No way to get to next image data

Problem 1b: No way to read the next fcTL chunk

Problem 2: No way to go back and decode an earlier frame

Early discussion

Problem 1b: No way to read the next `fcTL` chunk