anforowicz opened 2 months ago
Let me also be transparent that I am not yet 100% sure about the motivation here.
The immediate motivation is that the AnimatedPNGTests.ParseAndDecodeByteByByte test requires seeking to an earlier frame (at least as currently written), because it calls DecodeFrameBufferAtIndex in two separate loops: here and here. But fixing this test doesn't necessarily require seeking to an arbitrary earlier frame: SkPngRustCodec::startDecoding could instead handle only 1) decoding the current/latest frame, or 2) restarting for options.fFrameIndex == 0 (rather than supporting arbitrary frame indices). (Byte-by-byte parsing - and recovering from UnexpectedEof - seems to be what this test helper wants to focus on. Re-decoding and checking image hashes in the 2nd loop tests a slightly different thing.)

I am working with the Skia team to better understand the motivation for supporting seeking to an earlier frame. One hypothesis is that this is needed when decoding the current frame depends on first restoring one of the previous frames (e.g. because one or more earlier frames use DisposeOp.Previous). Discarding previous frames seems to be handled by SkiaImageDecoderBase::CanReusePreviousFrameBuffer, but maybe previous frames can still be discarded in some scenarios (just guessing: maybe when working under memory pressure?).
Seeking back to the first frame makes sense to me, but I'd be curious what use seeking to a different frame would be? I'd expect that you wouldn't be able to reconstruct it without potentially needing to know all previous frames.
Another question I have, looking at the current API, is that we don't seem to distinguish between the static image and the first frame of the animation, which may or may not be the same. It might be worth separating them like image-webp does.
As far as API changes go, it might be worth investigating whether to unconditionally require Seek in the next major release. That would also enable the decoder to save the locations of metadata chunks and decode them lazily when requested, rather than always saving them without knowing whether they'll be needed.
The comparison to image-webp and tiff is interesting; there's lots of overlap in the required seek functionality. In particular, I can well see the use case in building a list of all fcTL chunks which includes the necessary information to compute any presentation dependency (i.e. walk backwards until a full dispose, or opaque blending with the full area). With that graph, an efficient lookup and rebuild for individual frames is also possible, similar to keyframes. I think all of this can, algorithmically and in types, live separate from the decoder itself. That introduces a slight imprecision, where the dependency graph of an image might be used with a mismatched decoder, but I think that's more than acceptable instead of introducing a lot of complexity within the decoder.
As an aside: with regards to Seek as a bound, these exact issues are good reasons to investigate Seek as a runtime property instead of a compile-time one. Now that you're aware of the problem space, please consider prototyping that solution. It would likely involve significant unsafe and an optional unstable attribute, so escalate this internally wherever appropriate.
For now I've implemented seeking to an earlier frame by 1) rewinding the input stream to the beginning and recreating png::Reader, and 2) reading/moving through frames one-by-one until the desired frame. This lets me make further progress on passing additional Chromium/Blink tests, and sets the efficient-seeking problem aside as something that we may reconsider later - for now I've opened https://crbug.com/371060427 to track this on my side.
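The rewind-and-replay strategy can be sketched with a toy decoder standing in for png::Reader; everything below (ToyDecoder and its methods) is a made-up illustration, not real crate API:

```rust
// Toy stand-in for a forward-only frame decoder.
struct ToyDecoder {
    total_frames: usize,
    next_frame: usize, // index of the frame the next read would produce
}

impl ToyDecoder {
    fn new(total_frames: usize) -> Self {
        ToyDecoder { total_frames, next_frame: 0 }
    }

    /// Models "rewind the input stream to the beginning and recreate png::Reader".
    fn rewind(&mut self) {
        self.next_frame = 0;
    }

    /// Models reading past one frame without otherwise using it.
    fn skip_frame(&mut self) {
        assert!(self.next_frame < self.total_frames);
        self.next_frame += 1;
    }

    /// Seeking to an earlier frame = rewind, then skip forward one-by-one.
    fn seek_to_frame(&mut self, index: usize) {
        if index < self.next_frame {
            self.rewind();
        }
        while self.next_frame < index {
            self.skip_frame();
        }
    }
}
```

This is O(n) per backwards seek, which is exactly why skipping decompression while replaying (discussed further down the thread) is attractive.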
we don't seem to distinguish between the static image and the first frame of the animation, which may or may not be the same.
Yes. FWIW this is not blocking - the client can recognize the situation (animation_control present, but frame_control missing in the image Info) and handle it as needed (skipping the first frame somehow - either via next_frame, or via a new API that wraps read_until_image_data).
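A minimal sketch of the detection described above, modeling just the two relevant fields (the real information lives in png::Info; InfoModel and its fields are simplified stand-ins):

```rust
// Simplified stand-in for the relevant parts of png::Info.
struct InfoModel {
    animation_control: Option<u32>, // Some(..) if an acTL chunk was parsed
    frame_control: Option<u32>,     // Some(..) if an fcTL preceded the IDAT
}

/// IDAT is *not* part of the animation when acTL is present but no fcTL
/// appeared before the image data; such a static image should be skipped
/// when playing the animation.
fn idat_outside_animation(info: &InfoModel) -> bool {
    info.animation_control.is_some() && info.frame_control.is_none()
}
```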
Seeking back to the first frame makes sense to me, but I'd be curious what use seeking to a different frame would be? I'd expect that you wouldn't be able to reconstruct it without potentially needing to know all previous frames.
I can well see the use case in building a list of all fcTL chunks which includes the necessary information to compute any presentation dependency (i.e. walk backwards until a full dispose, or opaque blending with the full area). With that graph, an efficient lookup and rebuild for individual frames is also possible, similar to keyframes. I think all of this can, algorithmically and in types, live separate from the decoder itself.
Right - see how:
- SkiaImageDecoderBase::Decode calculates which previous frames need to be decoded
- SkFrameHolder::setAlphaAndRequiredFrame calculates whether some previous frame is needed (based on dispose and blend modes) to decode a given frame (the "requires" relationship is for a full image buffer, not just for the previous frame's subject)

(Note that the code above is codec-independent.)
But, I am still not sure in what scenario a previous/required animation frame would be unavailable. If required frames may be discarded (out-of-viewport? low-on-memory?) then maybe it should be okay to restart the animation from the first frame? I dunno...
I think implementing seek-to-frame-N as rewind-and-skip-N-1-frames can be made a bit faster if we don't try to decompress the image data as we skip over the frames. FWIW I've prototyped this at https://github.com/image-rs/image-png/compare/a268b031276c0df43452a852ad97e6b63e4dfbc1...anforowicz:image-png:image-data-sink-fn.
@anforowicz I've implemented similar compression skipping for GIF. Maybe a similar approach could be used in PNG, if possible?
Thanks for the pointer @kornelski. It's interesting how gif follows a similar StreamingDecoder / Decoder pattern - I didn't realize this before. IIUC gif::StreamingDecoder doesn't provide a way to mutate its skip_frame_decoding field, but in the png case we could support crate-internal mutation (via, say, StreamingDecoder.set_skip_frame_decoding(new_value: bool)), and then ReadDecoder::finish_decoding_image_data could call self.decoder.set_skip_frame_decoding(true) (and ReadDecoder::read_until_image_data could set skipping back to false; I think other methods wouldn't touch this flag/option, including ReadDecoder::decode_image_data). This is indeed a bit simpler: https://github.com/image-rs/image-png/compare/6016c9bee0fbd6244ebe49ed19a25094898c8001...anforowicz:image-png:skip-frame-decoding-flag-in-streaming-decoder. I'll see if I can polish it up and submit it as a PR next week.
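A toy model of the flag being proposed above (the method name follows the comment, but ToyStreamingDecoder and its fields are simplified stand-ins, not the actual png-crate StreamingDecoder):

```rust
// Toy stand-in: when the flag is set, image-data bytes are counted and
// discarded instead of being fed to the (here: pretend) decompressor.
struct ToyStreamingDecoder {
    skip_frame_decoding: bool,
    decoded_bytes: usize,
    skipped_bytes: usize,
}

impl ToyStreamingDecoder {
    fn new() -> Self {
        ToyStreamingDecoder { skip_frame_decoding: false, decoded_bytes: 0, skipped_bytes: 0 }
    }

    /// Crate-internal mutation point, as suggested above.
    fn set_skip_frame_decoding(&mut self, new_value: bool) {
        self.skip_frame_decoding = new_value;
    }

    /// Models consuming one slice of IDAT/fdAT payload.
    fn update(&mut self, image_data: &[u8]) {
        if self.skip_frame_decoding {
            self.skipped_bytes += image_data.len(); // no inflate work done
        } else {
            self.decoded_bytes += image_data.len(); // stands in for decompression
        }
    }
}
```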
Disclaimer
This issue is based on my current, evolving understanding of the requirements that Chromium and Skia architectures impose upon an underlying codec implementation. I am opening this issue early to facilitate a transparent discussion, but before accepting the new APIs we should definitely require that I first finish prototyping the APNG-focused part of the Chromium/Blink/Skia integration (as a proof-of-concept that the new APIs actually work and solve my use case).
Problem 1a: No way to get to next image data
When Reader.next_interlaced_row reaches the end of a frame, it gets "stuck" - subsequent calls will continue returning None (which correctly indicates that there are no more rows in the current frame) and there is no way to proceed to the next frame.

Problem 1b: No way to read the next fcTL chunk

To successfully decode a frame, one needs to have its FrameControl data, but Info.frame_control won't be updated until the fcTL chunk is encountered and parsed. There is no way to ask Reader to get the next FrameControl metadata.

Problem 2: No way to go back and decode an earlier frame
Skia and Blink abstractions allow asking a codec to (again) decode an earlier frame:
- The Options.fFrameIndex parameter of SkCodec::onStartIncrementalDecode (and onGetPixels) can specify an earlier frame. See how SkWuffsCodec handles this here. (SkWuffsCodec populates its fFrames field incrementally - see here.)
- The index parameter of blink::ImageDecoder::DecodeFrameBufferAtIndex and Decode.
- The libpng-backed PNGImageReader handles this here.

Early discussion
Initially I tried solving problem 1 by publicly exposing Reader.read_until_image_data.

One solution to problem 2 would be to (brainstorming quality, please bear with me):
- Add a Reader.seek_to_frame(&mut self, frame_index: usize) -> Result<&FrameControl, DecodingError> method that is hidden behind where R: Seek.
  - If IDAT is not part of the animation, then there is no way to seek back to IDAT.
  - Seeking would go back to an earlier IDAT or fdAT chunk and would need to restore the related state in StreamingDecoder - either:
    - State::ImageData(...) or State::ApngSequenceNumber (this requires remembering and restoring the current_chunk state - type, crc, remaining), or
    - State::U32 { kind: U32ValueKind::Length } (no need to remember chunk contents; requires seeking to a slightly earlier offset).
- Change Reader.read_until_image_data to store frame offsets if possible - offset = None when R: !Seek (maybe by introducing a private trait that is implemented for Read + !Seek and for Read + Seek).
- In addition to FrameControl, remember the fdAT/fcTL sequence number (can we add a crate-internal accessor to StreamingDecoder? or maybe Decoded::ChunkBegin should be replaced with Decoded::DataChunkBegin(opaque_state_to_restore)?).

I think that if we have a seek_to_frame API, then we would still need to make read_until_image_data public (to support moving to the next fcTL when the input stream doesn't implement Seek).

I am not sure if there are other, better ways to solve problem 2.
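The shape of the brainstormed API - record frame offsets on the way forward, gate seeking on R: Seek - can be sketched with illustrative types (ToyReader and its fields are made up; only the std::io traits are real):

```rust
use std::io::{Read, Seek, SeekFrom};

// Illustrative stand-in for a reader that remembers where each frame started.
struct ToyReader<R> {
    input: R,
    frame_offsets: Vec<u64>, // byte offset of each frame's fcTL/IDAT chunk
}

impl<R: Read> ToyReader<R> {
    /// Called while moving forward (e.g. from a read_until_image_data-style
    /// helper) to remember where a frame begins.
    fn record_frame_offset(&mut self, offset: u64) {
        self.frame_offsets.push(offset);
    }
}

// seek_to_frame only exists when the input additionally implements Seek,
// mirroring the proposed `where R: Seek` gating.
impl<R: Read + Seek> ToyReader<R> {
    fn seek_to_frame(&mut self, frame_index: usize) -> std::io::Result<u64> {
        let offset = self.frame_offsets[frame_index];
        // After this, the StreamingDecoder-side state would still need to be
        // restored (chunk type / crc / remaining, per the bullets above).
        self.input.seek(SeekFrom::Start(offset))
    }
}
```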
libpng and StreamingDecoder use a push model (slices with additional data are pushed into their APIs) and Reader uses a pull model (calling Reader APIs triggers Read.read from the underlying data stream). In a push model the client is responsible for managing seeking/offsets. But a push model may still have state that needs to be restored (I am not sure how exactly that works in libpng).
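The push/pull distinction can be illustrated with two tiny stand-in decoders (both types and their methods are hypothetical; only std::io::Read is real):

```rust
use std::io::Read;

// Push model: the *caller* supplies bytes (and therefore controls offsets
// and seeking); the decoder just reports how many bytes it consumed.
struct PushDecoder {
    consumed: usize,
}

impl PushDecoder {
    fn update(&mut self, data: &[u8]) -> usize {
        self.consumed += data.len();
        data.len()
    }
}

// Pull model: the *decoder* owns the source and triggers Read::read itself,
// so seeking would have to happen inside the decoder.
struct PullDecoder<R: Read> {
    input: R,
}

impl<R: Read> PullDecoder<R> {
    fn next_bytes(&mut self, buf: &mut [u8]) -> std::io::Result<usize> {
        self.input.read(buf)
    }
}
```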