libjxl / libjxl

JPEG XL image format reference implementation
BSD 3-Clause "New" or "Revised" License
2.73k stars 263 forks source link

JxlDecoderReleaseJPEGBuffer always reports none of the buffer has been written to even when it has #2689

Open 190n opened 1 year ago

190n commented 1 year ago

Describe the bug JxlDecoderReleaseJPEGBuffer is supposed to return the number of bytes of the output buffer that have not yet been written to. In my use, it actually always returns the full size of the buffer, indicating that none of the buffer was written to, even though the buffer does actually contain a partial JPEG file (I verified with hashes that the full buffer I had allocated thus far matched that many bytes of the original JPEG).

To Reproduce

Expected behavior Either the buffer contains garbage and JxlDecoderReleaseJPEGBuffer returns the size of the buffer as it does currently (if it needs more room before it can do any work, or something like that), or some of the buffer has been written to (in practice it seems that all of it is) and JxlDecoderReleaseJPEGBuffer accurately reports how much.

Screenshots

I attached the C++ code I've been using to test this (GitHub wouldn't let me upload it with the right extension). It's heavily modified from the decode_oneshot example. You can see that libjxl reports it has not written anything (line 84 is the printf)...

$ clang++ -o decode_oneshot decode_oneshot.cc $(pkg-config --libs --cflags libjxl)
$ ./decode_oneshot.cc botw.jxl /dev/null /dev/null
[...]
4096 bytes were not written. 4096-byte buffer now holds 0 bytes of jpeg data
first few bytes of buffer: ffd8ffe14b894578
status 6
8192 bytes were not written. 8192-byte buffer now holds 0 bytes of jpeg data
first few bytes of buffer: ffd8ffe14b894578
status 6
16384 bytes were not written. 16384-byte buffer now holds 0 bytes of jpeg data
first few bytes of buffer: ffd8ffe14b894578

...but what is actually in the buffer corresponds to the JPEG file (and I verified this as far as 131072 bytes on this particular file, as the next time it resizes the buffer it actually fits the whole output):

$ djxl botw.jxl botw.jpg
$ hexdump -C botw.jpg | head -n1
00000000  ff d8 ff e1 4b 89 45 78  69 66 00 00 4d 4d 00 2a  |....K.Exif..MM.*|

decode_oneshot.txt botw jxl

Environment

Additional context

jonsneyers commented 1 year ago

The way this is implemented at the moment, is quite naive: it just starts writing until it runs out of output buffer or until it is done. If it is done, it says it is done, otherwise it says it has written nothing, and it will restart from scratch when you give it a larger buffer.

Ideally, we should make a more streaming/stateful version of this, that can actually write partial output and then resume writing.

It's not super critical imo though, since it's not that hard to guess how large the jpeg will be: typical recompression saves around 20%, so if you take a buffer that is 25% larger than the input jxl file (or say 50% larger just to be sure), it will very likely be large enough.

190n commented 1 year ago

Yes, I do already estimate an initial buffer size which usually ends up big enough. That said, it would still be helpful to decode in chunks because then I don't have to wait to decode the entire JXL file before returning it to the application; I can give it pieces of the JPEG incrementally (this is for a FUSE filesystem which converts JXL to JPEG on-the-fly).