Open MarByteBeep opened 9 months ago
You're passing a windowBits
parameter value of -15.
From the zlib documentation at https://www.zlib.net/manual.html:
The windowBits parameter is the base two logarithm of the window size (the size of the history buffer). It should be in the range 8..15 for this version of the library. Larger values of this parameter result in better compression at the expense of memory usage. The default value is 15 if deflateInit is used instead.
Internally, a negative value of windowBits
is used to switch between gzip and zlib compression modes. We should catch negative values in opts.windowBits
and throw an error, but that's about all we can do.
That then creates a new problem: my original binary is compressed and starts with a78 DA
header, which is a perfectly valid and common zlib header. I need to be able to recompress said data and end up with that header again. The only setting that did that was windowBits -15. How am I able to recreate the original if you simply prevent that value from being passed?
Also, any idea why inflateSync
fails to decompress a zlib binary with said header, while gunzipSync
works? That feels a bit counterintuitive to me 😊
Internally, a negative value of
windowBits
is used to switch between gzip and zlib compression modes. We should catch negative values inopts.windowBits
and throw an error, but that's about all we can do.
I was entirely and completely wrong. My apologies.
The actual problem is that inflateSync calls zlib like this:
var reader = zlib.ZlibReaderArrayList.initWithOptions(compressed, &list, allocator, .{
.windowBits = -15,
}) catch |err| {
and you'd like to call it with .windowBits = 15
. So this almost looks like my favorite kind of bugfix, those that involve removing or changing a single character.
But it's not: the existing code is right for the "raw deflate" format that zlib produces. Per the guidance of the zlib manual, we should default to expecting a zlib header, and assume it is absent only when explicitly told so. But right now, there's no option for that. We could simply check whether there's a valid zlib header, but that would still have many false positives.
So, technically, everything is as it should be, except for the function names. inflateSync
should be inflateRawDeflatedDataSync
, and used very rarely. Inflating zlib data and decompressing gzip files can safely share a function, since in those very rare cases where you want to accept only one of those formats, it's an acceptable burden to just check for the gzip magic number yourself.
Of course there's no way to guess that without improved API documentation.
Again, I'm sorry I was wrong. In summary, there are three compressed formats only two of which are known to be disjoint. Both (may) overlap with the third.
ive found that this affects bun.report's ability to decompress some panic messages, though the node:zlib
polyfill does not have this issue (but is slower).
in addition to this issue, since Bun 1.1.21, more payloads dont seem to be decompressable:
expect(Bun.gunzipSync(Buffer.from('eNrLzCvLz05NUUguSizOcKoMSMzLTNbQVMhIzEvJSS0CAK/LCxc', 'base64url')).toString('utf8')).toBe('invoked crashByPanic() handler');
above:
What version of Bun is running?
1.0.26
What platform is your computer?
Microsoft Windows NT 10.0.22631.0 x64
What steps can reproduce the bug?
What is the expected behavior?
A binary compressed with
deflateSync
should always be decompressible withinflateSync
. When using the{ level: 9, windowBits: -15 }
settings, it is not.However it IS decompressible with
gunzipSync
.So we end up with the weird situation in which you have to use
gunzipSync
to decompress anddeflateSync
to compress.What do you see instead?
decompressed
is error: invalid stored block lengthsAdditional information
As per discord discussion, might be related to https://github.com/oven-sh/bun/pull/8529
Also I ran into: https://github.com/oven-sh/bun/issues/6401 as well, the docs seem to be incorrect.