Closed flippmoke closed 7 years ago
cc/ @sam-github
This seems to only be compression, not decompression.
zlib.gunzip(compressed, function(err, buffer) {
  if (buffer.toString('hex') === uncompressed.toString('hex')) {
    console.log("Decompression results are the same");
  } else {
    console.log("Decompression results are different");
  }
});
v4.8.2 on macOS: Decompression results are the same
Does zlib/gzip make any guarantees about always producing the same compressed output? It shouldn't be required as long as the compressed results can be decompressed by a compatible implementation to produce the same input.
I think this may be an upstream change.
In the new version of zlib, if __APPLE__ is defined (which it seems to be on our macOS builds) then OS_CODE is 19 instead of 3. Since this value is included in the gzip header, it would slightly change the resulting compression output.
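To see where that value sits, here is a minimal sketch that gzips a small buffer with Node's zlib and reads the fixed 10-byte gzip header described in RFC 1952. The input string and variable names are made up for illustration; the OS byte at offset 9 is the one byte that would differ between builds.

```javascript
const zlib = require('zlib');

// Gzip a small, arbitrary input (illustrative only).
const input = Buffer.from('hello world');
const gz = zlib.gzipSync(input);

// Fixed gzip header layout (RFC 1952):
// ID1 ID2 CM FLG MTIME(4 bytes) XFL OS
const magicOk = gz[0] === 0x1f && gz[1] === 0x8b; // ID1, ID2
const method = gz[2];                             // CM, 8 means deflate
const osByte = gz[9];                             // OS field, set from OS_CODE at build time

console.log('gzip magic ok:', magicOk);
console.log('compression method:', method);
console.log('OS byte:', osByte); // value depends on platform and zlib version
```

Running this on two Node versions on the same machine and comparing the OS byte is a quick way to check whether a differing output is caused only by the header.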
Commented: https://github.com/nodejs/node/pull/10980/files#r110047791
That's my current theory, at least.
/cc @nodejs/lts
The change to the "version made of" header value was made in https://github.com/madler/zlib/commit/ce12c5cd00628bf8f680c98123a369974d32df15 in order to follow the spec at https://pkware.cachefly.net/webdocs/casestudies/APPNOTE.TXT.
I don't think this is an issue: a compressed binary is platform dependent, since it was already different between Linux and Windows, and the change does not affect compress/decompress functionality.
@MylesBorins I'm not sure it would be worth the effort to float a patch for this on an lts branch.
@flippmoke is this causing an actual problem? Or is it just a change in behaviour that wasn't expected?
@rmg It is not an actual issue with our code base; we were just using node.js zlib to compare against our C++ implementation inside a node application (https://github.com/mapnik/node-mapnik) for regular testing. We have been building off the system zlib 1.2.8 on OSX, so suddenly the results were not the same. I would have just figured it was a slight difference due to a new zlib version, but the fact that Linux results were still the same made me wonder if it was a bug.
We could change our testing techniques and perhaps ignore the header of the gzip compressed binary when we compare the results.
Or perhaps don't compare the zipped data, but unzip it and compare that to an expected value? That would be robust against implementation.
@sam-github we could definitely do that, but part of the testing was to compare that all of our options that could be passed in for different configurations of zlib were all going in and creating reasonable results as well, so it really is nice to compare directly to another compressed result. I think if I can properly determine where the header of the zlib compression ends that I should be able to only test against the results after that point and achieve the same results.
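One way to find where the header ends is to parse the optional fields that RFC 1952 defines for gzip. The sketch below is only an illustration: gzipHeaderLength and sameGzipBody are hypothetical helper names, not part of Node or zlib, and the differing OS byte is simulated by patching a copy of the buffer.

```javascript
const zlib = require('zlib');

// Hypothetical helper: offset at which the deflate data starts in a gzip member.
function gzipHeaderLength(buf) {
  if (buf.length < 10 || buf[0] !== 0x1f || buf[1] !== 0x8b) {
    throw new Error('not a gzip stream');
  }
  const flg = buf[3];
  let pos = 10; // fixed part: ID1 ID2 CM FLG MTIME(4) XFL OS

  if (flg & 0x04) { // FEXTRA: 2-byte little-endian length, then that many bytes
    pos += 2 + buf.readUInt16LE(pos);
  }
  for (const bit of [0x08, 0x10]) { // FNAME, then FCOMMENT: zero-terminated strings
    if (flg & bit) {
      const end = buf.indexOf(0, pos);
      if (end === -1) throw new Error('truncated gzip header');
      pos = end + 1;
    }
  }
  if (flg & 0x02) pos += 2; // FHCRC: 2-byte header checksum
  return pos;
}

// Hypothetical helper: compare everything after the header (deflate data plus trailer).
function sameGzipBody(a, b) {
  return a.slice(gzipHeaderLength(a)).equals(b.slice(gzipHeaderLength(b)));
}

const a = zlib.gzipSync(Buffer.from('hello world'));
const b = Buffer.from(a);       // copy of the same output
b[9] = a[9] === 3 ? 19 : 3;     // simulate a different OS byte

console.log(a.equals(b));        // false
console.log(sameGzipBody(a, b)); // true
```

This keeps the comparison sensitive to compression options such as level or strategy, since those change the deflate data, while ignoring header-only differences.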
I just saw another instance of this causing an inexplicable test failure and the only reason I knew the cause was because I had dug in to this issue, and even that was only possible because @flippmoke had already done the hard work of identifying that it was a gzip header change.
Spelling it out, I now realize how unreasonable it would be to expect the average node user to figure out what happened.
@nodejs/lts I take back my previous stance. This part of the zlib upgrade should be reverted as part of an LTS patch release.
@rmg could you please send a PR with the revert to both 4.x and 6.x?
@MylesBorins I am looking at this now. I'll PR a test and revert of the header change.
Given that #12404 was closed without merging, should this remain open? I strongly suspect 4.x is going to go EOL before this is fixed, but someone is welcome to prove me wrong by fixing it. :-D
should close, the PR was rejected.
It appears that after #10980, zlib produces different results on OSX during gzip compression. This can be reproduced by the small test below, where there is a single byte difference in the zlib results. The same test does not produce different than expected results on Linux.
v4.8.1 on OS-X: Results are the same
v4.8.1 on Linux: Results are the same
v4.8.2 on OS-X: Results are different
v4.8.2 on Linux: Results are the same
I believe this also affects the v6 latest release, but have not tested it yet.