weidai11 / cryptopp

free C++ class library of cryptographic schemes
https://cryptopp.com

Loss of data after inflation using gunzip #1245

Open lmille77 opened 8 months ago

lmille77 commented 8 months ago

I am running Crypto++ (master branch) in an iOS (16.6) app built with Xcode 14.3.1. I am trying to isolate an issue where Crypto++'s Gunzip does not correctly inflate a properly formatted JSON file. The input is an already compressed JSON file whose original size is 1.3 MB. After inflation, I end up with only 1.01 MB of data, and I cannot work out why data is being lost. The file was compressed with a multi-threaded compressor that adheres to the gzip format. I can successfully inflate this same compressed file using gzip on macOS, Linux, and Windows, but not on iOS using Gunzip from the Crypto++ framework.

The snippet of code doing the inflation is:

std::string GZipUtils::decompressData(const std::vector<uint8_t> &bytes) {
    std::string decompressed;

    // Gunzip takes ownership of the StringSink and deletes it on destruction.
    CryptoPP::Gunzip unzipper(new CryptoPP::StringSink(decompressed));
    unzipper.Put(reinterpret_cast<const CryptoPP::byte *>(bytes.data()), bytes.size());
    unzipper.MessageEnd();

    return decompressed;
}

I would like to note that no errors are thrown and that Put returns 0, which indicates all bytes were processed. We were originally using Put2, which produces the same result as Put. It appears that Put calls Put2 internally.
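One way to narrow this down, offered as a sketch rather than a confirmed diagnosis: RFC 1952 says each gzip member ends with an 8-byte trailer whose last four bytes (ISIZE) hold the uncompressed size of that member modulo 2^32, stored little-endian. Reading that field from the compressed buffer and comparing it with the size of the inflated output shows whether the trailer agrees with what was actually produced. The helper name `gzipTrailerSize` below is hypothetical and is not part of Crypto++. It uses only the standard library.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not a Crypto++ API): reads the ISIZE field from the
// last four bytes of a gzip buffer. Per RFC 1952, ISIZE is the uncompressed
// size of the final member modulo 2^32, stored little-endian.
// Returns false if the buffer is too short or lacks the gzip magic bytes.
bool gzipTrailerSize(const std::vector<std::uint8_t>& bytes, std::uint32_t& isize) {
    // A member has at least a 10-byte header and an 8-byte trailer.
    if (bytes.size() < 18 || bytes[0] != 0x1f || bytes[1] != 0x8b) {
        return false;
    }
    const std::size_t n = bytes.size();
    isize = static_cast<std::uint32_t>(bytes[n - 4])
          | (static_cast<std::uint32_t>(bytes[n - 3]) << 8)
          | (static_cast<std::uint32_t>(bytes[n - 2]) << 16)
          | (static_cast<std::uint32_t>(bytes[n - 1]) << 24);
    return true;
}
```

If the reported value is far below the expected 1.3 MB, that may mean the file holds more than one gzip member, because ISIZE only describes the last one. If it matches the expected size but the output is shorter, the discrepancy lies elsewhere.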

I have attached the compressed and uncompressed JSON files.

uncompressed.json compressed.json.gz

noloader commented 8 months ago

Thanks @lmille77.

This makes me cringe. Can you perform a quick test, please?

git clone https://github.com/weidai11/cryptopp cryptopp-5.6.2
cd cryptopp-5.6.2
git checkout CRYPTOPP_5_6_2_clean

And then check the result.

The significance of 5.6.2 is that it is the last version Wei Dai worked on. We touched the Zip/Unzip classes after Wei turned the project over to the community, and I want to make sure we did not break them. The "clean" part means the library mostly compiles cleanly under modern Clang and GCC.

lmille77 commented 8 months ago

@noloader,

I pulled down that branch. When I built master for iOS previously, I set IOS_SDK and IOS_CPU, ran setenv-ios.sh from TestScripts, and then built with GNUmakefile-cross. I noticed these files aren't included in the CRYPTOPP_5_6_2_clean branch. I tried configuring that branch using just the GNUmakefile. However, Xcode complained that the library was built for macOS, not iOS, when I went to run the application.

I want to spend some time walking through the source code to see if I can identify anything that may be causing this behavior. I will report back if I find anything.

lmille77 commented 8 months ago

I have run more tests against the master branch. When I used pigz as the compressor, Gunzip was able to inflate the data successfully.
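Since the pigz output inflates and the original file does not, one difference worth checking is whether the failing file contains more than one gzip member. RFC 1952 allows a gzip file to be a series of members, and I have not verified whether that is the case for either file here. The sketch below scans a buffer for the byte sequence 1f 8b 08 (gzip magic plus the deflate method byte). The function name `findGzipMagicOffsets` is hypothetical, and the scan is only a heuristic: the same three bytes can occur by chance inside compressed data, so a hit is a candidate member boundary and not proof of one.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not a Crypto++ API): returns every offset at which
// the bytes 1f 8b 08 appear. Offset 0 is the normal gzip header. Additional
// offsets are only *candidate* starts of further members, since the pattern
// can also occur by chance within the compressed stream.
std::vector<std::size_t> findGzipMagicOffsets(const std::vector<std::uint8_t>& bytes) {
    std::vector<std::size_t> offsets;
    for (std::size_t i = 0; i + 2 < bytes.size(); ++i) {
        if (bytes[i] == 0x1f && bytes[i + 1] == 0x8b && bytes[i + 2] == 0x08) {
            offsets.push_back(i);
        }
    }
    return offsets;
}
```

Running this over both the pigz output and the original compressed.json.gz and comparing the number of hits would give a quick, if imperfect, indication of whether the two files are structured differently.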