nodeca / pako

high speed zlib port to javascript, works in browser & node.js
http://nodeca.github.io/pako/
MIT License

Binary gzip string not in correct gzip format #54

Closed patrickliechty closed 9 years ago

patrickliechty commented 9 years ago

var output = pako.gzip("{test:'hello world'}", { to: 'string' });

fs.writeFile("./gzip-test.txt", output, function(err) {
  if(err) {
    return console.log(err);
  }

  console.log("The file was saved!");
});

Produces this output in a file:

1F C2 8B 08 00 00 00 00 00 00 03 C2 AB 2E 49 2D 2E C2 B1 52 C3 8F 48 C3 8D C3 89 C3 89 57 28 C3 8F 2F C3 8A 49 51 C2 AF 05 00 37 C2 90 12 C3 89 14 00 00 00

According to this:

http://en.wikipedia.org/wiki/Gzip

The output should start with 1F 8B. Instead there is a C2 inserted before the 8B; otherwise it looks good. I am trying to gzip a big block of JSON in the browser and send it to the backend. The Java backend is throwing an error saying that the data is not in gzip format.

puzrin commented 9 years ago
var output = require('pako').gzip("{test:'hello world'}", { to: 'string' });
console.log(output.charCodeAt(0).toString(16), output.charCodeAt(1).toString(16)); 

1f 8b

The bug is in your file-write encoding.

patrickliechty commented 9 years ago

I had the same issue when trying to send the data to the server from the browser. I fixed it by doing this:

    headers["Content-Encoding"] = "gzip";
    var gzippedJson = gzip.gzip("{test:'hello world'}");
    // If you use charset=x-user-defined-binary it just sends the data through as is.
    // If you don't, the browser uses utf-8, which adds extra bytes, so the payload is no longer valid gzip.
    var gzippedBlob = new Blob([gzippedJson], {type: "application/json; charset=x-user-defined-binary"});
    xhr.send(gzippedBlob);
puzrin commented 9 years ago

OK, but that's not related to this package in any way; pako's job is to produce arrays or binary strings.

patrickliechty commented 9 years ago

Yes, but it will help those who are trying to use this library. I guess this belongs in a forum and not in the issues.

gandhirajan commented 8 years ago

@patrickliechty Hi, I'm trying the exact same thing and I'm stuck. I'm using jQuery AJAX to post the gzipped string to the server but am not able to ungzip it on the server. Could you tell me which parameters need to be set in the AJAX call to make it work?

puzrin commented 8 years ago

@gandhirajan JS strings are UTF-16 (2 bytes per char). You should convert the data to a binary format first (1 byte per element, UTF-8 for example).

gandhirajan commented 8 years ago

@puzrin Hi Puzrin, thanks for your response. I tried that but it didn't work. Do I need to do some encoding while sending the gzipped string from the jQuery AJAX call to the server?

gandhirajan commented 8 years ago

@puzrin We managed to resolve this issue. The issue was with encoding while sending the gzipped string from JavaScript to the server. window.btoa() in JavaScript and base64 decoding on the server side resolved the issue.

puzrin commented 8 years ago

That's not the most efficient way, but it works.

I'd suggest using utf8 instead of base64. Something like this http://ecmanaut.blogspot.ru/2006/07/encoding-decoding-utf8-in-javascript.html (but note that built-in methods can throw exceptions on invalid char sequences).

However, I have not compared the output sizes of the different methods.

gandhirajan commented 8 years ago

Hi Puzrin, thanks again for your suggestion. But if I try the following, as mentioned in the link above: unescape(encodeURIComponent(string)), I am unable to decode it on the server side, as URLDecoder.decode fails on the non-ASCII characters.

puzrin commented 8 years ago

@gandhirajan I've added a working example https://github.com/nodeca/pako/tree/master/examples with both browser and server code. It's not production code, but it demonstrates the re-encoding steps in the data flow. See the comments.

gandhirajan commented 8 years ago

@puzrin, Thanks a lot for your response. But I was looking for a Java sample on the server side, which I'm still figuring out (sending a gzipped response from the server as an AJAX response).

blacelle commented 6 years ago

Hello, allow me to resurrect this thread. Here is a Java snippet to handle a pako binary string:

// Convert from pako String format to raw byte[]
byte[] asByteArray = new byte[html.length()];
for (int i = 0; i < html.length(); i++) {
    asByteArray[i] = (byte) html.charAt(i);
}
// Unzip the byte[] and decode it as UTF-8 (CharStreams is from Google Guava)
try (GZIPInputStream gzipInputStream = new GZIPInputStream(new ByteArrayInputStream(asByteArray));
        InputStreamReader osw = new InputStreamReader(gzipInputStream, StandardCharsets.UTF_8)) {
    html = CharStreams.toString(osw);
}

Pako provided the string with:

data.html = pako.gzip(data.html, { to: 'string' });

dboldureanu commented 6 years ago

For me, patrickliechty's example was useful. In Angular, I used this:

    const gzip = pako.gzip(jsonString);
    const blob = new Blob([gzip]);

    const headers = new Headers({
        'Content-Type': 'application/json; charset=x-user-defined-binary',
        'Content-Encoding': 'gzip'
    });

    const reqOptions = new RequestOptions({ headers: headers });

    return this.http.put('URL', blob, reqOptions)
        .map(this.extractJSON)
        .catch((err) => this.httpErrorHandler.handleError(err));

vras8213 commented 1 year ago

I am trying to decompress Uint8Array data and getting an error: const uncompressedData = inflate(uint8ArrayDe, { raw: true }). The error is "invalid block type". How can I fix this?