luben / zstd-jni

JNI binding for Zstd

ZSTD : Unknown Frame Descriptor error #267

Closed tejaspatel1990 closed 9 months ago

tejaspatel1990 commented 1 year ago

I am facing a very strange issue. Following are the details:

My data is compressed using ZSTD compression and the compressed bytes are sent to Kafka over the network. When I read these bytes and try to decompress them, I get an error: com.github.luben.zstd.ZstdException: Unknown frame descriptor

To narrow down the issue, I decompressed the data on the same source machine where the compression happens, and I don't see any error. However, when I try to decompress the same data on a different machine, it gives me the error.

Interestingly, I don't get this error for all of my data; it happens only for some of it. The source machine on which compression happens is different from the consumer machine on which decompression happens.

A pattern I observed: for all the data where I get this error, getting the decompressed size like this - Zstd.decompressedSize(compressedData) - always gives me -2. However, for the data where decompression is successful, I get the correct length of the original data.
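[Editor's note, not from the thread: a negative result here suggests the library could not parse a valid frame header at all. One library-independent sanity check is the zstd magic number: per the zstd format specification (RFC 8878), every zstd frame begins with 0xFD2FB528 stored little-endian, so the first four bytes on the wire are 28 B5 2F FD. A minimal sketch in plain Java:]

```java
import java.util.Arrays;

public class FrameCheck {
    // zstd frames start with the magic number 0xFD2FB528, stored
    // little-endian, so the first four bytes on the wire are 28 B5 2F FD.
    private static final byte[] ZSTD_MAGIC = {0x28, (byte) 0xB5, 0x2F, (byte) 0xFD};

    static boolean looksLikeZstdFrame(byte[] data) {
        return data != null
                && data.length >= 4
                && Arrays.equals(Arrays.copyOfRange(data, 0, 4), ZSTD_MAGIC);
    }

    public static void main(String[] args) {
        byte[] valid = {0x28, (byte) 0xB5, 0x2F, (byte) 0xFD, 0x00};
        byte[] garbage = {0x00, 0x01, 0x02, 0x03, 0x04};
        System.out.println(looksLikeZstdFrame(valid));   // true
        System.out.println(looksLikeZstdFrame(garbage)); // false
    }
}
```

[If the failing payloads don't start with these four bytes, the bytes handed to the decompressor were never a zstd frame, which points at corruption or a mix-up upstream rather than at the decompression library.]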

Can anyone please help ASAP?

tejaspatel1990 commented 1 year ago

@luben : Can you please help

luben commented 1 year ago

Is there some data lost/truncated when sending it through Kafka to another machine?

tejaspatel1990 commented 1 year ago

@luben : Nope. I confirmed that by checking the base64 string on both source and consumer.

luben commented 1 year ago

Hmm, that's strange. What happens when you decode the base64 and pass it to the zstd CLI, e.g.

$ cat BASE64.txt | base64 -d | zstd -d -
luben commented 1 year ago

Also, what version of Kafka are you using? I think there were some recent changes around how it works with zstd compression. Maybe there is a bug in the new code.

tejaspatel1990 commented 1 year ago

@luben I don't think the problem is with Kafka. The flow is: Producer (compression happens) > WebSocket server > Kafka > Consumer (decompression). To narrow down the issue, I also tried decompressing on the WebSocket server and it gives me the error, while the same data decompresses fine on the producer.

tejaspatel1990 commented 1 year ago

Hmm, that's strange. What happens when you decode the base64 and pass it to the zstd CLI, e.g.

$ cat BASE64.txt | base64 -d | zstd -d -

@luben : This gives me an error in the CLI: zstd: /stdin\: unexpected end of file. Can you please tell me the correct command?

tejaspatel1990 commented 1 year ago


@luben : Made this work. When incorrect data is passed to the CLI, it says the following: zstd: /stdin\: unsupported format

While with correct data, CLI decompression is successful.

That means the data is corrupted.

tejaspatel1990 commented 1 year ago

Command I tried : cat test.txt | base64 -d | zstd -d | cat
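[Editor's note, a side observation not from the thread: when producing the base64 file from Java for this kind of CLI round-trip, java.util.Base64.getEncoder() emits the standard RFC 4648 alphabet with no line wrapping, which is what `base64 -d` expects; the URL-safe or MIME variants can make the CLI decode fail. A small round-trip sketch:]

```java
import java.util.Arrays;
import java.util.Base64;

public class Base64RoundTrip {
    public static void main(String[] args) {
        // Sample bytes (the zstd magic number plus a couple of extras).
        byte[] original = {0x28, (byte) 0xB5, 0x2F, (byte) 0xFD, 0x20, 0x00};

        // Standard (RFC 4648) alphabet, no line wrapping: compatible
        // with `base64 -d` on the command line.
        String encoded = Base64.getEncoder().encodeToString(original);
        byte[] decoded = Base64.getDecoder().decode(encoded);

        System.out.println(encoded);
        System.out.println(Arrays.equals(original, decoded)); // true
    }
}
```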

luben commented 1 year ago

Yes, so why is the data corrupted? BTW, I didn't need the final cat; maybe there is some small difference:

$ cat test.zst.base64 | base64 -d - | zstd -d -
1234
tejaspatel1990 commented 1 year ago

@luben : I tried it, and this is what I get for the corrupted data:

Warning : compression level higher than max, reduced to 19 
zstd: /*stdin*\: unsupported format 

For the right data, I am able to decompress it and see the original data.

Could the machine or OS be causing this problem? We are using Kubernetes, and my observation is that when this error occurs, it usually affects all the requests coming from a specific pod. However, all the pods are created from the same base image.

luben commented 1 year ago

I am not sure why; it looks like some corruption in transit.

tejaspatel1990 commented 1 year ago

Here is the thing: after compression I do encryption and print a base64 string of the encrypted array. I am able to match that same base64 string on my consumer.

In my consumer, before I do decompression, I do decryption. There are no errors in decryption. If the data were getting corrupted during transport, decryption wouldn't be successful and the base64 strings wouldn't match.

I now suspect it is something related to the GLIBC library on the machine where the compression happens, because on that same machine I am able to decrypt and decompress.

luben commented 1 year ago

Can you check the base64 of the payload before encryption by the sender and after decryption by the receiver?
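[Editor's note: one way to make that comparison cheap at every hop (a sketch, not from the thread) is to log a digest of the compressed payload right after compression on the producer and right after decryption on the consumer. Matching digests mean the decompressor receives the exact bytes the compressor produced, so the fault is in compression itself; differing digests point at transport or the crypto layer.]

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class PayloadDigest {
    // Hex-encoded SHA-256 of a byte array, suitable for log comparison.
    static String sha256Hex(byte[] data) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256").digest(data);
            StringBuilder sb = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                sb.append(String.format("%02x", b)); // bytes render unsigned
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required of every Java platform implementation.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] payload = "compressed-bytes-here".getBytes();
        // Log this on the producer after compression, and again on the
        // consumer after decryption; the two values must be identical.
        System.out.println(sha256Hex(payload));
    }
}
```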