Closed tejaspatel1990 closed 9 months ago
@luben : Can you please help?
Is there some data lost/truncated when sending it through Kafka to another machine?
@luben : Nope. I confirmed that by checking the base64 string on both the source and the consumer.
Hmm, that's strange. What happens when you decode the base64 and pass it to the zstd CLI, e.g.
$ cat BASE64.txt | base64 -d | zstd -d -
Also, what version of Kafka are you using? I think there were some recent changes around how it works with zstd compression; maybe there is a bug in the new code.
@luben I don't think we have a problem with Kafka. This is the flow: Producer (compression happens) > Websocket server > Kafka > Consumer (decompression). To narrow down the issue, I did the decompression on the Websocket server as well, and it gives me an error, while the same data decompressed successfully on the producer.
@luben : This gives me an error in the CLI: zstd: /*stdin*\: unexpected end of file. Can you please tell me the correct command?
@luben : Made this work. When incorrect data is passed to the CLI, it says the following: zstd: /*stdin*\: unsupported format
With the correct data, CLI decompression is successful.
That means the data is corrupted.
Command I tried: cat test.txt | base64 -d | zstd -d | cat
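Since the CLI's "unsupported format" message means zstd did not recognize the start of the payload, the same check can be reproduced in Java without shelling out: every zstd frame begins with the little-endian magic number 0xFD2FB528, i.e. the bytes 28 B5 2F FD on the wire. A payload whose first four bytes don't match is what also makes zstd-jni throw "Unknown frame descriptor" and makes Zstd.decompressedSize return -2 (zstd's ZSTD_CONTENTSIZE_ERROR). A minimal stdlib-only sketch, with illustrative class and method names:

```java
import java.util.Arrays;

public class ZstdMagicCheck {
    // A zstd frame starts with the little-endian magic number 0xFD2FB528,
    // which appears on the wire as the byte sequence 28 B5 2F FD.
    private static final byte[] MAGIC = {(byte) 0x28, (byte) 0xB5, (byte) 0x2F, (byte) 0xFD};

    // Returns true if the buffer plausibly starts a zstd frame.
    static boolean hasZstdMagic(byte[] data) {
        return data.length >= MAGIC.length
                && Arrays.equals(Arrays.copyOfRange(data, 0, MAGIC.length), MAGIC);
    }

    public static void main(String[] args) {
        byte[] corrupted = {0x00, 0x01, 0x02, 0x03, 0x04};
        byte[] valid = {(byte) 0x28, (byte) 0xB5, (byte) 0x2F, (byte) 0xFD, 0x00};
        System.out.println(hasZstdMagic(corrupted)); // false
        System.out.println(hasZstdMagic(valid));     // true
    }
}
```

Logging this check on each hop (producer, websocket server, consumer) would show exactly where the leading bytes stop matching.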
Yes, so why is the data corrupted?
BTW, I didn't need the final cat; maybe there is some small difference:
$ cat test.zst.base64 | base64 -d - | zstd -d -
1234
@luben : I tried it and I get this for the corrupted data:
Warning : compression level higher than max, reduced to 19
zstd: /*stdin*\: unsupported format
For the right data, I am able to decompress it and see the original data.
Could the machine or OS be creating this problem? We are using Kubernetes, and my observation is that when this error occurs, it usually affects all the requests coming from a specific pod. However, all the pods are created from the same base image.
I am not sure why; it looks like some corruption in transit.
Here is the thing: after compression I do encryption and print a base64 string of the encrypted array. I am able to match that same base64 string on my consumer.
In my consumer, before decompression, I do decryption. There are no errors in decryption. If the data were getting corrupted in transport, decryption would not succeed and the base64 strings would not match.
I now suspect it is something related to the GLIBC library on the machine where the compression happens, because on that same machine I am able to decrypt and decompress.
Can you check the base64 of the payload before encryption by the sender and after decryption by the receiver?
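One way to act on this suggestion without logging whole payloads is to log a digest of the compressed bytes at both ends: right after compression (before encryption) on the producer, and right after decryption (before decompression) on the consumer. If the two digests differ, the bytes changed somewhere between those two points. A small stdlib-only sketch, with hypothetical class and method names:

```java
import java.security.MessageDigest;
import java.util.Base64;

public class PayloadFingerprint {
    // Hypothetical helper: call this on the compressed bytes just before
    // encryption (producer side) and just after decryption (consumer side),
    // then compare the two logged strings. Equal digests mean the compressed
    // payload survived the trip byte-for-byte.
    static String fingerprint(byte[] compressed) throws Exception {
        byte[] hash = MessageDigest.getInstance("SHA-256").digest(compressed);
        return Base64.getEncoder().encodeToString(hash);
    }

    public static void main(String[] args) throws Exception {
        byte[] payload = "example compressed bytes".getBytes("UTF-8");
        System.out.println(fingerprint(payload));
    }
}
```

A short SHA-256 digest is cheap to log per message, so it can stay enabled while waiting for the intermittent failure to reappear.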
I have a very strange issue which I am facing. Following are its details:
My data is compressed using ZSTD compression, and these compressed bytes are sent over the network to Kafka. When I read these bytes and try to decompress them, I get an error: com.github.luben.zstd.ZstdException: Unknown frame descriptor
To narrow down the issue, I decompressed the data on the same source machine where the compression happens, and I don't see any error. However, when I try to decompress the same data on a different machine, it gives me an error.
Interestingly, I don't get this error for all of my data; it happens only for some of it. Here the source machine on which the compression happens is different from the consumer machine on which the decompression happens.
A pattern observed: for all the data where I get this error, getting the decompressed size like this - Zstd.decompressedSize(compressedData) - always gives me -2. However, for the data where decompression is successful, I get the correct length of the original data.
Can anyone please help asap on this?