lz4 / lz4-java

LZ4 compression for Java
Apache License 2.0

Source ByteBuffer gets tampered on Decompressing #68

Closed vaibhavhajela closed 6 years ago

vaibhavhajela commented 9 years ago

Hi Team, I am using the LZ4 1.3.0 library to compress and decompress ByteBuffers. My source buffer gets altered during decompression.

//Compression
int compressLen = compressor.compress(message, 0, decompressedLength, message, 0, maxCompressedLength);

//Decompression
int decompressed = deCompressor.decompress(msg, 0, compressedLen, bufferMsg, 0, bufferMsg.capacity());

My source buffer actually contains many compressed messages separated by identifiers, and I decompress them one by one in a loop. The first decompression works fine, but from the second one onwards I get the following error for every subsequent message: "Error decoding offset 422 of input buffer".

However, if I use a temporary buffer as the source buffer, decompression works fine for all the compressed messages.

//Decompression using tempBuffer
System.arraycopy(msg.array(), msg.arrayOffset(), tempMsg.array(), tempMsg.arrayOffset(), compressedLen);
int decompressed = deCompressor.decompress(tempMsg, 0, compressedLen, bufferMsg, 0, bufferMsg.capacity());

It seems to me that the source buffer is somehow modified during decompression.

Can anybody please help? How do I achieve this decompression without using a temporary buffer?

jpountz commented 9 years ago

Could you let me know:

In addition, if you have a reproducible test failure, this would be great to understand what is happening here. :-)

vaibhavhajela commented 9 years ago

I really appreciate your quick response and willingness to help. I'm using LZ4JNICompressor to compress and LZ4JNISafeDecompressor to decompress. I have also tried LZ4JNIFastDecompressor to decompress, with the same result.

Caused by: net.jpountz.lz4.LZ4Exception: Error decoding offset 39 of input buffer
at net.jpountz.lz4.LZ4JNIFastDecompressor.decompress(LZ4JNIFastDecompressor.java:66) ~[lz4-1.3.0.jar:na]

I have also tried copying into a temporary byte array and doing a byte-array-to-byte-array decompression, and that works fine too. And yes, we can reproduce the situation; I will assist you with any information you need.

jpountz commented 9 years ago

To confirm the issue is with the JNI impl, could you make sure the problem does not reproduce if you use one of the Java impls?

vaibhavhajela commented 9 years ago

Java impls? I didn't get you. If I do the decompression using an intermediate byte array, it works fine. If I do the decompression using a temporary ByteBuffer as the source ByteBuffer, it works fine. It appears that the source ByteBuffer gets altered when I decompress from it directly.

Please feel free to ask if you want me to carry out any other tests.

jpountz commented 9 years ago

By Java impls, I meant using eg LZ4Factory.safeInstance instead of LZ4Factory.fastestInstance.

vaibhavhajela commented 9 years ago

I will test it with LZ4Factory.safeInstance and will let you know the result tomorrow.

vaibhavhajela commented 9 years ago

Hi Adrien, I tried using LZ4Factory.safeInstance for both the compressor and the decompressor, and I got the following error while decompressing. Surprisingly, with the safe instance I got the error on the very first message, unlike with fastestInstance, where the error started from the second message onwards. But when I used a temporary buffer (a copy of the source) as the source buffer, decompression with the safe instance worked fine.

Caused by: net.jpountz.lz4.LZ4Exception: Malformed input at 15
at net.jpountz.lz4.LZ4JavaSafeSafeDecompressor.decompress(LZ4JavaSafeSafeDecompressor.java:81) ~[lz4-1.3.0.jar:na]
at net.jpountz.lz4.LZ4JavaSafeSafeDecompressor.decompress(LZ4JavaSafeSafeDecompressor.java:116) ~[lz4-1.3.0.jar:na]
at com.sungard.Core.Protocol.NGOProtocolWithCompression.deCompress(NGOProtocolWithCompression.java:154) ~[bin/:na]
at com.sungard.Core.Protocol.NGOProtocolWithCompression.decomposeBuffer(NGOProtocolWithCompression.java:101) ~[bin/:na]

jpountz commented 9 years ago

If the safe impl fails too, then I'm wondering whether your code might be misusing the API, because it should really never modify the source content. However, if you use the compress/decompress methods that don't take offsets, the positions of the byte buffers will be updated; maybe this is the cause of your issue? Can you try to build a reproducible unit test that demonstrates the problem?
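
To illustrate the point about buffer positions, here is a minimal sketch that uses only java.nio.ByteBuffer and does not call lz4-java at all. The class and method names (BufferPositionSketch, readRelative, readAbsolute, readViaDuplicate) are made up for this example. It shows that a relative bulk get advances the caller's position, while absolute gets, or reads through a duplicate(), leave it untouched. Whether this is what happened in the reported case is not established in this thread.

```java
import java.nio.ByteBuffer;

// Hypothetical helper class, written only to illustrate ByteBuffer positions.
final class BufferPositionSketch {

    /** Relative bulk get: copies len bytes and ADVANCES buf's position by len. */
    static byte[] readRelative(ByteBuffer buf, int len) {
        byte[] out = new byte[len];
        buf.get(out);
        return out;
    }

    /** Absolute gets: reads at explicit indices and leaves buf's position alone. */
    static byte[] readAbsolute(ByteBuffer buf, int offset, int len) {
        byte[] out = new byte[len];
        for (int i = 0; i < len; i++) {
            out[i] = buf.get(offset + i);
        }
        return out;
    }

    /** Reads through a duplicate: same content, but an independent position and limit. */
    static byte[] readViaDuplicate(ByteBuffer buf, int offset, int len) {
        ByteBuffer view = buf.duplicate();
        view.position(offset);
        view.limit(offset + len);
        byte[] out = new byte[len];
        view.get(out); // advances only the duplicate's position
        return out;
    }
}
```

A loop that reads several messages from one shared buffer can use the absolute or duplicate-based variants so that each iteration starts from a known offset instead of wherever the previous read left the position.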

vaibhavhajela commented 9 years ago

Hi Adrien,

I will try to give you a simple scenario that reproduces the issue for me.

Let's say:

  1. I have a ByteBuffer with limit = y.
  2. I compress this ByteBuffer from offset x to offset y-1 into a temporary ByteBuffer.
  3. Then we copy the temporary ByteBuffer's content back into the original ByteBuffer.
  4. We send the original ByteBuffer over the network.
  5. Upon receiving the compressed ByteBuffer, we try to decompress it from offset x to offset (y-1), and we get the error "error decoding offset ".
  6. However, if we copy the received ByteBuffer into a temporary ByteBuffer and then decompress the temporary ByteBuffer from offset x to offset (y-1), decompression succeeds.
  7. Alternatively, if we convert the received ByteBuffer into a byte array and decompress the byte array, it also works.

Using a temporary ByteBuffer is not a sustainable solution for me. Please see if you can help me in any way.

vaibhavhajela commented 9 years ago

Hi Adrien,

Strangely, decompression works fine for me when I use the byte array API of the decompressor.

int returnVal = deCompressor.decompress(src.array(), src.arrayOffset() + srcOffset, srcLength, target.array(), target.arrayOffset() + targetOffset, targetLength);
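
The call above adds arrayOffset() to the buffer-relative offset before indexing the backing array. The following is a small sketch of why that term matters, using only the JDK; BackingArraySketch and backingIndex are hypothetical names for this example. A buffer created with slice() shares the backing array of its parent but starts partway into it, so its index 0 is not array index 0.

```java
import java.nio.ByteBuffer;

// Hypothetical helper, written only to illustrate array() / arrayOffset().
final class BackingArraySketch {

    /** Maps an index relative to the buffer's start to an index into buf.array(). */
    static int backingIndex(ByteBuffer buf, int bufferIndex) {
        if (!buf.hasArray()) {
            // Buffers without an accessible backing array cannot be used this way.
            throw new IllegalArgumentException("buffer has no accessible backing array");
        }
        return buf.arrayOffset() + bufferIndex;
    }

    public static void main(String[] args) {
        byte[] raw = {10, 11, 12, 13, 14, 15};
        ByteBuffer whole = ByteBuffer.wrap(raw);
        whole.position(2);
        ByteBuffer slice = whole.slice(); // index 0 of the slice is raw[2]

        System.out.println(slice.arrayOffset());
        System.out.println(raw[backingIndex(slice, 0)] == slice.get(0));
    }
}
```

This approach only applies when hasArray() returns true; otherwise the ByteBuffer-based methods are the only option.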

I have one more query.

I'm getting quite good decompression speed, around 1200 MB/s. However, compression is much slower, around 100 MB/s. Can you please suggest some ways to speed up compression? I'm using the following code to compress:

static LZ4Factory factory = LZ4Factory.fastestInstance();
public static LZ4Compressor compressor = factory.fastCompressor();
public static LZ4SafeDecompressor deCompressor = factory.safeDecompressor();
int returnVal = compressor.compress(src, srcOffset, srcLength, Target, targetOffset, targetLength);

fflatorre commented 8 years ago

Hi Guys,

Any update on this? I'm experiencing exactly the same error :-/

Cheers, Francesco

fflatorre commented 8 years ago

The following test shows what actually fails:


public class DeCompressionTest {

private static LZ4Factory lz4Factory;
private static LZ4Compressor lz4Compressor;
private static LZ4FastDecompressor lz4Decompressor;

static {
    lz4Factory = LZ4Factory.fastestInstance();
    lz4Compressor = lz4Factory.fastCompressor();
    lz4Decompressor = lz4Factory.fastDecompressor();
}

@Test
public void testFastDeCompressionSpecialChars () {

    String toCompress = "2ο Χέρι. Τοποθετήστε το Στοίχημα Εδώ.";

    byte[] compressedBuffer = lz4Compressor.compress(toCompress.getBytes());
    byte[] decompressedBuffer = new byte[toCompress.length()];
    lz4Decompressor.decompress(compressedBuffer, decompressedBuffer);

    Assert.assertArrayEquals(toCompress.getBytes(), decompressedBuffer);
}

@Test
public void testDeCompressionStandardChars () {

    String toCompress = "This is a standard translation string";

    byte[] compressedBuffer = lz4Compressor.compress(toCompress.getBytes());
    byte[] decompressedBuffer = new byte[toCompress.length()];
    lz4Decompressor.decompress(compressedBuffer, decompressedBuffer);

    Assert.assertArrayEquals(toCompress.getBytes(), decompressedBuffer);
}

@Test
public void testSafeDeCompressionSpecialChars () {

    LZ4SafeDecompressor lz4Decompressor;
    lz4Factory = LZ4Factory.safeInstance();
    lz4Compressor = lz4Factory.fastCompressor();
    lz4Decompressor = lz4Factory.safeDecompressor();

    String toCompress = "2ο Χέρι. Τοποθετήστε το Στοίχημα Εδώ.";

    byte[] compressedBuffer = lz4Compressor.compress(toCompress.getBytes());
    byte[] decompressedBuffer = new byte[toCompress.length()];
    lz4Decompressor.decompress(compressedBuffer, decompressedBuffer);

    Assert.assertArrayEquals(toCompress.getBytes(), decompressedBuffer);
}

}

fflatorre commented 8 years ago

Hi,

I've just realised the tests were buggy because of the following line:

byte[] decompressedBuffer = new byte[toCompress.length()];

Replacing it with:

byte[] decompressedBuffer = new byte[toCompress.getBytes().length];

fixed the issue :)
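
The reason this fix works is that String.length() counts UTF-16 chars, while the decompressor needs a destination sized to the number of encoded bytes. The sketch below shows the difference with the JDK only. It assumes UTF-8 explicitly, whereas the tests above call getBytes() with the platform default charset; Utf8LengthSketch and its methods are hypothetical names. The sample text is the Greek word "Εδώ" from the end of the test string, written with Unicode escapes so that the snippet does not depend on the source file encoding.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, written only to contrast char count with encoded byte count.
final class Utf8LengthSketch {

    /** Number of UTF-16 chars, which is what String.length() returns. */
    static int charCount(String s) {
        return s.length();
    }

    /** Number of bytes once the string is encoded as UTF-8 (an assumed charset). */
    static int utf8ByteCount(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        String ascii = "This is a standard translation string";
        String greek = "\u0395\u03b4\u03ce"; // three Greek letters, two UTF-8 bytes each

        // For ASCII text both counts agree, so the buggy sizing happens to work.
        System.out.println(charCount(ascii) == utf8ByteCount(ascii));
        // For Greek text the byte count is larger, so a buffer sized by length() is too small.
        System.out.println(charCount(greek) + " chars, " + utf8ByteCount(greek) + " bytes");
    }
}
```

This also explains why only the "special chars" tests failed while the ASCII test passed.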

vaibhavhajela commented 8 years ago

Hey, thanks. I will check this solution to see if it works.

vaibhavhajela commented 8 years ago

Hi Adrien,

I was studying the LZ4 algorithm and have a query: how do we identify that one LZ4 sequence has ended and the next LZ4 sequence has started? Can you please answer this? It would be really helpful to me.


lyrachord commented 8 years ago

Hi @vaibhavhajela, check the generated file build\java\net\jpountz\lz4\LZ4JavaSafeFastDecompressor.java (after building, of course). It makes the LZ4 sequence layout you are asking about very clear. A simple description follows:

lz4compressed = [token-data][token-data]...
token-data = (RunLength, MatchLength, Offset) = [half-half][run length additional bytes...][data][short little-endian offset][match length additional bytes...]

If RunLength or MatchLength is less than 15 (the maximum of a half byte), no additional bytes follow. The additional bytes are read with a while (x -= 255) style loop, that is, the length is 255 * times + remainder. Indeed, this part could be replaced with another scheme, such as varint.

Once the decompressed sequence format is clear, the compressor side is clear too.

The original document is here: http://fastcompression.blogspot.in/2011/05/lz4-explained.html
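
The layout described above can be sketched as a tiny decoder for a single raw LZ4 block, written with the JDK only. This is an illustration under stated assumptions, not the lz4-java implementation: Lz4BlockSketch and decode are hypothetical names, the decompressed size is assumed to be known in advance, and the input checks are minimal. It also bears on the earlier question about where a sequence ends: each sequence is delimited purely by the lengths in its token, and the block carries no end marker, so the decoder simply stops when the input is consumed. The compressed length therefore has to be recorded outside the block.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch of a raw LZ4 block decoder; not hardened against bad input.
final class Lz4BlockSketch {

    /** Decodes one raw LZ4 block whose decompressed size (destLen) is already known. */
    static byte[] decode(byte[] src, int destLen) {
        byte[] dest = new byte[destLen];
        int s = 0; // read position in src
        int d = 0; // write position in dest
        while (s < src.length) {
            int token = src[s++] & 0xFF;

            // High half of the token: literal (run) length; 15 means more bytes follow.
            int literalLen = token >>> 4;
            if (literalLen == 15) {
                int b;
                do {
                    b = src[s++] & 0xFF;
                    literalLen += b;
                } while (b == 255);
            }
            System.arraycopy(src, s, dest, d, literalLen);
            s += literalLen;
            d += literalLen;

            // The last sequence holds literals only, so the input ends here.
            if (s >= src.length) {
                break;
            }

            // Two-byte little-endian offset, counted back from the write position.
            int offset = (src[s] & 0xFF) | ((src[s + 1] & 0xFF) << 8);
            s += 2;
            if (offset == 0 || offset > d) {
                throw new IllegalArgumentException("invalid match offset " + offset);
            }

            // Low half of the token: match length minus 4; 15 means more bytes follow.
            int matchLen = token & 0x0F;
            if (matchLen == 15) {
                int b;
                do {
                    b = src[s++] & 0xFF;
                    matchLen += b;
                } while (b == 255);
            }
            matchLen += 4;

            // Copy byte by byte because the match may overlap the bytes being written.
            for (int i = 0; i < matchLen; i++) {
                dest[d] = dest[d - offset];
                d++;
            }
        }
        if (d != destLen) {
            throw new IllegalArgumentException("decoded " + d + " bytes, expected " + destLen);
        }
        return dest;
    }

    public static void main(String[] args) {
        // Hand-built block: 3 literals "abc", a 6-byte match at offset 3, then 5 literals "hello".
        byte[] block = {0x32, 'a', 'b', 'c', 3, 0, 0x50, 'h', 'e', 'l', 'l', 'o'};
        System.out.println(new String(decode(block, 14), StandardCharsets.US_ASCII));
    }
}
```

In the hand-built block, the first token 0x32 encodes a literal length of 3 and a stored match length of 2 (6 after adding 4), and the second token 0x50 encodes 5 trailing literals with no match.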

odaira commented 7 years ago

Was the original problem of modified source ByteBuffer solved?

vaibhavhajela commented 7 years ago

Actually, I used a workaround at that time, and I didn't have time to test it with the fixed version. I sincerely appreciate your efforts to help the users.


odaira commented 6 years ago

Let me close this issue now. If you encounter the same problem again, please reopen this with a reproducible test case.