lz4 / lz4-java

LZ4 compression for Java
Apache License 2.0

Stream corrupted on Java 1.8.0 (HotSpot) #75

Closed by llogiq 7 years ago

llogiq commented 8 years ago

Tested with 1.3 and HEAD (as of November 19th 2015) on Linux with the following JDK:

$ java -version
java version "1.8.0_60"
Java(TM) SE Runtime Environment (build 1.8.0_60-b27)
Java HotSpot(TM) 64-Bit Server VM (build 25.60-b23, mixed mode)

The exception (not that there's anything new here):

Exception in thread "main" java.io.IOException: Stream is corrupted
    at net.jpountz.lz4.LZ4BlockInputStream.refill(LZ4BlockInputStream.java:153)
    at net.jpountz.lz4.LZ4BlockInputStream.read(LZ4BlockInputStream.java:117)
    at net.jpountz.lz4.LZ4BlockInputStream.read(LZ4BlockInputStream.java:130)
    at lz4.ExtractTest.main(ExtractTest.java:14)

The code that calls it (the example is so minimal that I'm posting it here verbatim):

package lz4;

import java.io.*;
import java.nio.file.*;
import net.jpountz.lz4.LZ4BlockInputStream;

public class ExtractTest {
    public static void main(final String[] args) throws IOException {
        try(final InputStream in = new LZ4BlockInputStream(Files.newInputStream(Paths.get(args[0])))) {
            final byte[] bytes = new byte[10000];
            final int len = in.read(bytes);
            System.out.printf("len=%d%n", len);
        }
    }
}

Note that calling lz4 -t on the file given in args[0] returns successfully. The file in question is 613632000 bytes long and was packed with *** LZ4 Compression CLI 64-bits r122, by Yann Collet (Sep 18 2014) ***

gyscos commented 8 years ago

I think it's because LZ4BlockInputStream can only decode what LZ4BlockOutputStream wrote: both use a home-made frame format, not the official one. See #21. As you can see on the lz4 homepage, there is currently no interoperable Java binding, and it doesn't look like this one will be updated soon.
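
To illustrate the mismatch, here is a minimal sketch that inspects the first bytes of a file and reports which container it appears to use. It relies on two facts: the official LZ4 frame format starts with the magic number 0x184D2204 (stored little-endian), while LZ4BlockOutputStream prefixes each block with the ASCII bytes "LZ4Block". The class name Lz4FormatSniffer and the returned labels are hypothetical and not part of lz4-java; the sketch uses only the JDK and does not decompress anything.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;

// Hypothetical helper for illustration only; not part of lz4-java.
final class Lz4FormatSniffer {
    // Official LZ4 frame magic number 0x184D2204, as stored on disk (little-endian).
    private static final byte[] FRAME_MAGIC = {0x04, 0x22, 0x4D, 0x18};
    // Magic bytes that LZ4BlockOutputStream writes at the start of each block.
    private static final byte[] BLOCK_MAGIC = "LZ4Block".getBytes(StandardCharsets.US_ASCII);

    // Classifies a stream by its leading bytes.
    static String detect(byte[] header) {
        if (startsWith(header, FRAME_MAGIC)) {
            return "LZ4 frame format";
        }
        if (startsWith(header, BLOCK_MAGIC)) {
            return "lz4-java block stream";
        }
        return "unknown";
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        return data.length >= prefix.length
                && Arrays.equals(Arrays.copyOf(data, prefix.length), prefix);
    }

    public static void main(String[] args) throws IOException {
        byte[] header = new byte[BLOCK_MAGIC.length];
        int filled = 0;
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            // A single read() may return fewer bytes than requested, so loop.
            while (filled < header.length) {
                int n = in.read(header, filled, header.length - filled);
                if (n < 0) {
                    break; // end of file before the header was full
                }
                filled += n;
            }
        }
        System.out.println(detect(Arrays.copyOf(header, filled)));
    }
}
```

If the file from the report was written in the official frame format, this check would report "LZ4 frame format", which LZ4BlockInputStream rejects because it expects its own "LZ4Block" header.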

gyscos commented 8 years ago

I'm working on adding this feature in Gyscos/lz4-java. The decompression part is currently in progress, using the new LZ4CompatibleInputStream.

I'll send a PR when I get the compression part ready. Open to suggestions for naming conventions, etc.

gyscos commented 8 years ago

Gyscos/lz4-java should be in a reasonable state. PR is here: #77.

EDIT: Just noticed PR #61 <_< You're probably better off using their higher-quality code instead.

odaira commented 7 years ago

I have merged the LZ4-frame-compatible implementation from #61.