luben / zstd-jni

JNI binding for Zstd

Different results when compressing direct vs stream #265

Closed rheia777 closed 1 year ago

rheia777 commented 1 year ago

Hello,

I was trying out the library and was surprised that compressing via a stream produces a different result from compressing directly. I've done the same thing with other compression libraries and both paths produced the same result.

Is there something I missed when using a stream with Zstd?

Example code:

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import com.github.luben.zstd.Zstd;
import com.github.luben.zstd.ZstdOutputStream;

public class Main
{
    public static void main(String[] args)
    {
        try
        {
            String test = "testttestestestesttttttesttestestesttestestest";
            // Use an explicit charset so the bytes do not depend on the platform default.
            byte[] input = test.getBytes(StandardCharsets.UTF_8);

            // Direct (one-shot) compression: the whole input is available up front.
            byte[] direct = Zstd.compress(input, -1);

            // Streaming compression: the input is fed in one byte at a time.
            ByteArrayOutputStream stream = new ByteArrayOutputStream();
            ZstdOutputStream zos = new ZstdOutputStream(stream, -1);
            InputStream is = new ByteArrayInputStream(input);
            int read = is.read();
            while(read > -1)
            {
                zos.write(read);
                read = is.read();
            }
            // Closing the stream finishes the frame before the bytes are read back.
            zos.close();

            System.out.println(Base64.getEncoder().encodeToString(direct));
            System.out.println(Base64.getEncoder().encodeToString(stream.toByteArray()));
        }
        catch(Exception e)
        {
            e.printStackTrace();
        }
    }
}

Output

KLUv/SAu9QAAoHRlc3R0dGVzZXN0ZXN0ZXN0ZXN0AwA4DcENbo4L
KLUv/QBI9QAAoHRlc3R0dGVzZXN0ZXN0ZXN0ZXN0AwA4DcENbo4L

As you can see the result is almost, but not quite, the same.

luben commented 1 year ago

Yes. With direct compression we know the size of the payload, so it is embedded in the frame header. With streaming we don't know the size of the payload in advance, so it is left out. If you look, the difference is in the first few bytes, where the frame header is.
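
To see this in the two outputs above, here is a minimal sketch that decodes both Base64 strings and reports where they differ. It uses only the JDK. The class name `FrameHeaderDiff` and its helper methods are made up for this illustration and are not part of zstd-jni. It assumes the layout in the Zstandard frame format: a 4-byte magic number, then a one-byte frame header descriptor whose bit 5 is the Single_Segment_flag.

```java
import java.util.ArrayList;
import java.util.Base64;
import java.util.List;

public class FrameHeaderDiff
{
    // The two Base64 strings from the output above, copied verbatim.
    public static final String DIRECT =
        "KLUv/SAu9QAAoHRlc3R0dGVzZXN0ZXN0ZXN0ZXN0AwA4DcENbo4L";
    public static final String STREAMED =
        "KLUv/QBI9QAAoHRlc3R0dGVzZXN0ZXN0ZXN0ZXN0AwA4DcENbo4L";

    // Returns the indices at which two equal-length byte arrays differ.
    public static List<Integer> differingIndices(byte[] a, byte[] b)
    {
        if (a.length != b.length)
        {
            throw new IllegalArgumentException("arrays must have the same length");
        }
        List<Integer> indices = new ArrayList<>();
        for (int i = 0; i < a.length; i++)
        {
            if (a[i] != b[i])
            {
                indices.add(i);
            }
        }
        return indices;
    }

    // Byte 4 follows the 4-byte magic number and is the frame header descriptor.
    // Bit 5 (0x20) is the Single_Segment_flag; when it is set, a frame content
    // size field is present in the header.
    public static boolean singleSegment(byte[] frame)
    {
        return (frame[4] & 0x20) != 0;
    }

    public static void main(String[] args)
    {
        byte[] direct = Base64.getDecoder().decode(DIRECT);
        byte[] streamed = Base64.getDecoder().decode(STREAMED);

        System.out.println("differing byte indices: " + differingIndices(direct, streamed));
        System.out.printf("direct:   descriptor=0x%02X next=0x%02X singleSegment=%b%n",
            direct[4] & 0xFF, direct[5] & 0xFF, singleSegment(direct));
        System.out.printf("streamed: descriptor=0x%02X next=0x%02X singleSegment=%b%n",
            streamed[4] & 0xFF, streamed[5] & 0xFF, singleSegment(streamed));
    }
}
```

Only bytes 4 and 5 differ. The direct frame has descriptor 0x20 followed by 0x2E, which is 46, the length of the input string. The streamed frame has descriptor 0x00 followed by 0x48, which under the same format would be a window descriptor, not a content size. Everything after the header is identical in both outputs.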