luben / zstd-jni

JNI binding for Zstd

Zstd is 3 times slower than snappy #274

Closed tomerr90 closed 1 year ago

tomerr90 commented 1 year ago

I ran the following JMH benchmark on Java 8 and 11:

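import java.io.IOException;
import java.util.concurrent.TimeUnit;

import com.github.luben.zstd.ZstdCompressCtx;
import org.openjdk.jmh.annotations.*;
import org.openjdk.jmh.runner.Runner;
import org.openjdk.jmh.runner.RunnerException;
import org.openjdk.jmh.runner.options.Options;
import org.openjdk.jmh.runner.options.OptionsBuilder;
import org.xerial.snappy.Snappy;

@State(Scope.Thread)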
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
@Fork(
    value = 1,
    jvmArgs = {
        "-XX:-TieredCompilation",
        "-Xms16g",
    })
public class ArrayStats {

    private ZstdCompressCtx zstdC = new ZstdCompressCtx();

    private static final byte[] s = ("dfdsfsfdsfkdlkfjvnc,mnvdshfskfjdlkfjdlfksald;lsakd;lasjfdsf;ldskf;ldsjg;ldsf;ldsm" +
        "kdjfslkdfjlkdsjfldskjfldsflkdsjg,mwewqpirpewoite[yo45p6o45/.45,n6.45m6ij45 o45 45lk;45lm 6;45l6l45h 45kn6;45l 6;l4j6" +
        " lk35l43j5; lk'45l7465,85'6;,';2l4p132[4i134i3ok5437o645i45ti[4pktlmreg'lre'rekt'ewkr';ewqkr[p324[32i5ewllewmf';ew" +
        "32k45lk43n543;l5m43';6'45l745;l765m7;lk65n7lk65h7k34n532n4j32v4j32h432h4;lk324j;32j432;432lj4'l32j4'23k4'32;k4';32" +
        "'dlfdfdslkjfoidsewrewpoir[p32i4[324[324m32m4;l32432oj543mn43,nr;ek[pckpsalc[]salc[psakf[o32j32r;lmd'wlm[pkwqrk" +
        "fffffffffffffffffffffffffffffffffffffffffff'l32k4'32k5;43lj6;45j6';5';f',dvds.v'dsmf'dskfas;lkd';kaa[pewqor]ewot][ewrglfd" +
        "FDLKGFLDM/.DV'SKF'EW;KR[P325P45][45P7][65765;L7';65K7';45K6'K56';54L645L.FD/,DSVSLDF'LDFWQKEKREW;RJ;EWFDSF" +
        "DSFDSFDS;LGKFD;LKG;FDLKG';LDF;LDFIREE3I43543054;L5J;L43J5;L43K5;L43K5;L43K5;L43K5;L43K5;43LK5;L43K5;L43K5;L43K5" +
        "435.,43/.5,/.43,5/.43,/.5,435/43,'5K';LK'DFK[DSFI[DSOF[OVCI-0EFIG-0WEIF-0EWREWKREWR'EWKR'EWKF'K';KWE';RK';EWKR';EW")
            .getBytes();

    @Benchmark
    public void zstd() {
        zstdC.compress(s);
    }

    @Benchmark
    public void snappy() throws IOException {
        Snappy.compress(s);
    }

    @Setup(Level.Trial)
    public void setup() {
        // Level 3 is zstd's default compression level.
        zstdC.setLevel(3);
    }

    public static void main(String[] args) throws RunnerException {
        Options options = new OptionsBuilder()
            .include(ArrayStats.class.getSimpleName())
            .build();

        new Runner(options).run();
    }
}

I ran it on two different machines; this is the output from one of them:

Benchmark          Mode  Cnt      Score      Error  Units
ArrayStats.snappy  avgt    5   3509.690 ±  109.142  ns/op
ArrayStats.zstd    avgt    5  11631.283 ± 1541.375  ns/op

I used the following dependencies:

        <dependency>
            <groupId>com.github.luben</groupId>
            <artifactId>zstd-jni</artifactId>
            <version>1.5.1-1</version>
        </dependency>

        <!-- https://mvnrepository.com/artifact/org.xerial.snappy/snappy-java -->
        <dependency>
            <groupId>org.xerial.snappy</groupId>
            <artifactId>snappy-java</artifactId>
            <version>1.1.9.0</version>
        </dependency>

On the other machine the trend was the same. According to the Zstd GitHub benchmarks, the two should perform about the same. Am I missing something?

luben commented 1 year ago

Yes, that's expected. Zstd is not designed to compete with Snappy and LZ4; it's designed to compete with gzip and the like. If you want faster compression with Zstd, you can use lower levels, e.g. -3, at the expense of compression ratio.

luben commented 11 months ago

If you are comparing to https://github.com/facebook/zstd#benchmarks, it does indicate that you need to use setLevel(-3) (in the CLI it's expressed as --fast=3).
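
For anyone who wants to see the level-versus-ratio trade-off without adding a dependency, here is a minimal sketch. It is not zstd-jni code: it uses the JDK's java.util.zip.Deflater (the gzip-class codec) as a stand-in, since that class exposes the same kind of level knob. The class name LevelTradeoff, the helper compressedSize, and the sample input are made up for illustration. With zstd-jni the corresponding change in the benchmark above would be zstdC.setLevel(-3) instead of zstdC.setLevel(3).

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.Deflater;

// Illustration only: Deflater stands in for zstd to show how a
// compression level trades output size against work done.
public class LevelTradeoff {

    // Compresses input at the given Deflater level (0..9) and
    // returns the number of compressed bytes produced.
    static int compressedSize(byte[] input, int level) {
        Deflater deflater = new Deflater(level);
        try {
            deflater.setInput(input);
            deflater.finish();
            byte[] buf = new byte[4096];
            int total = 0;
            // Drain the compressor until it reports that all output was emitted.
            while (!deflater.finished()) {
                total += deflater.deflate(buf);
            }
            return total;
        } finally {
            // Release the native zlib resources.
            deflater.end();
        }
    }

    public static void main(String[] args) {
        // Hypothetical sample input: short, repetitive key/value text.
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 200; i++) {
            sb.append("key").append(i % 17).append("=value").append(i % 5).append(';');
        }
        byte[] input = sb.toString().getBytes(StandardCharsets.UTF_8);

        System.out.println("input bytes:      " + input.length);
        System.out.println("BEST_SPEED:       " + compressedSize(input, Deflater.BEST_SPEED));
        System.out.println("BEST_COMPRESSION: " + compressedSize(input, Deflater.BEST_COMPRESSION));
    }
}
```

The sketch only reports sizes. To compare speed, the two levels would need to go through a JMH run like the one in the original post, with the same input for both.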