JetBrains-Research / npy

NPY and NPZ support for the JVM
MIT License

Compare npy read/write speed with hdf5 for single thread mode #1

Closed olegs closed 8 years ago

superbobry commented 8 years ago

Preliminary results: npy is at least 3 times slower than hdf5 on mid-size double arrays:

Benchmark          (size)   Mode  Cnt    Score     Error  Units
NpyBenchmark.hdf5   10000  thrpt   10  345.967 ± 168.963  ops/s
NpyBenchmark.npy    10000  thrpt   10   97.036 ±   4.794  ops/s

Will investigate further.

superbobry commented 8 years ago

Results of the same benchmark using nio2 are available in dab3599284783891834b4a535de3ba57ee6934fe.

superbobry commented 8 years ago

More benchmarks with different array sizes:

Benchmark  (size)   Mode  Cnt      Score     Error  Units
hdf5Read     1000  thrpt   50   1036.767 ±  10.819  ops/s
hdf5Read    10000  thrpt   50   1017.166 ±  17.108  ops/s
hdf5Read   100000  thrpt   50    991.907 ±  25.826  ops/s
hdf5Write    1000  thrpt   50    498.436 ±  16.182  ops/s
hdf5Write   10000  thrpt   50    486.166 ±   7.077  ops/s
hdf5Write  100000  thrpt   50    415.408 ±  27.642  ops/s
npyRead      1000  thrpt   50  10211.874 ± 171.702  ops/s
npyRead     10000  thrpt   50  10145.648 ± 167.145  ops/s
npyRead    100000  thrpt   50  10252.037 ± 183.463  ops/s
npyWrite     1000  thrpt   50   4457.897 ± 195.010  ops/s
npyWrite    10000  thrpt   50    464.274 ±  24.172  ops/s
npyWrite   100000  thrpt   50     77.115 ±   4.195  ops/s

Interestingly, the write performance of HDF5 (almost) does not depend on the size of the array.
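
To put a number on that observation, here is a minimal sketch that compares write throughput at the smallest and largest array sizes. The scores are copied from the table above; the `scores` dict and the `slowdown` helper are illustrative names, not part of the benchmark code.

```python
# Write throughput (ops/s) copied from the benchmark table above, keyed by array size.
scores = {
    "hdf5Write": {1000: 498.436, 10000: 486.166, 100000: 415.408},
    "npyWrite": {1000: 4457.897, 10000: 464.274, 100000: 77.115},
}


def slowdown(by_size):
    """Throughput at the smallest size divided by throughput at the largest size."""
    sizes = sorted(by_size)
    return by_size[sizes[0]] / by_size[sizes[-1]]


for name, by_size in scores.items():
    print(f"{name}: {slowdown(by_size):.1f}x lower throughput at the largest size")
```

With these numbers, HDF5 write throughput drops by about 1.2x when going from 1000 to 100000 elements, while npy write throughput drops by roughly 58x, which is close to the 100x growth in array size.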

olegs commented 8 years ago

Wow, on large arrays, npyWrite is more than 5 times slower than hdf5Write.

superbobry commented 8 years ago

Separate write benchmarks with and without compression:

Benchmark              (size)   Mode  Cnt    Score    Error  Units
hdf5Write               10000  thrpt   30  480.687 ± 10.709  ops/s
hdf5Write              100000  thrpt   30  410.524 ± 25.179  ops/s
hdf5Write             1000000  thrpt   30  126.299 ±  1.020  ops/s
npyWriteCompressed      10000  thrpt   30  282.381 ± 10.143  ops/s
npyWriteCompressed     100000  thrpt   30   28.524 ±  0.159  ops/s
npyWriteCompressed    1000000  thrpt   30    2.634 ±  0.031  ops/s
npyWriteUncompressed    10000  thrpt   30  500.945 ± 23.607  ops/s
npyWriteUncompressed   100000  thrpt   30   95.289 ±  6.860  ops/s
npyWriteUncompressed  1000000  thrpt   30   10.768 ±  0.339  ops/s

Analysis: uncompressed writes are (unsurprisingly) faster than compressed ones, but for the largest arrays (size 1000000) HDF5 still wins by more than a factor of 10. Overall this looks OK for our use case, since we're more interested in read performance, which is on par with (or even faster than) HDF5 and doesn't suffer from global locking.
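
The ratios behind this analysis can be recomputed from the table with a short sketch. The throughput values are taken from the benchmark output above; the dict names and the `ratios` helper are only illustrative.

```python
# Write throughput (ops/s) from the compression benchmark above, keyed by array size.
hdf5 = {10000: 480.687, 100000: 410.524, 1000000: 126.299}
npy_uncompressed = {10000: 500.945, 100000: 95.289, 1000000: 10.768}
npy_compressed = {10000: 282.381, 100000: 28.524, 1000000: 2.634}


def ratios(baseline, other):
    """For each size, how many times faster `baseline` is than `other`."""
    return {size: baseline[size] / other[size] for size in baseline}


hdf5_vs_uncompressed = ratios(hdf5, npy_uncompressed)
hdf5_vs_compressed = ratios(hdf5, npy_compressed)

for size in sorted(hdf5):
    print(
        f"size={size}: hdf5 is {hdf5_vs_uncompressed[size]:.1f}x the uncompressed "
        f"and {hdf5_vs_compressed[size]:.1f}x the compressed npy write throughput"
    )
```

A ratio below 1 means npy is faster: at size 10000 the uncompressed npy write slightly outperforms HDF5, and the gap in HDF5's favour only opens up as the arrays grow.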