Closed by niekverw 4 years ago
I've been doing some work on this and have made some additional compression filters available in the rhdf5filters package.
As of rhdf5 version 2.33.1, you can pass the filter argument to h5createDataset() to specify that one of the optional plugins found in rhdf5filters should be used when writing the data chunks, e.g.:
library(rhdf5)
h5createFile("ex_createDataset.h5")
#> [1] TRUE
h5createDataset("ex_createDataset.h5", dataset = "A",
                dims = c(5, 8), chunk = c(5, 1),
                filter = "BZIP2", level = 6)
#> [1] TRUE
h5write(matrix(1:40, nrow = 5, ncol = 8),
        file = "ex_createDataset.h5", name = "A")
Reading datasets compressed with any of the supported filters should be transparent if rhdf5filters is installed:
h5dump("ex_createDataset.h5")
#> $A
#>      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
#> [1,]    1    6   11   16   21   26   31   36
#> [2,]    2    7   12   17   22   27   32   37
#> [3,]    3    8   13   18   23   28   33   38
#> [4,]    4    9   14   19   24   29   34   39
#> [5,]    5   10   15   20   25   30   35   40
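If you want to convince yourself that the round trip is lossless, something along these lines should work. This is only a sketch: it assumes rhdf5 (>= 2.33.1) and rhdf5filters are both installed, and the temporary file and the variable names x and y are made up for the illustration.

```r
library(rhdf5)

# Hypothetical temporary file, used only for this sketch
h5file <- tempfile(fileext = ".h5")
h5createFile(h5file)

# Same settings as in the example above: BZIP2 filter, compression level 6
h5createDataset(h5file, dataset = "A",
                dims = c(5, 8), chunk = c(5, 1),
                filter = "BZIP2", level = 6)

x <- matrix(1:40, nrow = 5, ncol = 8)
h5write(x, file = h5file, name = "A")

# Read the compressed dataset back and compare it with the original values
y <- h5read(h5file, "A")
stopifnot(identical(dim(y), dim(x)), all(y == x))

# Directory containing the compiled filter plugins shipped with rhdf5filters
rhdf5filters::hdf5_plugin_path()
```

The path returned by the last call can be handy if other HDF5 tools on the same machine need to find the same plugins, although how they pick it up depends on the tool.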
Not really an issue, but I was wondering whether it is possible to use other compression libraries such as snappy, bzip2, etc.?
http://danielhnyk.cz/comparison-of-compression-libs-on-hdf-in-pandas/