other compression libs

niekverw commented 5 years ago

Not really an issue, but I was wondering whether it is possible to use other compression libraries, such as snappy or bzip2?

http://danielhnyk.cz/comparison-of-compression-libs-on-hdf-in-pandas/

grimbough commented 4 years ago

I've been doing some work on this and have made some additional compression filters available in the rhdf5filters package.

grimbough commented 4 years ago

As of rhdf5 version 2.33.1, you can pass the filter argument to h5createDataset() to specify that one of the optional plugins found in rhdf5filters should be used when writing the data chunks, e.g.:

library(rhdf5)

h5createFile("ex_createDataset.h5")
#> [1] TRUE

h5createDataset("ex_createDataset.h5", dataset = "A", 
                dims = c(5,8), chunk = c(5,1), 
                filter = "BZIP2", level = 6)
#> [1] TRUE

h5write(matrix(1:40, nrow = 5, ncol = 8), 
        file = "ex_createDataset.h5", name = "A")

Reading datasets compressed with any of the supported filters should be transparent if rhdf5filters is installed:

h5dump("ex_createDataset.h5")
#> $A
#>      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
#> [1,]    1    6   11   16   21   26   31   36
#> [2,]    2    7   12   17   22   27   32   37
#> [3,]    3    8   13   18   23   28   33   38
#> [4,]    4    9   14   19   24   29   34   39
#> [5,]    5   10   15   20   25   30   35   40
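
To check the round trip programmatically instead of inspecting the h5dump() output, something like the following should work. This is only a minimal sketch: it assumes rhdf5 (2.33.1 or later) and rhdf5filters are both installed, and it writes to a temporary file (the h5file, original and restored names are just for illustration) so it does not depend on the file created above.

```r
library(rhdf5)

# Write a small BZIP2-compressed dataset to a temporary file
# (same calls as in the example above; only the file path differs).
h5file <- tempfile(fileext = ".h5")
h5createFile(h5file)
h5createDataset(h5file, dataset = "A",
                dims = c(5, 8), chunk = c(5, 1),
                filter = "BZIP2", level = 6)
original <- matrix(1:40, nrow = 5, ncol = 8)
h5write(original, file = h5file, name = "A")

# Read the dataset back; no filter-related arguments are passed here.
restored <- h5read(h5file, name = "A")

# Compare values rather than using identical(), because the storage
# type on disk may differ from the integer type of the input matrix.
stopifnot(isTRUE(all.equal(as.numeric(restored), as.numeric(original))))
```

If the BZIP2 plugin cannot be found, the read would be expected to fail rather than return garbage, so a clean run of stopifnot() is a reasonable sign that the filter was applied and reversed correctly.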

grimbough / rhdf5

other compression libs #34