grimbough / rhdf5

Package providing an interface between HDF5 and R
http://bioconductor.org/packages/rhdf5

h5read returns data of type "raw" in R #120

Closed poglio closed 10 months ago

poglio commented 1 year ago

Hi everyone, I'm using h5read for reading VIIRS/NASA satellite images (VNP10A1F snow products, attached).

With an older version of the rhdf5 package (2.30.1) everything works fine: the function returns an object of class "matrix" (3000x3000) and type "integer", which I can easily convert with raster().

With the current version (2.38.1) the function returns an object of class "matrix" "vector" and type "raw". If I try as.integer(), the values are coerced but the object keeps only the class "vector" (the matrix dimensions are dropped), so the conversion with raster() then fails.

The solution is something like: viirs <- matrix(data=as.integer(viirs), nrow=3000, ncol=3000)

...but it would be nice if h5read worked like the previous package version
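The behaviour described above can be reproduced without the satellite file. The following is a minimal base R sketch, assuming a small made-up raw matrix as a stand-in for the 3000x3000 dataset; the values and the 2x3 shape are illustrative only.

```r
# Toy stand-in for the dataset: a 2x3 raw matrix (values are made up).
sds <- matrix(as.raw(c(0, 10, 50, 100, 200, 255)), nrow = 2, ncol = 3)
print(typeof(sds))
print(dim(sds))

# as.integer() converts the values but strips attributes, including dim,
# so the result is a plain vector rather than a matrix.
flat <- as.integer(sds)
print(is.null(dim(flat)))

# Rebuilding the matrix restores the shape; column-major order is kept,
# so each value ends up in its original position.
sds_fix <- matrix(data = as.integer(sds), nrow = nrow(sds), ncol = ncol(sds))
print(sds_fix)
```

Using nrow(sds) and ncol(sds) instead of hard-coded 3000 keeps the workaround valid for tiles of other sizes.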

Thanks in advance for your attention.

Cheers,
Paolo

###################################################################
# Code for testing:
library(rhdf5)
library(raster)

# read file
sds <- h5read('/path/to/.h5file', '/HDFEOS/GRIDS/NPP_Grid_IMG_2D/Data Fields/VNP10A1_NDSI_Snow_Cover')
class(sds)
typeof(sds)
str(sds)

# this crashes
sds_t <- t(sds)
sds_r <- raster(sds_t, crs = '+proj=sinu +lon_0=0 +x_0=0 +y_0=0 +a=6371007.181 +b=6371007.181 +units=m +no_defs ')

# this works
sds_fix <- matrix(data = as.integer(sds), nrow = 3000, ncol = 3000)
sds_t <- t(sds_fix)
sds_r <- raster(sds_t, crs = '+proj=sinu +lon_0=0 +x_0=0 +y_0=0 +a=6371007.181 +b=6371007.181 +units=m +no_defs ')
plot(sds_r)
###################################################################

VNP10A1F.A2023038.h18v04.001.2023039122556.zip

grimbough commented 1 year ago

I made this change because I think it makes sense that if you write an R raw to HDF5 you want to use an unsigned 8-bit integer datatype. I also want that if you write something using rhdf5 and then read it again, you should get back exactly what you wrote. Hence reading an unsigned 8-bit integer returns a raw. Given this, I'm afraid I don't think I will revert the behaviour to its old state.
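The round-trip guarantee described above could be checked with something like the sketch below. It assumes an rhdf5 version that includes this change (2.38.1 is the one reported in this thread); the temporary file and the dataset name "snow_cover" are made up for illustration, and the block is skipped when rhdf5 is not installed.

```r
# Small raw matrix to write out (values are illustrative).
original <- matrix(as.raw(0:5), nrow = 2)

if (requireNamespace("rhdf5", quietly = TRUE)) {
  library(rhdf5)

  # Throwaway HDF5 file; "snow_cover" is a hypothetical dataset name.
  h5file <- tempfile(fileext = ".h5")
  h5createFile(h5file)
  h5write(original, file = h5file, name = "snow_cover")

  # Read the dataset back and compare it with what was written.
  roundtrip <- h5read(file = h5file, name = "snow_cover")
  print(typeof(roundtrip))
  print(identical(original, roundtrip))

  unlink(h5file)
}
```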

One thing you can do, rather than creating a new matrix, is to simply change the storage.mode to "integer". That should retain the dimensions etc automatically. e.g.

library(rhdf5)
library(raster)
#> Loading required package: sp

sds <- h5read(file = "/tmp/VNP10A1F.A2023038.h18v04.001.2023039122556.h5", 
              name = "/HDFEOS/GRIDS/NPP_Grid_IMG_2D/Data Fields/VNP10A1_NDSI_Snow_Cover")

# switch the storage mode in place; the dim attribute is retained
storage.mode(sds) <- "integer"

sds_t <- t(sds)
sds_r <- raster(sds_t, 
                crs = '+proj=sinu +lon_0=0 +x_0=0 +y_0=0 +a=6371007.181 +b=6371007.181 +units=m +no_defs ')
plot(sds_r)
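For code that has to run against both older and newer rhdf5 versions, the storage.mode approach can be wrapped in a small helper. This is only a sketch: raw_to_integer is a hypothetical name, not part of rhdf5, and the matrix below is made-up test data.

```r
# Hypothetical helper (not an rhdf5 function): convert raw data to integer
# while keeping attributes; anything that is not raw is returned unchanged.
raw_to_integer <- function(x) {
  if (is.raw(x)) {
    storage.mode(x) <- "integer"
  }
  x
}

# Made-up raw matrix with dimnames to show that attributes survive.
sds <- matrix(as.raw(c(0, 10, 50, 100, 200, 255)), nrow = 2,
              dimnames = list(c("r1", "r2"), c("c1", "c2", "c3")))
sds_int <- raw_to_integer(sds)

print(typeof(sds_int))
print(dim(sds_int))
print(sds_int)
```

Unlike rebuilding the object with matrix(), this keeps dimnames as well as dim, and an integer matrix returned by an older rhdf5 version passes through untouched.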
