hunzikp / velox

https://hunzikp.github.io/velox/

Velox does not preserve data type when converting to RasterStack #2

mem48 closed 7 years ago

mem48 commented 7 years ago

When converting a velox object to a raster package object, the data type is not preserved. For example, integer values become floats. This increases the file size of the resulting raster.

Simple example:

mat <- matrix(1:9, 3, 3)
class(mat[1,1])
# "integer"
vx <- velox(mat, extent=c(0,3,0,3), res=c(1,1), crs="+proj=longlat +datum=WGS84 +no_defs")
rs <- vx$as.RasterStack()
dataType(rs)
# "FLT4S"

hunzikp commented 7 years ago

You're confusing two concepts: the storage mode of the data and the datatype attribute of the raster object. velox typically preserves the storage mode of the data, even when you convert to a matrix or a raster object. To continue your example:

mat <- matrix(1:9, 3, 3)
class(mat[1,1])
# "integer"
vx <- velox(mat, extent=c(0,3,0,3), res=c(1,1), crs="+proj=longlat +datum=WGS84 +no_defs")
class(vx$as.matrix()[1,1])
# "integer"
class(as.matrix(vx$as.RasterLayer())[1,1])
# "integer"

The raster package uses float32 ("FLT4S") as the default data type (unless the default was changed using raster::rasterOptions), regardless of the storage mode of the actual data. For example:

mat <- matrix(rep(1L, 9), 3, 3)
storage.mode(mat)
# "integer"
ras <- raster(mat)
dataType(ras)
# "FLT4S"

Consequently, the fact that the raster object you get from vx$as.RasterStack() has data type float32 is due to the raster package, rather than velox. Nonetheless, I've implemented an extra argument that allows assigning appropriate data types to raster objects created from velox rasters:

mat <- matrix(1:9, 3, 3)
vx <- velox(mat, extent=c(0,3,0,3), res=c(1,1), crs="+proj=longlat +datum=WGS84 +no_defs")
ras <- vx$as.RasterLayer(assign_data_type = TRUE)  # New argument
dataType(ras)
# "INT1U"
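
To illustrate what "assigning an appropriate data type" could mean, here is a minimal base R sketch that picks the smallest raster data type code able to hold a set of values. The function name smallest_data_type is hypothetical, this is not how velox implements assign_data_type, and the value ranges are taken from the raster dataType documentation as I understand it.

```r
# Hypothetical helper (not part of velox or raster): return the smallest
# raster data type code that can hold every non-missing value in x.
smallest_data_type <- function(x) {
  x <- x[!is.na(x)]
  if (length(x) == 0) return("FLT4S")          # nothing to inspect: keep the raster default
  if (any(x != round(x))) return("FLT4S")      # fractional values need a float type
  lo <- min(x)
  hi <- max(x)
  if (lo >= 0 && hi <= 255) return("INT1U")                  # 1 byte, unsigned
  if (lo >= -32767 && hi <= 32767) return("INT2S")           # 2 bytes, signed
  if (lo >= 0 && hi <= 65534) return("INT2U")                # 2 bytes, unsigned
  if (lo >= -2147483647 && hi <= 2147483647) return("INT4S") # 4 bytes, signed
  "FLT8S"                                      # whole numbers too large for INT4S
}

smallest_data_type(matrix(1:9, 3, 3))  # "INT1U"
smallest_data_type(c(-5L, 300L))       # "INT2S"
smallest_data_type(c(0.5, 2))          # "FLT4S"
```

For the 1:9 matrix used throughout this thread the sketch returns "INT1U", which matches the data type reported above.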

Note, however, that the raster::writeRaster function doesn't automatically respect the datatype attribute. You still need to pass it explicitly:

writeRaster(x = ras, filename = "/foo/bar.tif", datatype=dataType(ras))

Alternatively, you could just use vx$write(), which always uses the smallest possible data type when writing to disk.
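
To see why the datatype argument matters for file size, the following sketch writes the same integer raster twice, once as "FLT4S" and once as "INT1U", and compares the size of the binary data files. It assumes the raster package is installed and uses the package's native .grd/.gri format so that no GDAL bindings are needed. The matrix size and file names are made up for the example.

```r
# Sketch: on-disk size of one integer raster written with two data types.
# If the raster package is not installed, the sizes stay NA.
sizes <- c(FLT4S = NA_real_, INT1U = NA_real_)

if (requireNamespace("raster", quietly = TRUE)) {
  # 100 x 100 integer matrix with values in 0..200, which fit in INT1U
  mat <- matrix(rep_len(0:200, 100 * 100), nrow = 100, ncol = 100)
  ras <- raster::raster(mat)

  for (dt in names(sizes)) {
    # Native raster format: a .grd text header plus a .gri binary data file
    grd <- file.path(tempdir(), paste0("demo_", dt, ".grd"))
    raster::writeRaster(ras, filename = grd, datatype = dt, overwrite = TRUE)
    sizes[dt] <- file.size(sub("\\.grd$", ".gri", grd))
  }
}

print(sizes)
```

The "INT1U" data file should come out smaller than the "FLT4S" one, since each cell takes one byte instead of four.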