rspatial / terra

R package for spatial data handling https://rspatial.github.io/terra/reference/terra-package.html
GNU General Public License v3.0

merge function results in unnecessarily huge output #1366

Open HassanMasoomi opened 11 months ago

HassanMasoomi commented 11 months ago

Hi,

I have several tiles that I need to merge, but terra's merge function produces a ridiculously huge output file, while raster's merge function behaves well and its output size makes sense. In my real case, the output should be around 7GB, but terra's merge function produces a file of ~120GB!

I managed to put together a reproducible example here. In this example, I have 80 tiles of ~6.5MB each (~520MB in total). Using terra, the merged file is ~4GB, while raster produces a merged file of ~850MB (which is what I expected).

Interestingly, if I just rewrite the result from terra, it comes out at ~850MB, which would totally make sense.
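For anyone who wants to repeat the rewrite-and-compare check on their own files, here is a minimal sketch. The helper names `size_mib` and `rewrite_and_compare` are hypothetical (they are not part of terra), and the sketch assumes terra is installed and that the input path points to a raster written by merge.

```r
# Size of a file in MiB, using base R only.
size_mib <- function(path) {
  file.info(path)$size / 1024^2
}

# Hypothetical helper: rewrite a raster file with writeRaster() and
# return the size of the original and of the rewritten copy (in MiB).
# Assumes the terra package is installed.
rewrite_and_compare <- function(infile, outfile) {
  r <- terra::rast(infile)
  terra::writeRaster(r, filename = outfile, overwrite = TRUE)
  c(original = size_mib(infile), rewritten = size_mib(outfile))
}

# Example call, assuming "all_terra.tif" exists in the working directory:
# rewrite_and_compare("all_terra.tif", "all_terra_rewrite.tif")
```

A large gap between the two numbers would point at how the file was written rather than at the data itself, since both files hold the same cell values.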

library(terra)
# terra 1.7.55

# create 80 tiles of 1x1 degree, sampled from the 121 tiles covering the region
set.seed(1)  # fix the random sample so the same tiles are picked on every run
n <- 80
allComb <- expand.grid(lon = seq(-100, -90, by = 1), lat = seq(40, 50, by = 1))
allComb <- allComb[sample.int(nrow(allComb), n),]
for (i in 1:n) {
  writeRaster(rast(xmin=allComb[i, "lon"], xmax=allComb[i, "lon"]+1, 
                   ymin=allComb[i, "lat"], ymax=allComb[i, "lat"]+1, res=0.0001, vals=i), 
              paste0("part_", i, ".tif"), overwrite = TRUE)
}

# merge tiles using terra
allParts <- list.files(pattern = "^part_.+\\.tif", full.names = TRUE)
all_terra <- merge(sprc(allParts), first = TRUE, na.rm = TRUE, 
                   filename = "all_terra.tif", overwrite = TRUE)

# merge tiles using raster
allParts <- list.files(pattern = "^part_.+\\.tif", full.names = TRUE)
allParts <- lapply(allParts, raster::raster)
allParts$filename <- "all_raster.tif"
all_raster <- do.call(raster::merge, allParts)

# rewrite the terra output
writeRaster(all_terra, filename = "all_terra_rewrite.tif", overwrite = TRUE)

rhijmans commented 11 months ago

I see:

file.info("all_terra.tif")$size / 1024^2
#[1] 901.186
file.info("all_terra_rewrite.tif")$size / 1024^2
#[1] 261.0031

describe("all_terra.tif")[c(37, 45)]
#[1] "  COMPRESSION=LZW"                                 
#[2] "Band 1 Block=1100x1 Type=Float32, ColorInterp=Gray"
describe("all_terra_rewrite.tif")[c(37, 45)]
#[1] "  COMPRESSION=LZW"                                 
#[2] "Band 1 Block=1100x1 Type=Float32, ColorInterp=Gray"

Both files report LZW compression with the same block size and data type, so it seems that the compression is simply much less effective when the file is written the way terra's merge implements it.
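The describe() calls above pick out the compression and block lines by position (37 and 45), which may differ for other files or GDAL versions. A minimal sketch that selects the same lines by pattern instead; the helper name `compression_summary` is hypothetical, and the sample vector simply reuses the lines quoted above in place of real describe() output.

```r
# Hypothetical helper: keep only the compression and block/type lines
# from gdalinfo-style text, such as the output of terra::describe().
compression_summary <- function(info) {
  trimws(grep("COMPRESSION=|Block=", info, value = TRUE))
}

# Sample lines copied from the output quoted in this thread.
info <- c(
  "  COMPRESSION=LZW",
  "Band 1 Block=1100x1 Type=Float32, ColorInterp=Gray"
)
compression_summary(info)

# With terra installed, the same helper can be applied to a file:
# compression_summary(terra::describe("all_terra.tif"))
```

Matching on the text keeps the comparison working even if the metadata lines move around.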