r-lidar / lidR

Airborne LiDAR data manipulation and visualisation for forestry applications
https://CRAN.R-project.org/package=lidR
GNU General Public License v3.0

rasterize_terrain(res = SpatRaster) fails in parallel with future package #690

Closed: mcoghill closed this issue 1 year ago

mcoghill commented 1 year ago

There are a lot of moving parts here, and I'm not entirely sure whether the lidR package is to blame for the bug I'm seeing, so please let me know if I should post this issue elsewhere. I am trying to create a DEM that matches the resolution and extent of another SpatRaster object from the terra package. I am doing this over several tiles and would like to parallelize the process using the future package. The following example reproduces the problem using the example data shipped with lidR:

library(lidR)
library(future)
library(terra)

LASfile <- system.file("extdata", "Megaplot.laz", package="lidR")
ctg <- readLAScatalog(LASfile, chunk_size = 200, chunk_buffer = 0)

opt_chunk_alignment(ctg) <- c(275, 90)
opt_output_files(ctg) <- paste0(tempdir(), "/retile_{XLEFT}_{YBOTTOM}")

newctg <- catalog_retile(ctg)
newctg <- readLAScatalog(list.files(tempdir(), pattern = "^retile_", full.names = TRUE))

# Create dummy grid to create DEM on
dummy_grid_terra <- rast(extent = ext(newctg), res = 5, crs = st_crs(newctg)$wkt)

# Create parallel instance using future
plan(multisession, workers = 2)

# Create DEM
dem <- rasterize_terrain(newctg, res = dummy_grid_terra)

# Error: NULL value passed as symbol address

# The above process works using the raster package though:
dummy_grid_raster <- raster::raster(dummy_grid_terra)
plan(multisession, workers = 2)
dem <- rasterize_terrain(newctg, res = dummy_grid_raster)

# It also works when done sequentially
plan(sequential)
dem <- rasterize_terrain(newctg, res = dummy_grid_terra)

At first I suspected that the error was caused by the dummy_grid_terra object being held in memory, but the error persists even when I initialize it with NaN values and write it to disk, so that wasn't it. It's strange to me because it works fine with the older RasterLayer object. Is this the intended behaviour? Is there a way to do this without using the raster package?

Thanks for the help!

Jean-Romain commented 1 year ago

SpatRaster objects are not serializable, which means that you can't send dummy_grid_terra to the two workers. I implemented workarounds at key locations in the lidR code, but it seems I did not handle this case. It works with dummy_grid_raster because RasterLayer objects are serializable.
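For readers who need to pass a SpatRaster to future workers in their own code, the following is a minimal sketch of the general pattern using terra's wrap() and unwrap(), which convert a SpatRaster to and from a serializable PackedSpatRaster. It illustrates the serialization issue described above and is not necessarily how lidR handles it internally. The toy raster and the variable names (r, packed, total) are assumptions made for illustration.

```r
library(terra)
library(future)

# Two background R sessions, as in the report above
plan(multisession, workers = 2)

# Toy raster (assumed for illustration): 10 x 10 cells at 5-unit resolution
r <- rast(nrows = 10, ncols = 10, xmin = 0, xmax = 50, ymin = 0, ymax = 50)
values(r) <- 1:ncell(r)

# A SpatRaster wraps a pointer to a C++ object; wrap() converts it into a
# PackedSpatRaster, which can be serialized and sent to a worker
packed <- wrap(r)

f <- future({
  # Rebuild a usable SpatRaster inside the worker session
  r_worker <- terra::unwrap(packed)
  # Sum of all cell values, returned as a plain number
  terra::global(r_worker, "sum", na.rm = TRUE)[1, 1]
})

total <- value(f)
print(total)

# Return to sequential processing
plan(sequential)
```

Only the packed object crosses the process boundary here; the worker returns a plain numeric value, so nothing that holds a pointer has to be sent back.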

Jean-Romain commented 1 year ago

Fixed