r-lidar / lidR

Airborne LiDAR data manipulation and visualisation for forestry application
https://CRAN.R-project.org/package=lidR
GNU General Public License v3.0
588 stars 132 forks source link

grid_metrics fails if opt_output_files is set #320

Closed jgrn307 closed 4 years ago

jgrn307 commented 4 years ago

Input files to this

So perhaps I'm missing something, but when running this function:

library(lidR)

c2c_norm_dir <- "[inputfiles_from_google_drive]"
c2c_norm_ctg <- readLAScatalog(c2c_norm_dir,progress=T)
lidR:::catalog_laxindex(c2c_norm_ctg)

opt_chunk_size(c2c_norm_ctg) <- 500
opt_output_files(c2c_norm_ctg) <- 
  "/data/gpfs/assoc/gears/lidarchange/analysis/c2c_change/ferguson/grid_metrics_error_output"

lossmetricsf <- function(Z,Classification,c2c_dist)
{
  # browser()
  c2c_threshold=0.5

  ground_loss_total=as.numeric(sum((c2c_dist>=c2c_threshold) & (Classification==2L)))
  surface_loss_total=as.numeric(sum((c2c_dist>=c2c_threshold) & (Classification!=2L) & (Z<=2)))
  canopy_loss_total=as.numeric(sum((c2c_dist>=c2c_threshold) & (Classification!=2L) & (Z>2)))
  ground_loss_fraction=as.numeric(ground_loss_total/sum(Classification==2L))
  surface_loss_fraction=as.numeric(surface_loss_total/sum((Classification!=2L) & (Z<=2)))
  canopy_loss_fraction=as.numeric(canopy_loss_total/sum((Classification!=2L) & (Z>2)))

  #  print(c(ground_loss_total))
  metrics = list(
    ground_loss_fraction=ground_loss_total,
    surface_loss_total=surface_loss_total,
    canopy_loss_total=canopy_loss_total#,
    #    ground_loss_fraction=ground_loss_fraction,
    #    surface_loss_fraction=surface_loss_fraction,
    #    canopy_loss_fraction=canopy_loss_fraction
  )
  return(metrics)
}

lossmetrics=grid_metrics(c2c_norm_ctg,
                         ~lossmetricsf(Z,Classification,c2c_dist),
                         res=10)

I get a total R failure (it crashes R). If you DISABLE opt_output_files, this works fine. I'd like to control where the grid_metric output goes -- 1) is this not the right way, and 2) why is this crashing?

Jean-Romain commented 4 years ago

As expected we have an error because you mis-used the opt_output_files option. But I was not able to reproduce a total crashes.

This is the error I get

Processing [===================>---------------------------]  50% (2/4) eta:  2s
An error occurred when processing the chunk 2. Try to load this chunk with:
 chunk <- readRDS("/tmp/RtmpAlftwa/chunk2.rds")
 las <- readLAS(chunk)
filename exists; use overwrite=TRUE
ERROR 4: ~/Téléchargements/issue320/grid_metrics_error_output.tif: No such file or directory
Warning 1: Can't open ~/Téléchargements/issue320/grid_metrics_error_output.tif. Skipping it
Warning message:
In system(cmd, intern = TRUE) :
  l'exécution de la commande '"/usr/bin/gdalbuildvrt" "/home/jr/Téléchargements/issue320/grid_metrics.vrt" "~/Téléchargements/issue320/grid_metrics_error_output.tif"' renvoie un statut 1

The error filename exists; use overwrite=TRUE is straighfoward. You tried to name all your raster output file with the same name grid_metrics_error_output.tiff. You have a collision at the second file. You must use a templated filename such as

opt_output_files(c2c_norm_ctg) <-  "..../ferguson/grid_metrics_error_{ID}"

Actually I assumed you used the option properly but you made a mistake when filling the bug report. So I tried with a valid template opt_output_files(c2c_norm_ctg) <- "~/Téléchargements/issue320/test/file{ID}". Again I was not able to reproduce a crash but I got a failure.

Processing [===================================================================================================================================] 100% (4/4) eta:  0s
ERROR 4: ~/Téléchargements/issue320/grid_metrics_error_output/file1.tif: No such file or directory
Warning 1: Can't open ~/Téléchargements/issue320/grid_metrics_error_output/file1.tif. Skipping it
ERROR 4: ~/Téléchargements/issue320/grid_metrics_error_output/file2.tif: No such file or directory
Warning 1: Can't open ~/Téléchargements/issue320/grid_metrics_error_output/file2.tif. Skipping it
ERROR 4: ~/Téléchargements/issue320/grid_metrics_error_output/file3.tif: No such file or directory
Warning 1: Can't open ~/Téléchargements/issue320/grid_metrics_error_output/file3.tif. Skipping it
ERROR 4: ~/Téléchargements/issue320/grid_metrics_error_output/file4.tif: No such file or directory
Warning 1: Can't open ~/Téléchargements/issue320/grid_metrics_error_output/file4.tif. Skipping it
Warning message:
In system(cmd, intern = TRUE) :
  l'exécution de la commande '"/usr/bin/gdalbuildvrt" "/home/jr/Téléchargements/issue320/grid_metrics_error_output/grid_metrics.vrt" "~/Téléchargements/issue320/grid_metrics_error_output/file1.tif" "~/Téléchargements/issue320/grid_metrics_error_output/file2.tif" "~/Téléchargements/issue320/grid_metrics_error_output/file3.tif" "~/Téléchargements/issue320/grid_metrics_error_output/file4.tif"' renvoie un statut 1

This one is clear to me ~/ has not been understood and this need to be enhanced.

Then I used the full path opt_output_files(c2c_norm_ctg) <- "/home/jr/Téléchargements/issue320/test/file{ID}" and it worked fine.

Rplot

So far, using your example I only detected an error in your code + an small enhancement I could make in the package. I have not been able to reproduce a crash. I keep this open if you can give me more insight. Please mention you R version, your lidR version, your rlas version, your operating system and everything else that could help me to reproduce.

jgrn307 commented 4 years ago

So its R 3.6.2 running within an Ubuntu singularity on an HPC. I did tweak the {ID} thing:

opt_output_files(c2c_norm_ctg) <- "/data/gpfs/assoc/gears/lidarchange/analysis/c2c_change/walker/fuel_loss_tiles/walker_fuel_loss_10m_{ID}"

but here's the crash:

> library(future)
> plan(multisession, workers=64L)
>
> lossmetrics=grid_metrics(c2c_norm_ctg,
+                          ~lossmetricsf(Z,Classification,c2c_dist),
+                          res=5)
  |                                                                      |   0%free(): invalid pointer
Error in unserialize(node$con) :
  Failed to retrieve the value of MultisessionFuture (<none>) from cluster SOCKnode #1 (PID 75968 on localhost ‘localhost’). The reason reported was ‘error reading from connection’. Post-mortem diagnostic: No process exists with this PID, i.e. the localhost worker is no longer alive.
> munmap_chunk(): invalid pointer
double free or corruption (out)

Without multisession I get:

> plan(sequential)
> lossmetrics=grid_metrics(c2c_norm_ctg,
+                          ~lossmetricsf(Z,Classification,c2c_dist),
+                          res=5)
  |                                                                      |   0%WARNING: files have different attribute 1
WARNING: files have different attribute 1
double free or corruption (out)

Aborted

(and the full crash)

At this point, the ONLY way I've gotten this to work is to NOT set the opt_output_files but that tries to load the raster into memory, but if I:

rasterOptions(todisk=T)

I can grab the r_tmp.grd/gri after the process is done and save them to where I need to go.

Jean-Romain commented 4 years ago

I also have the WARNING. I know where does it come from. You added two attributes in your files with lasaddextrabytes. The attributes are not strictly identical in all files but in way that is actually fair. LASlib is a little bit zealous on that points. Anyway I fixed this in the version 2.2.3. So the warning it is not important and it does not crash R (on my computer). However double free or corruption (out) actually corresponds to the crash. We need to track where does it come from.

Try to trigger the warning with the following and tell me if it crashes

las1 = readLAS(c2c_norm_ctg$filename[1:2])

But your trouble seem to be actually related to the written files. So I assume the above code won't fail.

I think you can reproduce the bug with

c2c_norm_ctg <- readLAScatalog(c2c_norm_dir,progress=T)

opt_chunk_size(c2c_norm_ctg) <- 500
opt_output_files(c2c_norm_ctg) <- "/home/jr/Téléchargements/issue320/test/file{ID}"

cls = lidR:::catalog_makecluster(c2c_norm_ctg)
las = readLAS(cls[[1]])
lossmetrics=grid_metrics(las, ~lossmetricsf(Z,Classification,c2c_dist), res=10)
u = lidR:::writeANY(lossmetrics, tempfile(), c2c_norm_ctg@output_options$drivers)
u

This is more or less what happen under the hood.

jgrn307 commented 4 years ago

Yep, that last step is what kills it:

u = lidR:::writeANY(lossmetrics, tempfile(), c2c_norm_ctg@output_options$drivers)

 *** caught segfault ***
address 0x98, cause 'memory not mapped'

Traceback:
 1: .gd_SetProject(object, ...)
 2: rgdal::GDALcall(transient, "SetProject", as.character(projection(r)))
 3: .getGDALtransient(x, filename = filename, options = options,     ...)
 4: .startGDALwriting(y, filename, options, setStatistics = setStatistics,     ...)
 5: .writeGDALall(x, filename = filename, format = filetype, ...)
 6: .local(x, filename, ...)
  7: (new("standardGeneric", .Data = function (x, filename, ...) standardGeneric("writeRaster"), generic = "writeRaster", package = "raster",     group = list(), valueClass = character
(0), signature = c("x",     "filename"), default = NULL, skeleton = (function (x, filename,         ...)     stop("invalid call in method dispatch to 'writeRaster' (no default method)
",         domain = NA))(x, filename, ...)))(format = "GTiff", NAflag = -999999,     x = new("RasterBrick", file = new(".RasterFile", name = "",         datanotation = "FLT4S", byteor
der = "little", nodatavalue = -Inf,         NAchanged = FALSE, nbands = 1L, bandorder = "BIL", offset = 0L,         toptobottom = TRUE, blockrows = 0L, blockcols = 0L, driver = "",
      open = FALSE), data = new(".MultipleRasterData", values = c(92,     147, 83, 145, 131, 189, 104, 148, 100, 121, 121, 154, 93,     164, 170, 172, 45, 216, 91, 122, 136, 130, 195,
 154, 137,     175, 117, 114, 140, 73, 182, 150, 155, 62, 140, 146, 103,     135, 117, 131, 137, 51, 95, 160, 96, 116, 111, 121, 88, 93,     128, 84, 168, 72, 117, 120, 107, 92, 76, 9
5, 106, 194, 113,     176, 74, 130, 116, 142, 127, 118, 199, 186, 170, 263, 131,     235, 106, 120, 148, 114, 111, 102, 110, 143, 156, 79, 171,     148, 135, 178, 178, 134, 134, 76, 1
46, 132, 109, 164, 99,     97, 88, 112, 100, 95, 127, 136, 87, 155, 86, 122, 104, 140,     144, 198, 115, 39, 125, 94, 149, 54, 49, 154, 220, 207, 236,     191, 137, 186, 229, 132, 13

Lots more numbers...

1, 1, 1,     1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,     1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, NaN, NA, NA, NA, NA, NA,     NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 1,
1, 1, 1, 1,     1, 1, 1, 1, 1, 1, NaN, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,     1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, NA, NA, NA, NA, NA, NA,     NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA), offset = 0,         gain = 1, inmemory = TRUE, fromdisk = FALSE, nlayers = 6L,         dropped = NULL, isfactor = c(ground_loss_fraction = FALSE,         surface_loss_total = FAL
SE, canopy_loss_total = FALSE,         ground_loss_fraction = FALSE, surface_loss_fraction = FALSE,         canopy_loss_fraction = FALSE), attributes = list(), haveminmax = TRUE,
    min = c(0, 0, 0, 0, 0, 0), max = c(433, 328, 1199, 1,         1, 1), unit = "", names = c("ground_loss_fraction", "surface_loss_total",         "canopy_loss_total", "ground_loss_f
raction", "surface_loss_fraction",         "canopy_loss_fraction")), legend = new(".RasterLegend",         type = character(0), values = logical(0), color = logical(0),         names
= logical(0), colortable = logical(0)), title = character(0),         extent = new("Extent", xmin = -54000, xmax = -53470,             ymin = 221000, ymax = 221530), rotated = FALSE,
rotation = new(".Rotation",             geotrans = numeric(0), transfun = function ()             NULL), ncols = 53L, nrows = 53L, crs = new("CRS",             projargs = "+proj=aea +
lat_1=34 +lat_2=40.5 +lat_0=0 +lon_0=-120 +x_0=0 +y_0=-4000000 +datum=NAD83 +units=m +no_defs +ellps=GRS80 +towgs84=0,0,0"),         history = list(), z = list()), filename = "/data/g
pfs/assoc/gears/scratch/jgreenberg/Rtmpb2HVUl/file23a724ada73d4.tif")
 8: (new("standardGeneric", .Data = function (x, filename, ...) standardGeneric("writeRaster"), generic = "writeRaster", package = "raster",     group = list(), valueClass = character
(0), signature = c("x",     "filename"), default = NULL, skeleton = (function (x, filename,         ...)     stop("invalid call in method dispatch to 'writeRaster' (no default method)
",         domain = NA))(x, filename, ...)))(format = "GTiff", NAflag = -999999,     x = new("RasterBrick", file = new(".RasterFile", name = "",         datanotation = "FLT4S", byteor
der = "little", nodatavalue = -Inf,         NAchanged = FALSE, nbands = 1L, bandorder = "BIL", offset = 0L,         toptobottom = TRUE, blockrows = 0L, blockcols = 0L, driver = "",
      open = FALSE), data = new(".MultipleRasterData", values = c(92,     147, 83, 145, 131, 189, 104, 148, 100, 121, 121, 154, 93,     164, 170, 172, 45, 216, 91, 122, 136, 130, 195,
 154, 137,     175, 117, 114, 140, 73, 182, 150, 155, 62, 140, 146, 103,     135, 117, 131, 137, 51, 95, 160, 96, 116, 111, 121, 88, 93,     128, 84, 168, 72, 117, 120, 107, 92, 76, 9

Still more numbers...

0, 0, 0, 0.56140350877193,     0.909090909090909, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,     1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, NaN, 1, 1, 1, 1,     NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 0, 0, 0, 0, 0,     0, 0, 0.391304347826087, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,     1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,     1, 1, NaN, 1, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,     0.412698412698413, NaN, NaN, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1,     1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,     1, 1, 1, 1, 1, 1, 1, 1, 1, NaN, NA, NA, NA, NA, NA, NA, NA,     NA, NA, NA, NA, 1, 1, 1, 0.142857142857143, 0.0289855072463768,     0, 0.827586206896552, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,     1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,     1, 1, 1, NaN, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,     1, 1, 1, 1, 0.981481481481482, 0.928571428571429, 1, 1, 1,     1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,     1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, NaN, NA, NA, NA, NA,     NA, NA, NA, NA, NA, NA, NA, NA, 1, 1, 1, 1, 1, 1, 1, 1, 1,     1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,     1, 1, 1, 1, 1, 1, 1, NaN, 1, 1, 1, 1, NA, NA, NA, NA, NA,     NA, NA, NA, NA, NA, NA, NA, NA, 1, 1, 1, 1, 1, 1, 1, 1, 1,     1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,     1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, NA, NA, NA, NA, NA, NA,     NA, NA, NA, NA, NA, NA, NA, NA, 1, 1, 1, 1, 1, 1, 1, 1, 1,     1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,     1, 1, 1, 1, NaN, 1, 1, 1, 1, 1, NA, NA, NA, NA, NA, NA, NA,     NA, NA, NA, NA, NA, NA, NA, NA, 1, 1, 1, 1, 1, 1, 1, 1, 1,     1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,     1, 1, 1, 1, 1, 1, 1, 1, 1, 1, NA, NA, NA, NA, NA, NA, NA,     NA, NA, NA, NA, NA, NA, NA, NA, 1, 1, 1, 1, 1, 1, 1, 1, 1,     1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,     1, 1, 1, NaN, 1, 1, 1, 1, NaN, NA, NA, NA, NA, NA, NA, NA,     NA, NA, NA, NA, NA, NA, NA, NA, NA, 1, 1, 1, 1, 1, 1, 1,     1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,     1, 1, 1, 1, 1, 1, 1, 1, 1, NaN, 1, NA, NA, NA, NA, NA, NA,     NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 1, 1, 1, 1, 1, 1,     1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,     1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, NaN, NA, NA, NA, NA, NA,     NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 1, 1, 1, 1, 1,     1, 1, 1, 1, 1, 1, NaN, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,     1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, NA, NA, NA, NA, NA, NA,     NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), offset = 0,         gain = 1, inmemory = TRUE, fromdisk = FALSE, nlayers = 6L,         dropped = NULL, isfactor = c(ground_loss_fraction = FALSE,         surface_loss_total = FALSE, canopy_loss_total = FALSE,         ground_loss_fraction = FALSE, surface_loss_fraction = FALSE,         canopy_loss_fraction = FALSE), attributes = list(), haveminmax = TRUE,         min = c(0, 0, 0, 0, 0, 0), max = c(433, 328, 1199, 1,         1, 1), unit = "", names = c("ground_loss_fraction", "surface_loss_total",         "canopy_loss_total", "ground_loss_fraction", "surface_loss_fraction",         "canopy_loss_fraction")), legend = new(".RasterLegend",         type = character(0), values = logical(0), color = logical(0),         names = logical(0), colortable = logical(0)), title = character(0),         extent = new("Extent", xmin = -54000, xmax = -53470,             ymin = 221000, ymax = 221530), rotated = FALSE, rotation = new(".Rotation",             geotrans = numeric(0), transfun = function ()             NULL), ncols = 53L, nrows = 53L, crs = new("CRS",             projargs = "+proj=aea +lat_1=34 +lat_2=40.5 +lat_0=0 +lon_0=-120 +x_0=0 +y_0=-4000000 +datum=NAD83 +units=m +no_defs +ellps=GRS80 +towgs84=0,0,0"),         history = list(), z = list()), filename = "/data/gpfs/assoc/gears/scratch/jgreenberg/Rtmpb2HVUl/file23a724ada73d4.tif")
 9: do.call(driver$write, driver$param)
10: lidR:::writeANY(lossmetrics, tempfile(), c2c_norm_ctg@output_options$drivers)

Possible actions:
1: abort (with core dump, if enabled)
2: normal R exit
3: exit R without saving workspace
4: exit R saving workspace
Selection:
jgrn307 commented 4 years ago

Sorry, forgot my sessionInfo:

sessionInfo()
R version 3.6.2 (2019-12-12)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/openblas/libblas.so.3
LAPACK: /usr/lib/x86_64-linux-gnu/libopenblasp-r0.2.20.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

loaded via a namespace (and not attached):
[1] compiler_3.6.2
Jean-Romain commented 4 years ago

So it does not come from lidR. At least not directly. First update rgdal and raster if needed. Also ensure that libgdal is also up to date. Maybe this bug has already been fixed. This is why I cannot reproduce on my Linux Mint.

If the bug persists, clear every irrelevant lidR stuff to make a minimal reproducible example. I think this should fail:

cls = lidR:::catalog_makecluster(c2c_norm_ctg)
las = readLAS(cls[[1]])
lossmetrics=grid_metrics(las, ~lossmetricsf(Z,Classification,c2c_dist), res=10)
raster::writeRaster(lossmetrics, tempfile(fileext = ".tif"), format = "GTiff", NAflag = -999999)

Then you can make it even more minimal by saving lossmetrics into a rds with saveRDS so it becomes

lossmetrics = readRDS(...)
raster::writeRaster(lossmetrics, tempfile(fileext = ".tif"), format = "GTiff", NAflag = -999999)

Then share the .rds file. I'll try to figure out if something is wrong in the raster that could come from lidR. But I'm pretty sure it is something that need to be reported to rgdal team.

Also try to make it yet more minimal with

raster::writeRaster(lossmetrics, tempfile(fileext = ".tif"))

To see if NAflag is somehow related to the issue.