r-lidar / lasR

Fast and Pipable Airborne LiDAR Data Tools
https://r-lidar.github.io/lasR/
GNU General Public License v3.0

Error raised during DTM derivation and LAZ normalization #73

Closed JohnKilbride closed 1 month ago

JohnKilbride commented 1 month ago

Hello,

I am encountering what looks like a C++ related error when computing a DTM and normalized LAZ files using lasR.

terminate called after throwing an instance of 'std::length_error'
  what():  vector::_M_default_append
Aborted

I am processing data on a Google Cloud Platform VM.

I am processing the following LiDAR acquisition collected by the USGS: ME_SouthCoastal_2020 (26 GB across 127 LAZ tiles).

I created the following script that reproduces the error with just the function in question:

library(lasR)

# Derive a Digital Terrain Model and a Normalized Point Cloud
run_lazr_norm_and_dtm <- function (laz_input_folder, dem_out_dir, norm_out_dir) 
{

  # List the LAZ files in the input folder
  laz_files = list.files(path = laz_input_folder, pattern = "\\.laz$", full.names = TRUE)

  # Compose the lasR pipeline
  pipeline = lasR::reader_las() +
    dtm(1, ofile=paste0(dem_out_dir,"/*_dem.tif")) +
    lasR::normalize() +
    lasR::write_las(ofile =  paste0(norm_out_dir, "/*.laz")) 

  # Execute all steps in the pipeline
  exec(pipeline, on = laz_files, ncores = 8)

}

main <- function () {

  # Define path to the LAZ files
  norm_files = "/home/johnb/me_southcoastal_2_2020/src_laz"

  # Define the output folders
  dem_output_dir = "/home/johnb/dem_outputs"
  norm_output_dir = "/home/johnb/norm_outputs"

  # Make the output folders if they don't exist
  if (!dir.exists(dem_output_dir)) {
    dir.create(dem_output_dir)
  }
  if (!dir.exists(norm_output_dir)) {
    dir.create(norm_output_dir)
  }

  # Run the DTM creation
  run_lazr_norm_and_dtm(
    laz_input_folder = norm_files, 
    dem_out_dir = dem_output_dir, 
    norm_out_dir = norm_output_dir
  )

}

main()

During my test runs, 72 and 80 tiles completed successfully before the crash.

Is there a way for lasR to simply "skip" a bad tile? Given the scale of the lidar processing, dropping a few problematic tiles to ensure the workflow runs to completion would be fine.

I'm running some code at the moment to see if I can find the offending LAZ file that is causing the crash.
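
Roughly, the search loop looks like this (a sketch on my end, not a lasR feature: it assumes the callr package is installed and reuses laz_files, dem_output_dir, and norm_output_dir from the script above; since the abort kills the whole R session, tryCatch() alone cannot skip a tile, so each tile is processed in a fresh child process whose crash surfaces as a catchable R error):

library(callr)

# Process a single tile in an isolated R session
process_one <- function(file, dem_out_dir, norm_out_dir) {
  library(lasR)
  pipeline = reader_las() +
    dtm(1, ofile = paste0(dem_out_dir, "/*_dem.tif")) +
    normalize() +
    write_las(ofile = paste0(norm_out_dir, "/*.laz"))
  exec(pipeline, on = file, ncores = 1)
}

for (f in laz_files) {
  # callr::r() runs the function in a child R process and raises a
  # regular R error if that child dies, so a segfault becomes catchable
  ok <- tryCatch({
    callr::r(process_one, args = list(f, dem_output_dir, norm_output_dir))
    TRUE
  }, error = function(e) FALSE)
  if (!ok) message("Offending tile: ", f)
}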

Jean-Romain commented 1 month ago

Thank you for reporting, and sorry for the crash after 60%. lasR is still a maturing piece of software.

Dataset size: 26 GB

26 GB over 127 files = 200 MB/file; times 8 cores = 1.6 GB. Times 4 because it is LAZ = 6-7 GB of loaded point cloud. Memory should not be the issue. Anyway, the error is an uncaught exception, not something related to memory.

Is there a way for lasR to simply "skip" a bad tile?

Technically yes, but for the moment I prefer it to crash. As I said, the software is still maturing. The bug must be fixed, not the file skipped.

I'm running some code at the moment to see if I can find the offending LAZ file that is causing the crash.

Yes please. Use exec(..., verbose = TRUE). Do not set ncores = 8. I'd like to be able to reproduce the problem without downloading 26 GB and without running 2 hours of computation. Downloading 1 file takes me several minutes.

Btw, lasR intends to be the fastest software on the market. To be the fastest you must use the fastest pipeline. Here dtm() triangulates the ground points to build a Delaunay triangulation, and normalize() does too, so the triangulation of the ground points is computed twice. It is not super expensive, so it is not a big deal, but the best way to write your pipeline is to compute the triangulation once and recycle it in two stages:

tin = triangulate(filter = keep_ground())
dtm = rasterize(1, tin)
norm = transform_with(dtm)
write = write_las(ofile = paste0(norm_out_dir, "/*.laz"))

pipeline = tin + dtm + norm + write

Jean-Romain commented 1 month ago

Install the latest version 0.8.0 and test:

run_lazr_norm_and_dtm <- function (laz_input_folder, dem_out_dir, norm_out_dir) 
{
  # Compose the lasR pipeline
  pipeline =
    lasR:::stop_if_chunk_id_below(70) +
    lasR::reader_las() +
    dtm(1, ofile=paste0(dem_out_dir,"/*_dem.tif")) +
    lasR::normalize() +
    lasR::write_las(ofile =  paste0(norm_out_dir, "/*.laz")) 

  # Execute all steps in the pipeline
  exec(pipeline, on = laz_input_folder, ncores = 1, verbose = TRUE)
}

This will skip the first 70 files. I used 1 core so we can easily find which file is problematic.

JohnKilbride commented 1 month ago

Download link to the offending tile: LINK

I am using the following script:

library(lasR)

run_lazr_norm_and_dtm <- function (laz_files, dem_out_dir, norm_out_dir) 
{
  # Compose the lasR pipeline
  pipeline =
    lasR::reader_las() +
    dtm(1, ofile=paste0(dem_out_dir,"/*_dem.tif")) +
    lasR::normalize() +
    lasR::write_las(ofile =  paste0(norm_out_dir, "/*.laz")) 

  # Execute all steps in the pipeline
  exec(pipeline, on = laz_files, ncores = 1, verbose = TRUE)
}

main <- function() {

  # Get the LAZ files
  laz_files = c(
    "/home/johnb/me_southcoastal_2_2020/src_laz/USGS_LPC_ME_SouthCoastal_2020_A20_19TCH367773.laz"
    )

  # Define the output folders
  dem_output_dir = "/home/johnb/dem_outputs"
  norm_output_dir = "/home/johnb/norm_outputs"

  # Run the DTM creation
  run_lazr_norm_and_dtm(
    laz_file = laz_files, 
    dem_out_dir = dem_output_dir, 
    norm_out_dir = norm_output_dir
  )

}

main()

I get the following outputs in the terminal:

File processing options:
  Read points: true
  Streamable: false
  Buffer: 50.0
  Concurrent files: 1
  Concurrent points: 1
  Chunks: 1

Processing chunk 1/1 in thread 0: USGS_LPC_ME_SouthCoastal_2020_A20_19TCH367773
Stage: reader_las
 Number of point read 8750792
Stage: triangulate
Stage: rasterize
Stage: triangulate
Stage: transform_with

 *** caught segfault ***
address 0xa0, cause 'memory not mapped'

Traceback:
 1: exec(pipeline, on = laz_files, ncores = 1, verbose = TRUE)
 2: run_lazr_norm_and_dtm(laz_file = laz_files, dem_out_dir = dem_output_dir,     norm_out_dir = norm_output_dir)
 3: main()
An irrecoverable exception occurred. R is aborting now ...
Segmentation fault

Final point of clarification (unrelated to the error): would this function perform the DTM derivation and normalization as efficiently as possible?

run_lazr_norm_and_dtm <- function (laz_files, dem_out_dir, norm_out_dir) 
{

  # Construct the pipeline  
  tin = triangulate(filter = keep_ground())
  dtm = rasterize(1, tin, ofile=paste0(dem_out_dir,"/*_dem.tif"))
  norm = transform_with(dtm)
  write  = write_las(ofile =  paste0(norm_out_dir, "/*.laz")) 
  pipeline = tin + dtm + norm + write

  # Execute all steps in the pipeline
  exec(pipeline, on = laz_files, ncores = NUM_CORES)

}

Jean-Romain commented 1 month ago

This was a bug already fixed in #25, but I had removed the error handling.

In your file you have 0 ground points. Thus, no triangulation, thus no interpolation. Crash! Your code now throws an error:

Error : in 'rasterize' while processing the point cloud: no Delaunay triangulation: there were fewer than 3 points during the 'triangulate' stage

In your case you have water points that can be used as an alternative. I changed dtm() and normalize() to use ground and water by default, so your code now works.

I added a function keep_ground_and_water() that you can use:

run_lazr_norm_and_dtm <- function (laz_files, dem_out_dir, norm_out_dir) 
{
  # Construct the pipeline  
  tin = triangulate(filter = keep_ground_and_water())
  dtm = rasterize(1, tin, ofile=paste0(dem_out_dir,"/*_dem.tif"))
  norm = transform_with(dtm)
  write  = write_las(ofile =  paste0(norm_out_dir, "/*.laz")) 
  pipeline = tin + dtm + norm + write

  # Execute all steps in the pipeline
  exec(pipeline, on = laz_files, ncores = NUM_CORES)
}

Now, with neither ground nor water points, I don't know how to prevent the pipeline from stopping. The point cloud would not be normalizable at all.
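
That said, a caller who wants a big batch to run to completion could skip such tiles at the workflow level. A minimal sketch, assuming the failure surfaces as a regular R error like the one above and that tiles are processed one call at a time (safe_exec is a hypothetical helper name, not a lasR function):

library(lasR)

safe_exec <- function(file, dem_out_dir, norm_out_dir) {
  # Build the recycled-triangulation pipeline from earlier in the thread
  tin = triangulate(filter = keep_ground_and_water())
  dtm = rasterize(1, tin, ofile = paste0(dem_out_dir, "/*_dem.tif"))
  norm = transform_with(dtm)
  write = write_las(ofile = paste0(norm_out_dir, "/*.laz"))
  pipeline = tin + dtm + norm + write

  # Turn the "fewer than 3 points" error into a skipped tile
  tryCatch({
    exec(pipeline, on = file, ncores = 1)
    TRUE
  }, error = function(e) {
    message("Skipped ", file, ": ", conditionMessage(e))
    FALSE
  })
}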

JohnKilbride commented 1 month ago

Thanks a lot man!

USGS should always have ground points classified as part of their standards. However, it simply didn't occur to me that there might be tiles where there are no ground points. In hindsight, that should have been obvious as there are many coastal acquisitions.

I will add some logic to my workflow to ensure that ground points or water points are present.
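
For instance, a pre-check along these lines (a sketch on my end, assuming the lidR package; ASPRS class 2 = ground, class 9 = water; select = "c" loads only XYZ plus the Classification attribute):

library(lidR)

# TRUE if the tile contains at least one ground or water point
has_ground_or_water <- function(laz_file) {
  las <- readLAS(laz_file, select = "c")
  any(las$Classification %in% c(2L, 9L))
}

# Drop tiles that cannot be normalized before building the pipeline
laz_files <- Filter(has_ground_or_water, laz_files)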

Thanks for addressing that so rapidly!

Jean-Romain commented 1 month ago

Actually, it is relatively common to have tile coverage limited to water, especially here in Canada. The case was handled, but I had commented out the code... :disappointed:

FYI, lidR and lasR will soon no longer be supported by my university. To continue developing lasR and lidR sustainably, I am going to become self-employed and offer development services, consulting, and training courses for laser data processing. Since it seems you are working for a company, I wanted to share this information in case you might be interested. The software will remain free and open source.

JohnKilbride commented 1 month ago

I'll certainly mention it to the folks in industry I know! Do you have a LinkedIn account? That seems to be the go-to for recruiters/networking these days (for better or worse).

Jean-Romain commented 1 month ago

Not yet, but I'm considering it. I'm currently working on a website. You can contact me at info@r-lidar.com for professional exchanges.