ptompalski / LidarSpeedTests

Evaluating the performance of various point cloud processing tools.
https://ptompalski.github.io/LidarSpeedTests/
GNU General Public License v3.0

comparison to PDAL #3

Open wiesehahn opened 2 weeks ago

wiesehahn commented 2 weeks ago

I thought I saw this mentioned somewhere but now I can't find it anymore, forgive me if I have missed it.

Are there any plans to also include PDAL in the comparison?

I have to admit that I have barely used it so far, but I think it's a natural "rival", so it would be helpful to have some measurements for comparison. I think PDAL is much more versatile, but if lasR were faster on certain tasks, that would also be a selling point.

Jean-Romain commented 2 weeks ago

We need someone with expertise in PDAL. I've conducted some initial tests, but the performance difference was so significant that I'm questioning whether my PDAL build was properly compiled in release mode.

From what I understand and according to the official documentation, PDAL is not directly comparable with lasR. PDAL appears to handle only a single LAS file at a time, requiring custom loops for processing multiple files. Additionally, PDAL does not offer built-in buffer management or parallel processing capabilities; you would need to implement these features manually. These are two of the major differences between PDAL and lasR. PDAL supports more I/O formats and data sources: this is its main goal and strength.

Below are the results from my tests with 4 LAZ files. The tests using lidR, lasR, and LAStools include buffer management, while the PDAL test involves a loop over files with no buffer management.
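For readers unfamiliar with the term: buffer management here means that each tile is processed together with the points of neighbouring tiles that fall within some distance of its edges, so that results near tile borders are not distorted. A minimal sketch of the neighbour lookup in R follows. The tile extents and the function name `neighbours_in_buffer` are hypothetical, and this is only an illustration of the idea, not how lasR, lidR, or LAStools implement it.

```r
# Hypothetical tile extents (xmin, ymin, xmax, ymax); not read from real files.
tiles <- data.frame(
  name = c("A", "B", "C"),
  xmin = c(0, 100, 300), ymin = c(0, 0, 0),
  xmax = c(100, 200, 400), ymax = c(100, 100, 100),
  stringsAsFactors = FALSE
)

# Return the names of the other tiles that intersect tile i
# once its extent has been grown by `buffer` on every side.
neighbours_in_buffer <- function(tiles, i, buffer) {
  bx <- c(tiles$xmin[i] - buffer, tiles$ymin[i] - buffer,
          tiles$xmax[i] + buffer, tiles$ymax[i] + buffer)
  hit <- tiles$xmin <= bx[3] & tiles$xmax >= bx[1] &
         tiles$ymin <= bx[4] & tiles$ymax >= bx[2]
  hit[i] <- FALSE  # a tile is not its own neighbour
  tiles$name[hit]
}

neighbours_A <- neighbours_in_buffer(tiles, 1, buffer = 20)
print(neighbours_A)
```

With a plain loop over files, as in the PDAL test, this lookup and the loading of the extra points would have to be written by hand.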

:warning: The Y-axis is in MB/s—higher values indicate better performance.

Please interpret the results cautiously, as this was a quick test. I followed the compilation directives from the PDAL documentation, but the performance was unexpectedly slow, suggesting the build might have been in debug mode or misconfigured in some way. I can't believe that PDAL is really that slow. However, PDAL is known to be slower than LAStools, and lasR is faster than LAStools. By transitivity, lasR should be faster than PDAL. In my tests it was something like 20 to 100 times faster. To be confirmed.

[Figure: normalize benchmark (rplot05_480)]

[Figure: chm benchmark (rplot)]

wiesehahn commented 2 weeks ago

Yeah, PDAL states it's single-threaded and runs on a single file. You might want to try wrench (https://github.com/PDAL/wrench), which is probably closer to lasR.

Jean-Romain commented 2 weeks ago

I have previously used wrench when implementing VPC support to better understand the format. Aside from that, all of wrench's tools are available in lasR, but since wrench is a command-line tool without pipelines, it is more comparable to LAStools in my opinion. For a basic CHM, I could potentially use wrench for benchmarking against lasR.

However, I currently don't have the time or inclination to create a comprehensive comparison of all available software on the market. Users are encouraged to install, test, and review the documentation of the various tools to determine which best meets their needs. This is why I haven’t made any direct comparisons with other software. Piotr is conducting his own benchmarks independently, and I am not involved in that process beyond some advice on the best way to write the pipelines.

wiesehahn commented 2 weeks ago

Thanks for the quick response. That's actually why I asked here, as I think it might be better to have an independent test. But of course your opinion and experience are highly appreciated. I am totally convinced of lidR's and lasR's capabilities, but as an R user I might be biased.

I would like to spread the word about lasR, as it worked quite well for the things I tested and I feel that it is not as well known as it should be. I also really appreciate your expertise, communication and dedication, and I would like to support that by making the software better known. But as you say, there are so many software solutions, and it would be really helpful to have some benchmarks as selling points. Otherwise it will be hard to convince people to use it over similar but more versatile and established software.

Jean-Romain commented 2 weeks ago

Thank you for your kind words. lasR is new and still maturing. It took years for lidR, and it will take some years for lasR as well. For your information, I'm working (among other projects) on a Python port. This is likely the next big incoming change.

What makes you think PDAL is more versatile? Serious question. Currently it has more stages and supports more formats for sure, but I would not call it more versatile. At its core, lasR can build more complex and optimized pipelines in an easier way, in my opinion. I'm not a PDAL expert, but I have read the documentation, and I think lasR has enough differences to be more or less versatile depending on the specific use case. This is not the best place for that, but if you want to discuss it, you know where the discussion tab is in lasR :wink:

wiesehahn commented 2 weeks ago

> What makes you think PDAL is more versatile? Serious question. Currently it has more stages and supports more formats

My experience with PDAL is minimal, but that is exactly what I was referring to. I also simply assumed that it has to be, since the code base and commit history are orders of magnitude larger.

But even if they were on par, there have to be some dedicated advantages over other software to serve as a "selling point". For lidR, you mention them e.g. on the website. Compared to LAStools, it might be that it's FOSS, ...

ptompalski commented 2 weeks ago

Thanks for the reminder, @wiesehahn and @Jean-Romain. I think we discussed this somewhere before, and the plan is to include PDAL in the comparison. I have used it once or twice before and first need to figure out how to use it so that the comparison makes sense. Based on what you, @Jean-Romain, wrote, it may not be that simple.

I will keep you posted, but if you already have a bit of code I could run (so that I don't start from scratch), it would be great if you could share it.

PS. I am working on putting the entire benchmarking code here in this repo, with detailed instructions so that anyone could reproduce everything. There was a small issue with the dataset I used and I need to sort that out first, make the data available, etc.

Jean-Romain commented 1 week ago

DSM with PDAL & lasR on 4 tiny files shipped with lasR. Full reproducible example. Basic loop over files for PDAL with no buffer (a buffer is not needed for a DSM anyway, but it would matter for a DTM).

The difference should increase drastically with more and bigger files.

@ptompalski if you can run and automate tests with similar code, I can write the PDAL pipeline for each test.

library(jsonlite)

# Define a function to run the PDAL pipeline for a given input LAS file
run_pdal_dsm <- function(input_file, output_file) 
{
  pipeline <- list(
    pipeline = list(
      list(
        type = "readers.las",
        filename = input_file),
      list(
        type = "writers.gdal",
        filename = output_file,
        resolution = 1.0,
        output_type = "max"
      )
    )
  )

  # Write the pipeline to a temporary JSON file
  pipeline_file <- tempfile(fileext = ".json")
  write_json(pipeline, pipeline_file, auto_unbox = TRUE, pretty = TRUE)

  # Run the PDAL pipeline (quote the path in case it contains spaces)
  system(paste("pdal pipeline", shQuote(pipeline_file)))

  # Remove the temporary pipeline file
  file.remove(pipeline_file)
}

input_dir <- system.file("extdata", "bcts", package = "lasR")
output_dir <- tempdir()

# Match .las and .laz files ("[s|z]" would also match a literal "|")
las_files <- list.files(input_dir, pattern = "\\.la[sz]$", full.names = TRUE)

ti = Sys.time()
output_files = vector("list", length(las_files))
for (i in seq_along(las_files)) 
{
  las_file = las_files[i]
  print(las_file)

  output_file <- file.path(output_dir, paste0(tools::file_path_sans_ext(basename(las_file)), "_dsm.tif"))
  output_files[[i]] = output_file 

  run_pdal_dsm(las_file, output_file)
}
tf = Sys.time()
difftime(tf, ti)

ti = Sys.time()
# lasR is not attached with library(), so exec() needs the lasR:: prefix as well
lasR_dsm = lasR::exec(lasR::rasterize(1, "max"), on = input_dir, ncores = 4)
tf = Sys.time()
difftime(tf, ti)

library(terra)
pdal_dsm = vrt(unlist(output_files))
plot(pdal_dsm)

plot(lasR_dsm)
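
The plots earlier in this thread report throughput in MB/s, while the script above only prints elapsed times. A minimal sketch of the conversion follows. The helper `throughput_mbs` is hypothetical (it is not part of lasR or PDAL), and it assumes 1 MB = 1e6 bytes of input file size, which may differ from the convention used for the plots.

```r
# Hypothetical helper: convert an elapsed time into MB/s of input processed.
# Assumption: 1 MB = 1e6 bytes, measured on the input files on disk.
throughput_mbs <- function(files, elapsed_sec) {
  stopifnot(elapsed_sec > 0)
  total_mb <- sum(file.size(files)) / 1e6  # bytes -> megabytes
  total_mb / elapsed_sec
}

# Self-contained demo on a throwaway file of 2,000,000 bytes
demo_file <- tempfile(fileext = ".bin")
writeBin(raw(2e6), demo_file)
demo_rate <- throughput_mbs(demo_file, elapsed_sec = 4)
print(demo_rate)
file.remove(demo_file)
```

With the variables from the script above, the call would be `throughput_mbs(las_files, as.numeric(difftime(tf, ti, units = "secs")))`, evaluated right after each timed section.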