efpl-columbia / PointClouds.jl

Fast & flexible processing of lidar point-cloud data
https://docs.mfsch.dev/PointClouds.jl
MIT License

Performance of read not optimal #1

Closed SimonDanisch closed 5 days ago

SimonDanisch commented 1 month ago

I just did some benchmarking of Julia's LAS readers, since loading my point clouds takes pretty long. I found that PointClouds.jl is slower than LASDatasets.jl:

PointClouds.jl (fewer allocations because they happen in C?)

5.822 s (53855862 allocations: 945.05 MiB)

LASDatasets.jl

4.083 s (32316360 allocations: 1.53 GiB)

My LASDatasets.jl PR (https://github.com/fugro-oss/LASDatasets.jl/pull/38)

3.724 s (10773706 allocations: 1.05 GiB)

Maybe it would be worth using LASDatasets.jl internally? Having the reader in Julia could also help with doing transformations while loading the data.

I really like the little tile server and processing in here, so it would be nice to have optimal performance, since displaying tiles becomes really slow with every second spent on reading!

Btw, this is for adding point-cloud tiles to https://github.com/MakieOrg/Tyler.jl:

https://github.com/user-attachments/assets/c588c94a-d75e-44f8-b6c0-c3f3996102be

Btw, how hard would it be to add e.g. https://geotiles.citg.tudelft.nl/ as a tile provider?

mfsch commented 1 month ago

Hi, thanks for checking out PointClouds, and Tyler also looks interesting!

I guess you’re referring to loading LAS files, not LAZ? These are handled differently in PointClouds, because the compressed point data of LAZ files is read with the external LASzip library. The uncompressed LAS files are handled completely in Julia unless you specify read_points = :laszip (undocumented) in the keyword arguments when loading a file.
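
For reference, a minimal example of both loading paths (the file name is just a placeholder):

using PointClouds
las = LAS("points.las")                         # pure-Julia LAS parsing
las = LAS("points.las"; read_points = :laszip)  # force the LASzip-based reader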

So far I haven’t spent much time optimizing the loading performance and focused on correctness instead, so there is likely quite a bit of low-hanging fruit. I’ll try to spend some time on this over the next few days. Parsing the point data is relatively simple, so I think eventually this should be bottlenecked by disk speed rather than the CPU.
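
To give an idea of what that parsing involves, here is a toy sketch of decoding the coordinates of a point-format-0 record following the LAS spec; an illustration, not the actual PointClouds.jl code:

# Toy sketch: decode the X/Y/Z of one point-format-0 record from raw bytes,
# assuming the standard LAS layout (little-endian Int32 triple at the start
# of each record) and the coordinate scale/offset from the file header.
function decode_coords(bytes::Vector{UInt8}, rec::Int, scale, offset)
    raw(k) = ltoh(reinterpret(Int32, bytes[rec+k:rec+k+3])[1])
    ntuple(i -> raw(4i - 3) * scale[i] + offset[i], 3)
end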

I guess for Tyler you’re mostly interested in loading the coordinates and the RGB data (or intensity when there’s no RGB)? For LAZ files, this could potentially be sped up quite a bit, because there the different point properties are compressed separately, so you only need to decompress the ones you actually need. The LASzip library has (partial) support for this, but I’m not yet sure about the best interface for exposing that functionality.

I’ve already looked a bit into adding the lidar data of the Netherlands, and that’s definitely on the roadmap. From what I’ve seen, they don’t provide a web service for querying the database using coordinates, so we’d have to implement the tile lookup ourselves. Generally I’d prefer to directly load data from the governmental services rather than a downstream project, but if such a project is robust and has some benefits over the original data it might make sense to have both, especially if the non-official data is easy/easier to add.
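
A manual lookup could be as simple as filtering on tile extents. Purely a hypothetical sketch (the names and the idea of a parsed tile index are made up):

# Hypothetical: given tile extents parsed from some index file,
# find the tiles whose bounding box contains a query coordinate.
struct Tile
    url::String
    xmin::Float64; xmax::Float64
    ymin::Float64; ymax::Float64
end
tiles_at(tiles, x, y) = filter(t -> t.xmin <= x <= t.xmax && t.ymin <= y <= t.ymax, tiles)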

SimonDanisch commented 1 month ago

I see. Indeed, converting to LAS and then loading it makes PointClouds.jl faster:


# LASDatasets.jl
@time LASDatasets.load_pointcloud(path).position;
5.805360 seconds (109.51 M allocations: 5.134 GiB, 52.98% gc time)

# PointClouds.jl
@time begin
    las = LAS(path)
    map(x -> Point3f(x.coords), las)
end;
2.016516 seconds

Would it make sense to recombine the reader/writer functionality from PointClouds + LASDatasets into e.g. LasIO.jl?

I feel like an IO library should really just parse the format and do nothing on top - and it seems wasteful to have two (3?) implementations of the same reader/writer functionality.

Generally I’d prefer to directly load data from the governmental services

Do they exist for https://geotiles.citg.tudelft.nl ? It does seem like the most official source to me, but I'm only getting started with this.

mfsch commented 6 days ago

I finally found some time to clean up the work I did on performance improvements and a major refactoring of the code for loading and accessing point attributes (currently in the dev branch). I still have to clean up some of the docs, but the code should be fully functional.

I believe the read performance is pretty close to optimal now, and I think we can close this issue as resolved once there is a released version with this new code. There should be no per-point allocations left, since I got rid of some remaining type instabilities. For memory-mapped data, accessing individual attributes of points (e.g. with coordinates or intensity) gets an additional speed-up, since those functions now selectively read only the bytes they require rather than constructing a point and accessing its fields. For compressed LAZ files we cannot do the same, because the LASzip library we’re relying on does not provide such fine-grained control. It has some limited functionality for partial field access when reading, but to get the full benefit I fear we’d have to port the decompression to Julia. Maybe that’ll be a weekend project some day…
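
As a rough sketch of the selective-read idea, assuming point format 0, where the intensity is the UInt16 at byte offset 12 within each 20-byte record (the real code takes the record length and offsets from the file header):

using Mmap

# Sketch: read only the two intensity bytes of each record instead of
# materializing whole points. `point_offset` is the 0-based byte offset
# of the point data in the file; like `reclen`, it comes from the header.
function read_intensities(path::AbstractString, point_offset::Int, npoints::Int; reclen::Int = 20)
    bytes = Mmap.mmap(path)
    map(0:npoints-1) do n
        rec = point_offset + n * reclen
        ltoh(reinterpret(UInt16, bytes[rec+13:rec+14])[1])
    end
end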

For both LAS & LAZ files, we’re in the same ballpark as the LASzip CLI tool now. For LAZ files, reading all points is consistently about 2× faster than before, while the speed-up for LAS files is more variable (minimal for small files, drastic for large files, from what I can tell). Below are a few benchmark runs with the new benchmarking code I’ve added, plus runs of the LASzip CLI tool. The autzen files are from the test data in the PDAL repo, the other ones are from here (manually converted to LAS).

Commit efff951 (before refactoring)

» just benchmark --seconds=10 las/autzen_trim.las laz/autzen_trim.laz 6413c497d34eb496d1ce956e.las 6413c497d34eb496d1ce956e.laz

========== Results for `las/autzen_trim.las` ==========
intensity => min 3.26 ms, median 3.48 ms, mean 3.81 ms, max 10.9 ms, allocs 110002
random    => min 58.0 ns, median 159. ns, mean 154. ns, max 1.83 μs, allocs 1
iterate   => min 2.70 ms, median 2.92 ms, mean 3.24 ms, max 15.2 ms, allocs 110000
collect   => min 4.07 ms, median 4.41 ms, mean 5.38 ms, max 15.1 ms, allocs 220002
load      => min 31.5 μs, median 34.5 μs, mean 42.5 μs, max 23.5 ms, allocs 175
coords    => min 3.11 ms, median 3.37 ms, mean 3.86 ms, max 10.0 ms, allocs 110002
readpts   => min 65.7 ms, median 71.1 ms, mean 71.0 ms, max 75.6 ms, allocs 1428120

========== Results for `laz/autzen_trim.laz` ==========
intensity => min 100. ms, median 104. ms, mean 106. ms, max 118. ms, allocs 547957
random    => min 46.9 μs, median 11.6 ms, mean 12.7 ms, max 61.7 ms, allocs 3
iterate   => min 111. ms, median 112. ms, mean 114. ms, max 146. ms, allocs 547955
collect   => min 103. ms, median 117. ms, mean 117. ms, max 130. ms, allocs 657957
load      => min 37.9 μs, median 44.3 μs, mean 63.6 μs, max 57.2 ms, allocs 185
coords    => min 105. ms, median 116. ms, mean 115. ms, max 125. ms, allocs 547957
readpts   => min 101. ms, median 113. ms, mean 144. ms, max 342. ms, allocs 548140

========== Results for `6413c497d34eb496d1ce956e.las` ==========
intensity => min 760. ms, median 892. ms, mean 935. ms, max 1.19 s, allocs 16107900
random    => min 70.0 ns, median 257. ns, mean 253. ns, max 3.38 μs, allocs 1
iterate   => min 453. ms, median 465. ms, mean 465. ms, max 476. ms, allocs 16107898
collect   => min 2.14 s, median 2.88 s, mean 2.81 s, max 3.40 s, allocs 32215798
load      => min 29.0 μs, median 31.2 μs, mean 37.0 μs, max 29.6 ms, allocs 141
coords    => min 837. ms, median 910. ms, mean 940. ms, max 1.19 s, allocs 16107900
readpts   => min 9.93 s, median 9.93 s, mean 9.93 s, max 9.93 s, allocs 161077066

========== Results for `6413c497d34eb496d1ce956e.laz` ==========
intensity => min 13.6 s, median 13.6 s, mean 13.6 s, max 13.6 s, allocs 80537447
random    => min 155. μs, median 10.7 ms, mean 10.6 ms, max 23.6 ms, allocs 3
iterate   => min 15.2 s, median 15.2 s, mean 15.2 s, max 15.2 s, allocs 80537445
collect   => min 17.5 s, median 17.5 s, mean 17.5 s, max 17.5 s, allocs 96645345
load      => min 38.5 μs, median 45.9 μs, mean 60.5 μs, max 62.5 ms, allocs 151
coords    => min 15.9 s, median 15.9 s, mean 15.9 s, max 15.9 s, allocs 80537447
readpts   => min 15.2 s, median 15.2 s, mean 15.2 s, max 15.2 s, allocs 80537596

Commit 91aa22c (after refactoring)

» just benchmark --seconds=10 las/autzen_trim.las laz/autzen_trim.laz 6413c497d34eb496d1ce956e.las 6413c497d34eb496d1ce956e.laz

========== Results for `las-autzen_trim.las` ==========
intensity => min 450. μs, median 475. μs, mean 499. μs, max 5.77 ms, allocs 2
random    => min 57.0 ns, median 152. ns, mean 153. ns, max 17.9 μs, allocs 0
iterate   => min 2.78 ms, median 2.91 ms, mean 2.95 ms, max 6.48 ms, allocs 0
collect   => min 3.40 ms, median 3.57 ms, mean 3.77 ms, max 9.01 ms, allocs 2
load      => min 30.6 μs, median 32.9 μs, mean 39.8 μs, max 24.2 ms, allocs 167
coords    => min 611. μs, median 657. μs, mean 678. μs, max 5.83 ms, allocs 2
readpts   => min 3.64 ms, median 3.86 ms, mean 4.08 ms, max 10.5 ms, allocs 169

========== Results for `laz-autzen_trim.laz` ==========
intensity => min 57.4 ms, median 63.3 ms, mean 63.2 ms, max 74.0 ms, allocs 2
random    => min 92.7 μs, median 10.2 ms, mean 11.3 ms, max 30.1 ms, allocs 0
iterate   => min 57.9 ms, median 64.2 ms, mean 65.2 ms, max 101. ms, allocs 0
collect   => min 65.3 ms, median 68.0 ms, mean 72.4 ms, max 138. ms, allocs 2
load      => min 46.3 μs, median 49.7 μs, mean 61.6 μs, max 36.3 ms, allocs 185
coords    => min 56.6 ms, median 61.5 ms, mean 61.9 ms, max 68.1 ms, allocs 2
readpts   => min 57.2 ms, median 61.7 ms, mean 62.0 ms, max 70.0 ms, allocs 187

========== Results for `6413c497d34eb496d1ce956e.las` ==========
intensity => min 107. ms, median 124. ms, mean 130. ms, max 305. ms, allocs 2
random    => min 62.0 ns, median 247. ns, mean 246. ns, max 18.5 μs, allocs 0
iterate   => min 362. ms, median 414. ms, mean 412. ms, max 465. ms, allocs 0
collect   => min 623. ms, median 701. ms, mean 823. ms, max 1.75 s, allocs 2
load      => min 29.6 μs, median 35.1 μs, mean 38.6 μs, max 24.3 ms, allocs 133
coords    => min 227. ms, median 276. ms, mean 271. ms, max 455. ms, allocs 2
readpts   => min 643. ms, median 689. ms, mean 708. ms, max 874. ms, allocs 135

========== Results for `6413c497d34eb496d1ce956e.laz` ==========
intensity => min 6.81 s, median 6.81 s, mean 6.81 s, max 6.81 s, allocs 2
random    => min 180. μs, median 10.5 ms, mean 10.7 ms, max 35.8 ms, allocs 0
iterate   => min 7.71 s, median 7.71 s, mean 7.71 s, max 7.71 s, allocs 0
collect   => min 7.92 s, median 7.92 s, mean 7.92 s, max 7.92 s, allocs 2
load      => min 39.3 μs, median 47.1 μs, mean 60.3 μs, max 65.1 ms, allocs 151
coords    => min 7.68 s, median 7.68 s, mean 7.68 s, max 7.68 s, allocs 2
readpts   => min 7.76 s, median 7.76 s, mean 7.76 s, max 7.76 s, allocs 159

Reading files with LASzip CLI

» time laszip -i las-autzen_trim.las -nil
________________________________________________________
Executed in   11.32 millis    fish           external
   usr time    7.17 millis  306.00 micros    6.86 millis
   sys time    4.11 millis  191.00 micros    3.92 millis

» time laszip -i laz-autzen_trim.laz -nil
________________________________________________________
Executed in   64.83 millis    fish           external
   usr time   64.14 millis    0.00 micros   64.14 millis
   sys time    0.50 millis  503.00 micros    0.00 millis

» time laszip -i 6413c497d34eb496d1ce956e.las -nil
________________________________________________________
Executed in  766.38 millis    fish           external
   usr time  670.81 millis  348.00 micros  670.46 millis
   sys time   94.02 millis  232.00 micros   93.79 millis

» time laszip -i 6413c497d34eb496d1ce956e.laz -nil
________________________________________________________
Executed in    6.06 secs    fish           external
   usr time    6.01 secs  505.00 micros    6.01 secs
   sys time    0.04 secs  343.00 micros    0.04 secs

mfsch commented 6 days ago

Do they exist for https://geotiles.citg.tudelft.nl ? It does seem like the most official source to me, but I'm only getting started with this.

I’ve opened a new issue #3 for this – it looks like the AHN is the official source for the Netherlands.

SimonDanisch commented 6 days ago

That's amazing :) Btw, the Tyler integration now exists (https://makieorg.github.io/Tyler.jl/dev/map-3d). It's not perfect, but it works quite well already.

mfsch commented 6 days ago

Would it make sense to recombine the reader/writer into e.g. LasIO.jl from PointClouds + LASDatasets?

I feel like an IO library should really just parse the format and do nothing on top - and it seems wasteful to have two (3?) implementations of the same reader/writer functionality.

I understand that the situation with LAS/LAZ packages is a bit unsatisfying at the moment. As I understand it, we first had LasIO.jl with Julia-native LAS parsing but without LAZ support; then LazIO.jl added a wrapper around the LASzip library. However, neither of these projects has been actively developed in a while, and they don’t support the latest LAS version, 1.4, released in 2019.

I originally started working on adding LAS 1.4 support to LasIO.jl, but in the process I saw more and more things I wanted to do differently, so I ended up doing a clean-slate rewrite. I guess LASDatasets.jl went through a similar process, and we were unaware of each other’s work until JuliaCon this year. So now there are two recent LAS packages, and I guess they both fit our respective needs at this time. While I think it may be useful to have a more unified ecosystem for point-cloud processing eventually, I also think it’s fine for multiple packages to exist in parallel for now, especially since two of them are very new and may still develop in different directions as they mature.

With PointClouds.jl we’re trying to build up the tools for working with point-cloud (especially lidar) data in Julia, with a focus on both performance and ergonomics. This means that the different steps of the workflow need to play nicely together, which is why we currently have a single package with a somewhat broad scope. We’re keeping the dependencies pretty limited though, so it shouldn’t be an overly heavy package. Eventually we might want to pull some of the functionality out into separate packages, but I think it makes sense to let the code evolve a bit more before that.

mfsch commented 5 days ago

I’ve released v0.4.0 with the read performance improvements, so I’ll close this issue now.

@SimonDanisch: Just a quick heads-up for Tyler: I’ve disabled getproperty for individual points, since it’s easy to get unexpected results that way (coordinates are unscaled, and properties can differ between point formats). You’ll therefore have to replace pt.coords with coordinates(Integer, pt).
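
In code, the change for the earlier snippet would look something like this (Point3f as in the example above):

# before (pre-v0.4): getproperty on individual points
positions = map(pt -> Point3f(pt.coords), las)

# after (v0.4): explicit accessor; passing `Integer` requests the raw
# (unscaled) integer coordinates that `pt.coords` used to return
positions = map(pt -> Point3f(coordinates(Integer, pt)), las)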