hugoledoux / startinpy

A Python library for modelling and processing 2.5D terrains using a (2D) Delaunay triangulation.
https://startinpy.rtfd.io/
MIT License
25 stars 6 forks source link

Add PDAL to comparison matrix #23

Open gadomski opened 3 weeks ago

gadomski commented 3 weeks ago

As a part of https://github.com/openjournals/joss-reviews/issues/7123, I used both startinpy and PDAL to create a terrain mesh from a USGS 3DEP point cloud. PDAL uses delaunator-cpp for its triangulation, wrapped in filters.delaunay. Because PDAL has a Python API and has good performance, it might be good to include it in the comparisons. ~I can't add it to the comparisons myself because they use @hugoledoux local files.~ EDIT: I see that there's links to download the performance testing files, so I'm going to take a stab at adding PDAL myself 🙇🏼.

I don't want to assume that I'm doing things correctly, but in my tests I found that PDAL + delaunator-cpp was significantly faster in a simple build-terrain operation. I did my best to ensure that I was using a release profile when building startinpy but of course could have made an error.

cc @kylemann16 who is a PDAL developer.

Comparison

I compared performance using a tiled COPC, fetched from Microsoft's Planetary Computer using stac-cli, jq, and stac-asset:

stac search https://planetarycomputer.microsoft.com/api/stac/v1 \
    -c 3dep-lidar-copc \
    --intersects '{"type":"Point","coordinates":[-105.119,40.173]}' \
    | jq '.features[0]' \
    | stac-asset download

startinpy

Code ```python import time import laspy import numpy import startinpy start = time.time() las = laspy.read("USGS_LPC_CO_SoPlatteRiver_Lot2a_2013_13TDE489446_LAS_2015.copc.laz") points = las.points[las.classification == 2] points = numpy.stack((points.x, points.y, points.z)).transpose() read = time.time() dt = startinpy.DT() dt.insert(points) created = time.time() dt.write_ply("dt-startinpy.ply") done = time.time() print(f"Read: {read - start:.3f}s") print(f"Create: {created - read:.3f}s") print(f"Write: {done - created:.3f}s") print(f"Total: {done - start:.3f}s") ```
Read:   2.320s
Create: 43.147s
Write:  7.384s
Total:  52.851s

PDAL

Code ```python import time from pdal import Filter, Reader pipeline = Reader.las( filename="USGS_LPC_CO_SoPlatteRiver_Lot2a_2013_13TDE489446_LAS_2015.copc.laz" ) | Filter.range(limits="Classification[2:2]") start = time.time() pipeline.execute() read = time.time() pipeline = Filter.delaunay().pipeline(pipeline.arrays[0]) pipeline.execute() created = time.time() mesh = pipeline.get_meshio(0) mesh.write("dt-pdal.ply") done = time.time() print(f"Read: {read - start:.3f}s") print(f"Create: {created - read:.3f}s") print(f"Write: {done - created:.3f}s") print(f"Total: {done - start:.3f}s") ```
Read:   3.916s
Create: 3.943s
Write:  1.128s
Total:  8.987s

QC

Comparing statistics ```shell $ pdal info dt-startinpy.ply | jq .stats.statistic [ { "average": 489726.2401, "count": 2872483, "maximum": 490500, "minimum": 489000, "name": "X", "position": 0, "stddev": 432.9948745, "variance": 187484.5614 }, { "average": 4446745.876, "count": 2872483, "maximum": 4447500, "minimum": 4446000, "name": "Y", "position": 1, "stddev": 439.3933042, "variance": 193066.4758 }, { "average": 1533.628089, "count": 2872483, "maximum": 1554.64, "minimum": 1515.02, "name": "Z", "position": 2, "stddev": 7.959375129, "variance": 63.35165245 } ] $ pdal info dt-pdal.ply | jq .stats.statistic [ { "average": 489726.2391, "count": 2872606, "maximum": 490500, "minimum": 489000, "name": "X", "position": 0, "stddev": 432.996162, "variance": 187485.6763 }, { "average": 4446745.877, "count": 2872606, "maximum": 4447500, "minimum": 4446000, "name": "Y", "position": 1, "stddev": 439.392411, "variance": 193065.6908 }, { "average": 1533.628108, "count": 2872606, "maximum": 1554.64, "minimum": 1515.02, "name": "Z", "position": 2, "stddev": 7.959359551, "variance": 63.35140446 } ] ```
Images from QGIS The input point cloud: ![copc](https://github.com/user-attachments/assets/72109ac3-6c59-4b2e-9b94-f4e967b2e39b) The output mesh (both **startinpy** and **PDAL** produced the same-enough output): ![ply](https://github.com/user-attachments/assets/15c4f8c1-1c6f-4e86-9683-6e4f1a5ff099)
hobu commented 3 weeks ago

Filter.range(limits="Classification[2:2]")

Filters.expression(expression="Classification == 2") <-- much better modern PDAL way to the same thing.