connormanning / entwine

Entwine - point cloud organization for massive datasets
https://entwine.io

Computation complexity in relation to the number of *.laz files vs total size of *.laz files #311

Closed: Mboga closed this issue 1 year ago

Mboga commented 1 year ago

Thank you for your response to the question I raised earlier.

I had a look at your FOSS4G 2016 presentation; at minutes 15:08 and 15:25 you list the point cloud characteristics and the infrastructure, respectively, that were used to create the Entwine database from the AHN dataset of the Netherlands.

I wanted to seek clarification on computational complexity and how much time it takes to build an Entwine database. I understand that I can use the subset option of the `entwine build` command.
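My understanding is that the subset option splits a build into several independent runs that are merged afterwards. Here is a minimal sketch of what I have in mind (the paths and the choice of 4 subsets are placeholders, and I may be misreading the documentation):

```
# Sketch only: build one subset of the index at a time,
# then merge the partial builds (paths are placeholders).
entwine build -i /data/tiles -o /output/ept -s 1 4
entwine build -i /data/tiles -o /output/ept -s 2 4
entwine build -i /data/tiles -o /output/ept -s 3 4
entwine build -i /data/tiles -o /output/ept -s 4 4
entwine merge /output/ept
```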

I will use an example to make my point. Consider a dataset of 41,000 .laz files organized in regular grids, totaling roughly 1.5 TB. How would the processing time and computational complexity change if:

A) the same dataset were instead available as 410,000 .laz files?

B) Separately, if I had a second dataset of 3 TB, would it be advisable to reduce the number of .laz files before building the Entwine database?

In other words, I would like to understand how processing time and computational complexity scale with the number of input .laz files and with their total size.

Thank you

Mboga commented 1 year ago

Here is the link to the FOSS4G 2016 presentation: https://ftp.gwdg.de/pub/misc/openstreetmap/FOSS4G-2016/foss4g-2016-1204-500_billion_points_organizing_point_clouds_as_infrastructure-hd.mp4

connormanning commented 1 year ago

I'd expect this to have very little or no impact on performance, but you would be best off trying a representative subset of your own dataset.
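For example, one quick way to gauge this is to time a build over a small representative sample before committing to the full run; a sketch, with placeholder paths:

```
# Copy a few hundred representative tiles into a scratch directory,
# then time a build over just that sample (paths are placeholders).
time entwine build -i /data/sample-tiles -o /output/sample-ept
```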