pgpointcloud / pointcloud

A PostgreSQL extension for storing point cloud (LIDAR) data.
https://pgpointcloud.github.io/pointcloud/

Parallel support #235

Open gabri94 opened 5 years ago

gabri94 commented 5 years ago

Hi, I'd like to take advantage of the parallelization offered by the latest versions of Postgres. I tried running the following query to check whether the planner would execute it in parallel.

    SELECT pa
    FROM lidar_table
    WHERE PC_Intersects(pa, ST_GeomFromText('POINT(43.1 11.8)', 4326));

However, I found that it was not executed in parallel, even though the function PC_Intersects is marked as 'PARALLEL SAFE'. What could be the problem?

autcrock commented 4 years ago

Hi all.

I'm interested in this issue too.

I'll aim to work through analyzing the planner output, but if there is any advice on what I need to consider with respect to the pointcloud context, I would appreciate it.

Cheers

Mike Thomas

gabri94 commented 4 years ago

In the end, to achieve better performance, I dropped PostGIS and pointcloud for the raster management altogether. Now I work directly with raster files from Python (using rasterio), and my performance is 100x to 1000x better.

Remi-C commented 4 years ago

Hey @gabri94, this is typically the type of query that should have been using the index. To keep it short, you should have small patches (up to a few million), indexed. I'm not sure I understand the use of PC_Intersects in this case. Most likely you don't need a pointcloud function here, but rather a PostGIS function (overlap of the patch bounding box and your point's bounding box). If you don't use indexes in your workflow, postgres/postgis/pgpointcloud is of little use; you might as well use a brute-force solution.

gabri94 commented 4 years ago

I was using the index, obviously, but since I had to run 2 billion queries on the same DSM, it would have taken 30 days to finish. I ended up removing the DBMS from the equation and loading the whole DSM into memory at once. I don't think there's a problem with pointcloud; rather, my use case was not suited to it.
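As a rough illustration of the "whole DSM in memory" approach described above, here is a minimal sketch, not the code actually used in this thread. It assumes a north-up raster with square cells. The class name `InMemoryDSM` and its parameters are hypothetical. The grid is a plain list of lists so the example stays self-contained; in practice the elevations would come from a raster file (the comment above mentions rasterio).

```python
import math
from typing import List, Optional


class InMemoryDSM:
    """Hypothetical helper: a DSM held entirely in memory as a row-major grid.

    Assumes a north-up raster with square cells, where (origin_x, origin_y)
    is the top-left corner of the top-left cell.
    """

    def __init__(self, grid: List[List[float]], origin_x: float,
                 origin_y: float, cell_size: float) -> None:
        self.grid = grid
        self.origin_x = origin_x
        self.origin_y = origin_y
        self.cell_size = cell_size
        self.rows = len(grid)
        self.cols = len(grid[0]) if grid else 0

    def elevation_at(self, x: float, y: float) -> Optional[float]:
        """Return the elevation of the cell containing (x, y), or None if outside."""
        # Columns grow with x; rows grow as y decreases (north-up raster).
        col = math.floor((x - self.origin_x) / self.cell_size)
        row = math.floor((self.origin_y - y) / self.cell_size)
        if 0 <= row < self.rows and 0 <= col < self.cols:
            return self.grid[row][col]
        return None


# Tiny 2x2 grid with made-up elevations, purely for illustration.
dsm = InMemoryDSM(
    grid=[[10.0, 11.0],
          [12.0, 13.0]],
    origin_x=100.0,
    origin_y=200.0,
    cell_size=1.0,
)

inside_top_left = dsm.elevation_at(100.5, 199.5)
inside_bottom_right = dsm.elevation_at(101.5, 198.5)
outside = dsm.elevation_at(99.0, 199.5)
```

Each lookup is a couple of arithmetic operations and one list index, with no query planning, index traversal, or client/server round trip. That is the main reason this approach can win when the same raster is queried billions of times.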

Remi-C commented 4 years ago

Sure, point clouds and databases are only useful in some situations. 2 billion queries is a lot!
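For scale, the figures quoted in this thread (2 billion queries, an estimated 30 days) imply a per-query budget that can be worked out directly. The sketch below assumes the queries run one after another with no parallelism, which the thread does not state; the variable names are illustrative only.

```python
SECONDS_PER_DAY = 24 * 60 * 60

total_queries = 2_000_000_000   # "2 billion queries" from the thread
estimated_days = 30             # "30 days to finish" from the thread

# Assumption: the queries run serially, one at a time.
total_seconds = estimated_days * SECONDS_PER_DAY

ms_per_query = total_seconds / total_queries * 1000
queries_per_second = total_queries / total_seconds

print(f"{ms_per_query:.3f} ms per query")
print(f"{queries_per_second:.0f} queries per second")
```

Under that assumption the estimate works out to roughly 1.3 ms per query, so the total runtime is driven by the sheer number of queries rather than by any single query being slow.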

autcrock commented 4 years ago

Thanks Gabriel and Remi.