wonder-sk commented 3 years ago

QGIS Enhancement: Point Clouds in QGIS

Date 2020/08/10

Author Martin Dobias (@wonder-sk)

Contact martin.dobias at lutraconsulting.co.uk

Maintainer @wonder-sk

Version QGIS 3.18

Crowdfunding https://www.lutraconsulting.co.uk/crowdfunding/pointcloud-qgis/

Summary

The aim of this proposal is to prepare infrastructure within QGIS to allow working with point cloud data. In the initial phase we are only concerned with visualization of the data - no editing or processing functionality is considered at this stage. The goal is to read most commonly used formats such as LAS/LAZ. We want to offer both 2D and 3D rendering with basic styling options (such as choosing which attribute to use and choice of colors and point sizes).

Figure (point-cloud-trencin): Point cloud data from airborne laser scanning, coloring by classification, visualized in CloudCompare (Data from UGKK SR)

Introduction

A point cloud is a set of points in 3D space, most often with some extra attributes such as color, classification and others. Point clouds most often come from two main sources:

Laser scanning (Lidar) - coming from airborne, terrestrial or mobile survey. Measurements are done by illuminating the target with laser light and measuring reflection with a sensor.
Photogrammetry - combining multiple pictures of a scene: software detects the positions from which the pictures were taken and uses pairs of pictures with overlapping parts of the scene to estimate depths.

Point clouds are an increasingly popular type of data used in GIS, especially as the price of the technology is going down. Even consumer grade electronics such as the newer iPad models include lidar functionality, and applications are available which can create point clouds from these devices. Nowadays point clouds from laser scanning are commonly used to create high resolution digital elevation models (DEM).

The main challenge of using point cloud data comes from the sheer number of points in datasets: we need to deal with many millions to billions of points for rather small geographical areas, with country-wide surveys getting to trillions of points. With this amount of data it is simply not possible to convert the dataset to “ordinary” point vector layers - typical GIS tools and data formats like GeoPackage or Shapefile are not optimized for such amounts of data and any operations would take a very long time.

Figure (point-cloud-top-elev): Point cloud data from airborne laser scanning, coloring by elevation, visualized in CloudCompare (Data from UGKK SR)

Design

We will introduce a new map layer type within QGIS: point cloud layer, which will be used for all point cloud data. As mentioned above, we can’t reuse existing vector layer type (with point geometries) as it is not designed to support such large amounts of points.

For access to file-based data we have decided to use PDAL (“point data abstraction library”). It follows the approach taken by GDAL library: it provides a common interface to read, write or process point cloud data and various drivers (“readers” and “writers” in PDAL terminology) take care of specifics of different formats. In addition to that, PDAL comes with a concept of pipelines where one can combine various processing algorithms - probably we would support them at some point later after the initial implementation.

The next thing we need to consider is how to efficiently display the data. Iterating over all points in a dataset and drawing them is simply not going to work well and it would take too long. It would also be quite wasteful - when zoomed out, only a small number of points need to be rendered. A second case to consider is that when a user zooms in a lot, we should be able to render only points within the area of interest rather than reading points outside of the current view. So we need a way to quickly query points in these different situations. This need is commonly solved by the use of space partitioning data structures (such as octrees) which hierarchically divide space into smaller volumes. An important feature is that not just leaves, but also internal nodes contain data: thanks to that we can not only quickly query points in a small area of interest, but we can also get smaller portions of points at different zoom levels.

Point cloud data in commonly used formats like LAS are rather simple sets of points which do not have any indexing data structure associated, and the order of points may be arbitrary. Therefore we need to build such an indexing data structure when opening these files in order to be able to render them - this is what other viewers (such as CloudCompare) do as well. As a part of indexing, we need to reorganize the points in order to be able to quickly access them when rendering. When indexing is done, we should have a tree data structure where each node contains a subset of the point cloud - such an indexed point cloud may be either stored on the disk or kept in memory.
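To make the idea of "data in internal nodes" concrete, here is a minimal Python sketch of building such a tree from unordered points. It is only an illustration under simplifying assumptions: the names `OctreeNode` and `insert` are hypothetical, and points are kept on a first-come basis, whereas a real indexer would choose the points stored at coarse levels so that they are evenly spaced.

```python
from dataclasses import dataclass, field


@dataclass
class OctreeNode:
    # Axis-aligned cube given by its minimum corner and edge length
    # (hypothetical layout, chosen only for this sketch).
    origin: tuple
    size: float
    points: list = field(default_factory=list)
    children: dict = field(default_factory=dict)  # octant index (0-7) -> node


def insert(node, point, capacity=2, max_depth=8, depth=0):
    """Store the point in the first node along its path that still has room."""
    if len(node.points) < capacity or depth == max_depth:
        node.points.append(point)
        return
    half = node.size / 2.0
    # One bit per axis: is the point in the upper half of the cube?
    bits = [int(point[i] >= node.origin[i] + half) for i in range(3)]
    octant = bits[0] | (bits[1] << 1) | (bits[2] << 2)
    if octant not in node.children:
        child_origin = tuple(node.origin[i] + bits[i] * half for i in range(3))
        node.children[octant] = OctreeNode(child_origin, half)
    insert(node.children[octant], point, capacity, max_depth, depth + 1)


def count(node):
    """Total number of points stored in the subtree."""
    return len(node.points) + sum(count(c) for c in node.children.values())


root = OctreeNode((0.0, 0.0, 0.0), 8.0)
for p in [(1, 1, 1), (7, 7, 7), (1, 7, 1), (6, 6, 6), (0.5, 0.5, 0.5)]:
    insert(root, p)
```

With this layout, reading only `root.points` gives a coarse overview of the whole dataset, and descending into children adds detail for a smaller volume.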

There are already various formats of indexed point clouds used by different pieces of software. All these formats are based on the same concept of using a hierarchical structure to partition the space, typically using an octree, with subsets of data stored in internal tree nodes:

Entwine Point Tile - generated by Entwine
Potree format - generated by PotreeConverter (actually two different formats - one in v1 and another one in v2)
3D Tiles
ESRI I3S / SLPK

It is desired to support some of these indexed formats natively in QGIS. PDAL already has some support, but it does not yet provide APIs for more fine-grained control (e.g. to get description of a particular node in the hierarchy, get data of just that node). It will be discussed with the PDAL community whether access to index hierarchy of such formats could be provided through PDAL APIs.

Implementation

Overview of the proposed new classes:

QgsPointCloudLayer - implementation of the new layer type, derived from QgsMapLayer
QgsPointCloudDataProvider - abstraction of data access, derived from QgsDataProvider
QgsPdalDataProvider - implementation of QgsPointCloudDataProvider
QgsPointCloudRequest - similar to QgsFeatureRequest for vector data, tells the data provider what subset of data to fetch
QgsPointCloudIterator - similar to QgsFeatureIterator for vector data, abstracts iteration over point cloud data
QgsPointCloudBlock - encapsulation of point cloud data, returned from iterator, contains a set of points
QgsPointCloudAttribute - definition of an attribute associated with each point (XYZ position, color, classification, intensity, …)
QgsPointCloudIndexNode - a node in the hierarchy of indexed point cloud, interface to be implemented by subclasses
QgsPointCloudRenderer - a base class for 2D renderers of point clouds, similar to QgsFeatureRenderer for vector layers
QgsPointCloudSimpleRenderer - implementation of QgsPointCloudRenderer that will provide basic 2D rendering of data
QgsPointCloudLayerRenderer - implementation of 2D rendering of point cloud layers, derived from QgsMapLayerRenderer
QgsPointCloud3DRenderer - implementation of 3D rendering of point clouds, derived from QgsAbstract3DRenderer
QgsPointCloudBlockCache - in-memory cache of previously fetched data - mainly for 2D rendering (for 3D rendering data are cached in GPU)

Indexing

If a data provider has an indexing structure that our implementation understands, it will provide access to its root node (a subclass of QgsPointCloudIndexNode). Point cloud layer will use it for querying data. Otherwise the data provider would return a null root node, indicating that QgsPointCloudLayer needs to build its own index. This would be done in a background worker thread after the layer gets loaded. While the index is being built, rendering (2D or 3D) would not be available. Indexing would either be just in-memory or with data stored in a temporary location on disk to avoid high RAM consumption. When using disk storage, it would be worth considering using an indexing format used elsewhere (e.g. by Entwine, Potree, 3D Tiles). Storage of index on disk would also avoid the need to run indexing again when loading data in QGIS later.

The options for code reuse are to be discussed with the PDAL community: currently PDAL does not offer access to indexing structures even if the underlying formats have them (such as EPT or I3S). The ideal solution would be if PDAL offered an API that QGIS could use for indexed point cloud formats, so that QGIS does not need extra providers for them. Similarly, it would be great if point cloud indexing could be done within PDAL. Currently PDAL does not offer such functionality, however there is a separate project Entwine from the makers of PDAL which does take care of creating indexed point clouds even for huge datasets.

QgsPointCloudIndexNode will be the base class for nodes in the hierarchy of point cloud indexes. It is up to the underlying implementation (which can be provider-specific) what kind of tree structure will be used as long as the following rules are fulfilled:

Each node has an assigned bounding volume (axis-aligned bounding box) - no points referenced in this node or its child nodes are allowed to be outside of it
Each node has a subset of point cloud data assigned
Each node has a specified maximum geometrical error (given in map units) - this information is used by rendering to determine whether this node has enough precision for the given view or whether its children need to be used, too
Each node may have zero or more children

Octrees are probably the most commonly used data structure for this purpose.
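As an illustration of the rules above, the following Python sketch models a node and checks the bounding volume rule for a whole subtree. The class `IndexNode` and its field names are hypothetical stand-ins for this example and are not the proposed QgsPointCloudIndexNode interface.

```python
from dataclasses import dataclass, field


@dataclass
class IndexNode:
    bbox_min: tuple          # (x, y, z) minimum corner of the bounding box
    bbox_max: tuple          # (x, y, z) maximum corner of the bounding box
    error: float             # maximum geometrical error, in map units
    points: list = field(default_factory=list)
    children: list = field(default_factory=list)


def contains(bbox_min, bbox_max, p):
    """True if point p lies inside the axis-aligned box (borders included)."""
    return all(bbox_min[i] <= p[i] <= bbox_max[i] for i in range(3))


def check_node(node, ancestor_boxes=()):
    """Check that every point in the subtree lies inside the box of the
    node holding it and inside the boxes of all of that node's ancestors."""
    boxes = ancestor_boxes + ((node.bbox_min, node.bbox_max),)
    for p in node.points:
        if not all(contains(lo, hi, p) for lo, hi in boxes):
            return False
    return all(check_node(child, boxes) for child in node.children)


child = IndexNode((0, 0, 0), (5, 5, 5), error=1.0, points=[(1, 2, 3)])
root = IndexNode((0, 0, 0), (10, 10, 10), error=2.0,
                 points=[(9, 9, 9)], children=[child])
```

A check like this could be useful in unit tests of an index implementation, since a violated bounding box would silently break culling during rendering.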

One complication is that for various point cloud data sources the whole hierarchy may not be immediately available - for example, when fetching data about root node, we also get information about a few more levels deeper, but to access further descendants, another network request may need to be done. Loading of nodes may therefore need to be implemented in an asynchronous way and a signal would get emitted whenever a new node gets loaded.

2D Rendering

The whole rendering process will be driven by the QgsPointCloudLayerRenderer class (where the bulk of the work will be done in a worker thread as is the case with other layer renderers). In the first stage, we will figure out the acceptable geometry error (i.e. acceptable spacing between points) based on the map scale - for example, at the scale 1:1.000.000 we may be fine with 1 km spacing between points, but at scale 1:1.000 we would need much lower, e.g. 1 meter spacing. This setting will be exposed to users to control as a “dynamic filtering” setting, in case the user desires to override the default setting (e.g. for a higher quality render or faster rendering time).
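Both examples in the previous paragraph correspond to tolerating about 1 mm of spacing on the rendered map. The Python sketch below turns that observation into a rule. It is an assumption drawn from those two numbers, not a decided formula: the function name, the `map_tolerance_mm` parameter and the use of meters as map units are all hypothetical.

```python
def acceptable_error(scale_denominator, map_tolerance_mm=1.0):
    """Acceptable point spacing in meters for a map at 1:scale_denominator.

    map_tolerance_mm is the spacing tolerated on the rendered map; this is
    the value a "dynamic filtering" setting could let the user override.
    """
    return scale_denominator * map_tolerance_mm / 1000.0


coarse = acceptable_error(1_000_000)   # spacing in meters at 1:1.000.000
fine = acceptable_error(1_000)         # spacing in meters at 1:1.000
```

Lowering `map_tolerance_mm` would request denser data (higher quality, slower render), while raising it would do the opposite.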

Then we will traverse the index hierarchy and find nodes intersecting the map view extent and having a sufficiently small geometry error. Finally we would fetch point cloud data from these nodes - either from the layer’s cache (if they are cached in memory already) or from the index nodes, and draw points one by one. Drawing of individual points would be handled by a subclass of QgsPointCloudRenderer.
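The traversal could look roughly like the following Python sketch. It reflects one reading of the text: a node that intersects the view is always used, and its children are used as well when the node's own error is larger than the acceptable one. The `Node` class and function names are hypothetical, and extents are reduced to 2D rectangles for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    extent: tuple            # (xmin, ymin, xmax, ymax) in map units
    error: float             # maximum geometrical error in map units
    children: list = field(default_factory=list)


def intersects(a, b):
    """True if two (xmin, ymin, xmax, ymax) rectangles overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def select_nodes(node, view_extent, max_error):
    """Collect nodes to draw for the given view and acceptable error."""
    if not intersects(node.extent, view_extent):
        return []            # outside of the view: skip the whole subtree
    selected = [node]
    if node.error > max_error:
        # Not precise enough for this view: use the children, too.
        for child in node.children:
            selected += select_nodes(child, view_extent, max_error)
    return selected


a1 = Node("a1", (0, 0, 25, 25), 2.5)
a = Node("a", (0, 0, 50, 50), 5.0, [a1])
b = Node("b", (50, 50, 100, 100), 5.0)
root = Node("root", (0, 0, 100, 100), 10.0, [a, b])

zoomed_out = [n.name for n in select_nodes(root, (0, 0, 30, 30), 6.0)]
zoomed_in = [n.name for n in select_nodes(root, (0, 0, 30, 30), 3.0)]
```

In this toy hierarchy, node `b` is never visited for the chosen view, which is where the savings over iterating all points come from.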

In the initial work we plan one renderer implementation: QgsPointCloudSimpleRenderer. It would draw points with fixed size, the color of points would be determined by one of the attributes picked by the user (e.g. elevation, classification, return number) and chosen color ramp. Additionally, some basic filtering options should be supported (e.g. only the last return, only particular value(s) from classification, only a particular range of elevations).
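A minimal Python sketch of what such a renderer would compute per point is given below: a filter decision and a color from a two-stop ramp. The function names and the representation of a point as a dictionary are hypothetical; the keys follow PDAL dimension names (`Z`, `Classification`, `ReturnNumber`, `NumberOfReturns`), and "last return" is taken to mean `ReturnNumber == NumberOfReturns`.

```python
def ramp_color(value, vmin, vmax, start=(0, 0, 255), end=(255, 0, 0)):
    """Linearly interpolate an RGB color between two ramp stops."""
    if vmax == vmin:
        t = 0.0
    else:
        # Clamp so that values outside the range get the end colors.
        t = min(1.0, max(0.0, (value - vmin) / (vmax - vmin)))
    return tuple(round(s + t * (e - s)) for s, e in zip(start, end))


def keep_point(point, classes=None, z_range=None, last_return_only=False):
    """Decide whether a point passes the basic filtering options."""
    if classes is not None and point["Classification"] not in classes:
        return False
    if z_range is not None and not (z_range[0] <= point["Z"] <= z_range[1]):
        return False
    if last_return_only and point["ReturnNumber"] != point["NumberOfReturns"]:
        return False
    return True


points = [
    {"Z": 120.0, "Classification": 2, "ReturnNumber": 1, "NumberOfReturns": 1},
    {"Z": 135.0, "Classification": 5, "ReturnNumber": 1, "NumberOfReturns": 3},
    {"Z": 121.0, "Classification": 2, "ReturnNumber": 3, "NumberOfReturns": 3},
]
# Keep only ground points (class 2 in the standard LAS classification)
# and color them by elevation.
ground = [p for p in points if keep_point(p, classes={2})]
colors = [ramp_color(p["Z"], 100.0, 140.0) for p in ground]
```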

In the future more renderers may be added, such as display of interpolated surface (calculated on the fly from point data), hillshade or contours.

3D Rendering

At the core, rendering of point clouds in 3D follows the same principles as the 2D rendering: using the hierarchical index, it will be determined which nodes should be displayed, based on that the point data would be loaded and displayed in the 3D scene. Some of the necessary infrastructure is already in place: QGIS 3D views support “chunked” rendering of terrain since the beginning (as the user gets closer to the terrain, more detailed elevation and map textures get rendered). As the mechanism is fairly generic and has been extended to vector layer data in the past, it can be extended for point clouds as well. Currently it has the limitation of always expecting a quadtree hierarchy, but this can be updated to allow more general tree hierarchies (octree or others).

The implementation would start with QgsPointCloud3DRenderer class, which would contain styling configuration and it would provide implementation of a 3D entity (Qt3DCore::QEntity) based on the existing QgsChunkedEntity class. The implementation will add a class derived from QgsChunkLoader which will handle fetching of point cloud data of a single node (chunk) and setting up vertex buffers, attributes and material for each node.

Styling options for 3D views should be similar to the options in 2D rendering: configuration of point size, coloring based on a single attribute and some simple data filtering options.

It would be very useful to also add the eye-dome lighting effect which improves the depth perception. For comparison - the same point cloud (colors according to classification), left image without eye-dome lighting, right image with eye-dome lighting:

Point cloud rendering, color coding based on classification (blue = ground, green = vegetation). Left: without eye-dome lighting. Right: with eye-dome lighting.

References

PDAL - Point Data Abstraction Library https://pdal.io/
Entwine - indexing of point clouds (built on top of PDAL) https://github.com/connormanning/entwine/
Potree - WebGL point cloud viewer https://github.com/potree/potree
CloudCompare - desktop app for point cloud visualization https://github.com/CloudCompare/CloudCompare
PotreeConverter - indexing of point clouds https://github.com/potree/PotreeConverter
Potree: Rendering Large Point Clouds in Web Browsers by Markus Schütz https://www.cg.tuwien.ac.at/research/publications/2016/SCHUETZ-2016-POT/SCHUETZ-2016-POT-thesis.pdf
3D Tiles (includes specification for point cloud data) https://github.com/CesiumGS/3d-tiles

Jean-Roc commented 3 years ago

Hi Martin, by itself the availability of PDAL as a provider alongside GDAL and MDAL would be a great addition!

Using ept-tools would allow to serve an index/octree built by entwine to 3dtiles/i3s, it brings us back to the time where opening a raster in a gis software would have triggered a dialog to build pyramids.

I differ on the genericity of the 3d rendering with this kind of load, an approach based only on height and volume hierarchy will quickly hit a performance wall without culling methods (depth, visibility, etc.).

jfbourdon commented 3 years ago

Just a few days ago I was looking for a plugin to view LAS inside QGIS but didn't find any. However, I found an open-source project that can be used for inspiration : displaz. You probably already have some work done and it may not help, but I really like the way the app is fluid and the way depth perception is added (the points are bigger the closer they are to the observer).

Also for inspiration, another simple viewer is the proprietary FugroViewer. I use it a lot to view profile cuts/slices. It can also render points and polygons from 3D shapefiles in its 3D view. It would be great if this future point clouds viewer could also display vectors and even raster for added context.

I can't personally help with the backend ideas, but as a regular LiDAR user, I have some ideas about which functionalities would be helpful.

hobu commented 3 years ago

it brings us back to the time where opening a raster in a gis software would have triggered a dialog to build pyramids.

The data quickly get too large to support rendering without some kind of preprocessing to organize the data. We can optimize for a specific processing task or we can optimize for multi-resolution visualization and access (EPT-style). The nice thing about Potree + EPT is it is well tested and does a great job on the pure visualization end of the spectrum. EPT's content encoding can be LAZ, but it can be compressed blobs of whatever too, and this gives a lot of flexibility for situations where the data don't fit an LAS model.

The downside to Entwine/EPT is it can be expensive to organize the data. It's a one-time cost, but it hurts. There are public registries of data like https://usgs.entwine.io, but it still costs. We could organize EPT as a kind of single-file "MBTiles" layout instead of exploded data convenient for online access as it is now, but that up-front cost would still be there.

Whether you choose EPT or choose to build something like it, in my opinion you definitely need something like it to succeed.

Some other considerations this QEP will need to discuss include:

Coordinate system considerations
Attribute mappings https://pdal.io/dimensions.html
PDAL streaming vs. non-streaming modes
User-controlled PDAL pipelines?
EPT addon support? https://pdal.io/stages/writers.ept_addon.html

roya0045 commented 3 years ago

In your data structure I assume that you could also have a way to limit the number of nodes rendered. If you use a region based approach linked to the scale I feel like you could limit the number of points to be rendered and still have enough details visually. Though this might be an issue if you want to do such operations as selecting data with the interface. But for pure visualisation it might be optimal.

I'm suggesting something similar to what is done with vector tiles and as suggested by Mr. Butler above.

skinkie commented 3 years ago

Instead of "rendering", would an OpenGL / EGL implementation also be an option?

wonder-sk commented 3 years ago

@hobu thanks for the list of topics not covered by the QEP so far!

Coordinate system considerations

As long as we can get SpatialReference from PDAL's reader class and extract WKT, things should be "easy" - the dataset's coordinate reference system (CRS) would be reported by the QgsPdalDataProvider (just like with other QGIS data providers) and we would let QGIS core classes handle CRS and on-the-fly transformations (including datum shifts) - with the help of PROJ of course :-)

Attribute mappings https://pdal.io/dimensions.html

This is indeed something where I do not have clear idea how to best do it. Originally I wanted to make things very simple and only aimed towards visualization - for example to always return for each point just XYZ + one optional attribute for visualization (and thus fixing data size of a point to 12 or 16 bytes, assuming that data types would be also fixed). I guess having more flexible approach with any number of attributes with arbitrary order and data types would be nicer and more flexible, possibly at the expense of speed of fetching data...? Happy to hear suggestions...
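To illustrate the trade-off in numbers, here is a small Python sketch using the standard `struct` module. The 12 and 16 byte figures are reproduced by assuming three 32-bit coordinates plus one optional 32-bit attribute; that layout is an assumption made for this example. The helper `record_layout` and the attribute list are likewise hypothetical and only show how a flexible layout would need per-attribute offsets and sizes.

```python
import struct

# Fixed layouts: XYZ as three 32-bit floats, optionally one more 32-bit value.
XYZ_ONLY = struct.Struct("<3f")
XYZ_PLUS_ONE = struct.Struct("<3ff")


def record_layout(attributes):
    """Compute byte offsets and total size of a packed point record.

    attributes is a list of (name, struct format character) pairs,
    stored one after another without padding.
    """
    offsets, size = {}, 0
    for name, code in attributes:
        offsets[name] = size
        size += struct.calcsize("<" + code)
    return offsets, size


# A flexible layout: scaled int32 coordinates, uint16 intensity, uint8 class.
offsets, size = record_layout([
    ("X", "i"), ("Y", "i"), ("Z", "i"),
    ("Intensity", "H"), ("Classification", "B"),
])
```

With a fixed layout the reader can index straight into a buffer, while the flexible layout needs the offset table to be consulted for every attribute access, which is the possible speed cost mentioned above.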

PDAL streaming vs. non-streaming modes

Streaming mode would be of course highly preferred. I see that not all readers support streaming mode, although probably the majority would not be too difficult to convert to streaming mode (but of course, it would be still quite some work). So, QgsPdalDataProvider would simply need to support both options. On QGIS side I would prefer to handle all data sources as if they were in streaming mode so that the client code does not need to deal with two options.

User-controlled PDAL pipelines?

In the initial phase I would stick to just predefined pipelines (which I assume would be most of the time just a single reader), but keep this functionality in mind for future. To support user-controlled PDAL pipelines nicely, we would probably need to do a lot of plumbing on the GUI side to make sure we can offer a good experience - grouped list of available modules, widgets for configuration of possible parameters, handling of defaults and so on. It could be really nice though if we could use the Layer Styling panel in QGIS to let user manipulate the pipeline and immediately see the results on 2D/3D canvas...

EPT addon support? https://pdal.io/stages/writers.ept_addon.html

This would be a nice feature to have, but probably not within the initial implementation to keep the scope limited...

wonder-sk commented 3 years ago

@Jean-Roc

Using ept-tools would allow to serve an index/octree built by entwine to 3dtiles/i3s

Using Entwine code to build the index for visualization is indeed something we are exploring...

I differ on the genericity of the 3d rendering with this kind of load, an approach based only on height and volume hierarchy will quickly hit a performance wall without culling methods (depth, visibility, etc.).

I am not sure if I understand the point you are making... In QGIS 3D we already have chunked approach to loading and visualizing data. Whatever is far from the camera will get displayed only with low amount of detail (or not at all), whatever appears right in front of the camera will be shown in sufficient detail (defined for each "chunk" as maximum allowed geometry/texture error in world coordinates, which gets transformed to screen space error, and finer/coarser chunks would be used if the error is outside of the acceptable range).
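For readers not familiar with the approach, the transformation from world-space error to screen-space error can be sketched in Python as follows. This is the usual perspective projection formula and not necessarily the exact computation done in QGIS 3D; the function names and the 3 pixel threshold are hypothetical.

```python
import math


def screen_space_error(world_error, distance, fov_y_deg, viewport_height_px):
    """Project an error given in world units to pixels on screen."""
    if distance <= 0:
        return math.inf      # camera inside the chunk: always refine
    half_fov = math.radians(fov_y_deg) / 2.0
    return world_error * viewport_height_px / (2.0 * distance * math.tan(half_fov))


def needs_refinement(world_error, distance, fov_y_deg, viewport_height_px,
                     max_screen_error_px=3.0):
    """True if finer chunks should replace this one for the current camera."""
    sse = screen_space_error(world_error, distance, fov_y_deg, viewport_height_px)
    return sse > max_screen_error_px


near = needs_refinement(1.0, 100.0, 90.0, 1000)    # chunk close to the camera
far = needs_refinement(1.0, 1000.0, 90.0, 1000)    # same chunk, 10x further away
```

The same chunk is therefore refined when it is close to the camera and left coarse when it is far away, independently of the viewing angle.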

@jfbourdon

I found an open-source project that can be used for inspiration : displaz.

Thanks for the pointer, always useful to get some inspiration from elsewhere.

It would be great if this future point clouds viewer could also display vectors and even raster for added context.

Vectors and rasters are already supported in QGIS 3D map - only point clouds are missing :-) When this is implemented, any vector and raster data should nicely blend together...

I can't personally help with the backend ideas, but as a regular LiDAR user, I have some ideas about which functionalities would be helpful.

Feel free to post your wish list of functionality - it's good to know what people are keen to see in QGIS.

@roya0045

In your data structure I assume that you could also have a way to limit the number of nodes rendered. If you use a region based approach linked to the scale I feel like you could limit the number of points to be rendered and still have enough details visually.

Maybe I misinterpret your statement, but I think this is covered in the QEP: the indexing structure should contain point data also in the internal nodes of the tree hierarchy, not just the leaves. Therefore if you are zoomed out, you would be only looking at a small subset of points coming from the root node (or maybe a level or two deeper)...

@skinkie

Instead of "rendering", would an OpenGL / EGL implementation also be an option?

3D rendering is done with Qt3D framework which uses OpenGL under the hood. 2D rendering would not be done with OpenGL (at least for now).

hobu commented 3 years ago

Coordinate system considerations

As long as we can get SpatialReference from PDAL's reader class and extract WKT, things should be "easy" - the dataset's coordinate reference system (CRS) would be reported by the QgsPdalDataProvider (just like with other QGIS data providers) and we would let QGIS core classes handle CRS and on-the-fly transformations (including datum shifts) - with the help of PROJ of course :-)

PDAL's filters.reprojection will allow you to reproject points in-line with other PDAL pipeline processing operations. It simply defers to GDAL/PROJ, but the per-point cost might be lighter weight than bringing it up to a QGIS vector instance and using its reprojection operation.
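For reference, a pipeline of this kind is just a small JSON document. The Python sketch below only builds that JSON with the standard library and does not call PDAL itself; the file name `input.las`, the target CRS and the helper name `reprojection_pipeline` are placeholders chosen for the example.

```python
import json


def reprojection_pipeline(filename, out_srs):
    """Build a PDAL pipeline (as JSON text) that reads a file and
    reprojects its points with filters.reprojection."""
    pipeline = {
        "pipeline": [
            filename,  # PDAL infers the reader from the file extension
            {"type": "filters.reprojection", "out_srs": out_srs},
        ]
    }
    return json.dumps(pipeline, indent=2)


pipeline_json = reprojection_pipeline("input.las", "EPSG:3857")
```

The resulting text could be saved to a file and run with the `pdal pipeline` command, or handed to the PDAL bindings of the calling application.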

Another SRS consideration is what to do about coordinate systems that aren't rectilinear. Disallow? Warn? Act like nothing is different?

wonder-sk commented 3 years ago

PDAL's filters.reprojection will allow you to reproject points in-line with other PDAL pipeline processing operations. It simply defers to GDAL/PROJ, but the per-point cost might be lighter weight than bringing it up to a QGIS vector instance and using its reprojection operation.

Using PDAL's filters.reprojection could be an option, however that would require some more changes in QGIS core: at this point QGIS data providers are not expected to be able to do reprojection, only to report their CRS - the information about the CRS of the map view in QGIS does not even get to the data provider. QGIS also provides low-level API for CRS transform with PROJ without having to wrap coordinates into more complex structures. One more advantage is that QGIS also has a coordinate transform context which captures user's preferences about the most appropriate transform in case there are multiple transforms between a pair of CRS. So my preference would be to stick to QGIS functionality...

Another SRS consideration is what to do about coordinate systems that aren't rectilinear. Disallow? Warn? Act like nothing is different?

In QGIS we leave that up to the user. So if input data are in EPSG:4326 we would by default display data in that CRS. There are some cases where some action is taken: for example advanced digitizing dock widget. Similarly, QGIS 3D view will refuse to open if project CRS is lat/lon. I guess this consideration is not specific to point clouds - it applies to all kinds of map layers in QGIS...

Jean-Roc commented 3 years ago

@wonder-sk, a few examples :

when I try to display a DEM (TIF COG, LIDAR, 0.3cm/px) with a 64px tilemap resolution and set an oblique view, the screen will flicker for a long time but not if I strongly limit the angle of sight
when displaying an extruded vector layer (+600k) with a default extrusion height on plane, it seems to load the whole layer and show a lot of artefacts when set to an oblique view

If needed I can open issues, I've not done it yet because it seems to me to be linked to frustum culling methods and other rendering optimizations for "large" projects which would need to be funded (which I have not at this time). These issues are present for raster/vector and, in my point of view, will be even more visible with point cloud datasets.

wonder-sk commented 3 years ago

@Jean-Roc ah okay I see. It would be worth filing tickets for those issues with some sample data. Large DEM: it could be that the culling strategy is not working well in some cases. Big vector layer: yes it gets fully loaded at this point, we need some sub-sampling of data to avoid that (a missing feature). Artifacts: Probably yet another unrelated issue...

luipir commented 3 years ago

Hi @wonder-sk

Can you explain the need to create new classes for request, iterator and attributes? QgsPointCloudRequest QgsPointCloudIterator QgsPointCloudAttribute

wonder-sk commented 3 years ago

@luipir The idea is that we need some interface to fetch point data. And when fetching point data, we may want to fetch only a subset of data, for example:

fetch only some attributes instead of all of them
apply spatial filtering using a map extent rectangle
apply filtering based on an attribute (e.g. only last return)

Configuration of the data request should be handled by QgsPointCloudRequest - in fact very similar to how QgsFeatureRequest works.

Once we have a request ready and we let the data provider handle it, we may not want to immediately fetch all points (which may consume a lot of memory) - rather we want to fetch data in smaller blocks, for example a couple of thousand points at a time. This is where QgsPointCloudIterator would come in handy - data provider gives us an iterator and we would be fetching points from it as needed. Again, very similar to QgsFeatureIterator, just we would be fetching many points at once instead of just a single vector feature which is the case of QgsFeatureIterator.

And finally QgsPointCloudAttribute is a class that should encapsulate attributes of point cloud data - this would be attribute name, data type (e.g. int32, float, double, ...) and maybe some more information if needed. It would be similar to QgsField which you know from vector layers.
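To show how the three pieces would fit together, here is a rough Python sketch. It is not the proposed C++ API: `PointCloudRequest` and `iterate_blocks` are hypothetical names, points are plain dictionaries, and the block size is tiny so that the behaviour is easy to follow.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass
class PointCloudRequest:
    attributes: Sequence[str] = ("X", "Y", "Z")   # which attributes to fetch
    extent: Optional[tuple] = None                # (xmin, ymin, xmax, ymax)
    point_filter: Optional[Callable[[dict], bool]] = None


def iterate_blocks(points, request, block_size=2):
    """Yield lists of at most block_size points matching the request."""
    block = []
    for p in points:
        if request.extent is not None:
            xmin, ymin, xmax, ymax = request.extent
            if not (xmin <= p["X"] <= xmax and ymin <= p["Y"] <= ymax):
                continue
        if request.point_filter is not None and not request.point_filter(p):
            continue
        # Copy only the requested attributes into the output block.
        block.append({name: p[name] for name in request.attributes})
        if len(block) == block_size:
            yield block
            block = []
    if block:
        yield block            # last, possibly shorter block


source = [
    {"X": 1, "Y": 1, "Z": 10, "ReturnNumber": 1, "NumberOfReturns": 1},
    {"X": 2, "Y": 2, "Z": 11, "ReturnNumber": 1, "NumberOfReturns": 2},
    {"X": 3, "Y": 3, "Z": 12, "ReturnNumber": 2, "NumberOfReturns": 2},
    {"X": 4, "Y": 4, "Z": 13, "ReturnNumber": 1, "NumberOfReturns": 1},
    {"X": 50, "Y": 50, "Z": 14, "ReturnNumber": 1, "NumberOfReturns": 1},
]
request = PointCloudRequest(
    attributes=("X", "Y", "Z"),
    extent=(0, 0, 10, 10),
    point_filter=lambda p: p["ReturnNumber"] == p["NumberOfReturns"],  # last return
)
blocks = list(iterate_blocks(source, request))
```

Because `iterate_blocks` is a generator, the caller decides how many blocks to pull, so memory use stays bounded by the block size rather than by the dataset size.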

luipir commented 3 years ago

@wonder-sk

about QgsPointCloudIterator seems a prefetching virtual method (private because used by iterator) can satisfy managing of huge data and can support also other providers if implemented
about QgsFeatureRequests I would add a 3d polygon or frustum constructor to manage QgsPointCloudRequests
about QgsPointCloudAttribute I can't see any improvement adding a new class doing the same

I still can't see the need for new classes, but probably because I didn't dig into the implementation problems... I'm remaining at design level.

wonder-sk commented 3 years ago

about QgsPointCloudIterator seems a prefetching virtual method (private because used by iterator) can satisfy managing of huge data and can support also other providers if implemented

Sorry I am confused about what you suggest here... Could you be more specific? What would the virtual method look like, what would it return? And what do you mean by prefetching?

about QgsFeatureRequests I would add a 3d polygon or frustum constructor to manage QgsPointCloudRequests

Frustum-based culling would be done by using octree indexing structure and the existing QgsChunkedEntity implementation, which would decide whether a node of the octree would be fully rendered or not rendered at all. I agree that filtering by a 3D box or extruded polygon would be useful (but I am not sure if we will need it in the initial implementation).

about QgsPointCloudAttribute I can't see any improvement adding a new class doing the same

Do you mean compared to QgsField? It's not the same thing, just a similar concept, but the details are different. For example, there are various "standard" attributes in point clouds that have known meanings (and data types) while that's not the case for vector layer fields... Or data types are only numeric and we care a lot about the exact type (32 vs 64 bit, signed vs unsigned) while in QgsField we widely use a generic QVariant::Int for any integer...

nyalldawson commented 3 years ago

I agree that filtering by a 3D box or extruded polygon would be useful (but I am not sure if we will need it in the initial implementation).

I'd like to see this added in QgsPointCloudRequests, as it would make it possible for the 2d renderer to easily filter by z value and show horizontal "slices" of data

SpatialDigger commented 3 years ago

I'm a digital archaeologist. I/we use PDAL to store pointclouds as patches in PostGIS with the pgpointcloud extension. I can view the pointclouds using SQL through the explode() syntax. Currently I export pointclouds/meshes to ply so I can display them via python/C++ bindings. Staying in a GIS environment would really improve the workflow of our teams.

For future development: I'm also trying to store the solid meshes in PostgreSQL+PostGIS, I wrote a parser to do so but the meshes are too complex to be presented even through raw sql (e.g. ST_Volume).

For a 10x10x3m trench I have 701 GNSS georeferenced (UTM) pointclouds; I think this is 56 billion points, and they represent 444 volumes. In my day job I also work with LiDAR 2D and 3D pointclouds.

Archaeological datasets are usually amongst the most complex of any industry as they reflect the real world and must be accurate and are not overly compatible with GeoDesign concepts/software. If something works for archaeology it usually works elsewhere (Smart Cities etc.). If you need access to some test datasets, upon discussions with my boss, I might be able to release some to you and provide user testing through our team, please get in touch if interested.

jedfrechette commented 3 years ago

Attribute mappings https://pdal.io/dimensions.html

This is indeed something where I do not have clear idea how to best do it. Originally I wanted to make things very simple and only aimed towards visualization - for example to always return for each point just XYZ + one optional attribute for visualization (and thus fixing data size of a point to 12 or 16 bytes, assuming that data types would be also fixed). I guess having more flexible approach with any number of attributes with arbitrary order and data types would be nicer and more flexible, possibly at the expense of speed of fetching data...? Happy to hear suggestions...

I'd strongly advocate for a more flexible approach. Much of the point cloud data we work with doesn't fit nicely into a traditional "LAS" data model, so PDAL's built-in ability to work seamlessly with arbitrary point attributes has been extremely valuable. Even from just a visualization standpoint, having only one attribute seems like it will become limiting pretty quickly. How would you handle the common case where points have RGB color values?
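To make the trade-off concrete, here is a small Python sketch comparing the fixed 12/16-byte record with a layout built from an attribute list. The 32-bit integer coordinates, the type names and the `build_layout` helper are assumptions for illustration, not the proposal's actual data structures.

```python
import struct

# Fixed layout from the quoted idea: XYZ as three 32-bit integers (12 bytes),
# optionally followed by one 32-bit attribute (16 bytes). The int32 choice is assumed.
XYZ_ONLY = struct.Struct("<3i")
XYZ_PLUS_ONE = struct.Struct("<3if")

# Flexible layout: the format string is built per dataset from an attribute list.
TYPE_CODES = {"int32": "i", "uint16": "H", "uint8": "B", "float32": "f", "float64": "d"}


def build_layout(attributes):
    """attributes: list of (name, type) pairs; returns a packed little-endian Struct."""
    return struct.Struct("<" + "".join(TYPE_CODES[t] for _, t in attributes))


# A point with RGB color and a classification code no longer fits in 16 bytes.
rgb_cloud = build_layout([
    ("X", "int32"), ("Y", "int32"), ("Z", "int32"),
    ("Red", "uint16"), ("Green", "uint16"), ("Blue", "uint16"),
    ("Classification", "uint8"),
])
record = rgb_cloud.pack(100, 200, 300, 65535, 32768, 0, 2)
```

The flexible variant costs a per-dataset format lookup and a larger record, which is the speed concern raised in the quoted text.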

As a user I would want to have very similar symbology options to what are available for raster layers in QGIS:

1. Map 3 attributes/dimensions/bands (not necessarily R, G, and B) to R, G, and B color channels.
2. Map 1 attribute to a ramped color map.
3. Map 1 attribute to a paletted color map. This is just a special case of 2.
4. Map 1 attribute to transparency.

All the min, max, and contrast stretch settings available for raster layers would be useful for point clouds too.
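The first three options can be sketched in a few lines of Python. All function names, the two-stop ramp and the palette colors below are hypothetical; this is only meant to show how a min/max stretch feeds each mapping, not how QGIS raster or point cloud renderers are implemented.

```python
def stretch(value, vmin, vmax):
    """Linear min/max contrast stretch to [0, 1], clamped at both ends."""
    if vmax <= vmin:
        return 0.0
    return min(1.0, max(0.0, (value - vmin) / (vmax - vmin)))


def rgb_from_three(values, ranges):
    """Map three attributes to the R, G and B channels, each with its own range."""
    return tuple(int(round(stretch(v, lo, hi) * 255)) for v, (lo, hi) in zip(values, ranges))


def ramp_color(value, vmin, vmax, start=(0, 0, 255), end=(255, 0, 0)):
    """Map one attribute to a two-stop color ramp (blue to red by default)."""
    t = stretch(value, vmin, vmax)
    return tuple(int(round(s + (e - s) * t)) for s, e in zip(start, end))


def palette_color(code, palette, default=(128, 128, 128)):
    """Map one attribute to discrete colors, e.g. classification codes."""
    return palette.get(code, default)


# Hypothetical palette: class 2 drawn brown, class 5 drawn green.
palette = {2: (139, 69, 19), 5: (0, 128, 0)}
```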

It would be very useful to also add the eye-dome lighting effect which improves the depth perception.

EDL works reasonably well for most point clouds regardless of their content so it would be great to have as a baseline. However, it would also be nice to consider higher quality shading options if a particular point cloud can support them. For example, if points have good quality normals you can get much clearer geometry renders than if you just have EDL.
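For readers unfamiliar with EDL, the core idea can be sketched on a plain depth grid: a pixel is darkened in proportion to how far it lies behind its neighbours, compared in log space. This is a simplified CPU illustration with a hypothetical function name and a 4-neighbour kernel, not the shader used by QGIS or any particular viewer.

```python
import math


def edl_shade(depth, strength=1.0):
    """Return a shading factor in (0, 1] per pixel of a 2D grid of positive depths."""
    rows, cols = len(depth), len(depth[0])
    shade = [[1.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            d = math.log2(depth[r][c])
            total, count = 0.0, 0
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    # Only neighbours that are closer to the camera darken this pixel.
                    total += max(0.0, d - math.log2(depth[rr][cc]))
                    count += 1
            if count:
                shade[r][c] = math.exp(-strength * total / count)
    return shade
```

Because it needs only a depth buffer and no per-point normals, this kind of shading works on any point cloud, which matches the "baseline" role suggested above.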

it brings us back to the time where opening a raster in a gis software would have triggered a dialog to build pyramids. ...clip... The downside to Entwine/EPT is it can be expensive to organize the data. It's a one-time cost, but it hurts.

and also

EPT addon support? https://pdal.io/stages/writers.ept_addon.html

This would be a nice feature to have, but probably not within the initial implementation to keep the scope limited...

It seems like there are a couple of potential use cases that should be supported.

A user may want to just quickly inspect some point cloud in the context of other spatial data. In this case, although they still need to pay the price of indexing, it may not be necessary to retain that index after the current QGIS session.

Alternatively, if the user is going to keep working with the same point cloud or an organization is prepping data that will be accessed by multiple users they would definitely want to cache that index and only update it when necessary. Users should still have the ability to generate caches from inside QGIS, but for this scenario I think the ability to generate the required caches without needing to rely on the QGIS GUI would also be valuable. Depending on the specific scenario, the ideal place for that cache might be a local drive, a shared network drive, or a dedicated repo similar to usgs.entwine.io.

wonder-sk commented 3 years ago

@garynobles

I'm a digital archaeologist I/we use PDAL to store pointclouds in patches PostGIS with the pgpointcloud extension. I can view the pointclouds using SQL through the explode() syntax. Currently I export pointclouds/meshes to ply so I can display them via python/C++ bindings. Staying in a GIS environment would really improve the workflow of our teams.

As this work is planned to make use of PDAL, you should get read access to data in pgpointcloud "for free". By the way, recently the MDAL library has received read support for meshes in PLY format, so in the upcoming version of QGIS you may be able to load it without any extra code...

For future development: I'm also trying to store the solid meshes in PostgreSQL+PostGIS, I wrote a parser to do so but the meshes are too complex to be presented even through raw sql (e.g. ST_Volume).

Not sure what you mean here... QGIS should be able to load polyhedral surfaces stored in PostGIS as multi-polygons and QGIS 3D should be able to visualize them - but if you have any issues with it let's move the discussion to a separate QGIS ticket so that we don't drift away from our topic :-)

If you need access to some test datasets, upon discussions with my boss, I might be able to release some to you and provide user testing through our team, please get in touch if interested.

Sure, it's always useful to have some more sample data for testing. So far I have mainly samples of point clouds from airborne lidar - if your data are from terrestrial lidar or photogrammetry (with RGB), it would be interesting to get hold of that too. Please feel free to contact me by mail...

wonder-sk commented 3 years ago

@jedfrechette Thanks for your input.

As for attributes/dimensions, yeah I am leaning towards a more flexible approach.

Regarding shading/EDL - agreed that pre-calculated normals may give even better visual output, but my assumption (maybe incorrect?) was that most of the time, users don't have data with normals and getting them calculated may need extra knowledge. But yes, if normals are available, ideally we should be able to use them (probably not in the initial implementation though, to limit the scope).

A user may want to just quickly inspect some point cloud in the context of other spatial data. In this case, although they still need to pay the price of indexing, it may not be necessary to retain that index after the current QGIS session.

Right. I am still wondering what would be the best default behavior for index files. I can think of three options:

use a temporary file, delete it when QGIS terminates
cache index files within the QGIS user profile up to some amount of disk space (just like we do with caching of network replies)
save index files next to the original files (permanently)

Each of the options has some pros and cons...

Alternatively, if the user is going to keep working with the same point cloud or an organization is prepping data that will be accessed by multiple users they would definitely want to cache that index and only update it when necessary.

Yeah. The idea is that we may support various kinds of point cloud indexes - e.g. Entwine's EPT, Potree format, 3D Tiles and others. That should tick the box for the use cases where data preparation would be done in advance so that multiple users don't need to spend any time indexing data again and again...

jedfrechette commented 3 years ago

Sure, it's always useful to have some more sample data for testing. So far I have mainly samples of point clouds from airborne lidar - if your data are from terrestrial lidar or photogrammetry (with RGB), it would be interesting to get hold of that too.

They're all toy data sets, but the libe57 samples have a decent variety of small terrestrial examples. I can help with more real-world terrestrial samples too if/when you need them.

As an aside, I think the current navigation controls in the 3D viewport work pretty well for the kind of mostly 2D environments you typically have with GIS data or aerial point clouds. I suspect, however, they will start feeling a little clunky in more fully 3D environments like you tend to get with terrestrial point clouds. The tool interaction model might also need some tweaks to make dealing with sparse data sets like point clouds more comfortable compared to working with continuous surfaces like you can do now. I've got some UI opinions about those issues but that's a different discussion.

my assumption (maybe incorrect?) was that most of the time, users don't have data with normals and getting them calculated may need extra knowledge.

I can only speak for our (terrestrial) lidar pipeline, but we go to a great deal of effort to estimate decent normals at the beginning of the process and then carry them through to the end. Visualization is one reason, but analytically they're super useful as well.

I agree though EDL should be a higher priority for the initial implementation since it covers all use cases. Good normals are far from guaranteed to be available.

save index files next to the original files (permanently)

This would probably be my vote. It saves the cost of reindexing if a cloud is used multiple times, offers a low-tech way of preindexing for multiple users, and usage of these types of sidecar files is fairly familiar to users. GDAL external overviews seem like a good analogy.

roya0045 commented 3 years ago

I also think that associating the index with the files is the best option.

jfbourdon commented 3 years ago

I just want to point out that for traditional airborne LiDAR in LAS/LAZ format, a spatial index called LAX (more info here) can be created with LASlib to speed up queries. So in that case it is a sidecar file.

On the subject of sidecar files, I don't think that they should be written by default alongside the original files. In an organisation, the source data may often be protected on a server inside a directory with read-only access. In that case, if the files are read directly, QGIS will not be able to write the index files next to the original files. As a fallback, could QGIS write those inside the user profile directory (for example) and hardlink them to the original files?
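A possible fallback can be sketched in Python as follows. Note that this swaps the hard link idea for a simpler lookup keyed by the source path; the function name, the `.index` suffix and the cache directory are all hypothetical.

```python
import hashlib
import os
from pathlib import Path


def index_path_for(source: Path, cache_dir: Path, suffix: str = ".index") -> Path:
    """Prefer a sidecar next to the source; fall back to a per-user cache directory."""
    if os.access(source.parent, os.W_OK):
        return source.with_name(source.name + suffix)
    # Read-only (or missing) directory: derive a stable file name from the
    # absolute source path so the same dataset always maps to the same index.
    key = hashlib.sha256(str(source.resolve()).encode("utf-8")).hexdigest()[:16]
    return cache_dir / (key + suffix)
```

A real implementation would also need to notice when the source file changes (for example by storing its size and modification time with the index), which this sketch leaves out.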

wonder-sk commented 3 years ago

@jfbourdon I am aware of LAX files, but they are not a good fit for display purposes. They can answer "give me all points in this 2D rectangle", but they can't answer "give me 1% of points from the full extent". The latter request is important for display, and because of that the LAX files are not so useful. (Another reason is that the LAX index just keeps pointers to ranges of points.)
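The "1% of points from the full extent" request can be illustrated with a toy multi-resolution layout in Python. Real indexes such as EPT or Potree organize points in an octree of nodes; this sketch assumes a much simpler scheme (a random shuffle split into levels of growing size), and every name in it is hypothetical.

```python
import random


def build_levels(points, base=100, factor=8, seed=0):
    """Shuffle points and split them into levels; each level is `factor` times larger."""
    shuffled = list(points)
    random.Random(seed).shuffle(shuffled)
    levels, start, size = [], 0, base
    while start < len(shuffled):
        levels.append(shuffled[start:start + size])
        start += size
        size *= factor
    return levels


def overview(levels, fraction, total):
    """Return about `fraction` of all points by reading only the coarsest levels."""
    wanted = max(1, int(total * fraction))
    out = []
    for level in levels:
        if len(out) >= wanted:
            break
        out.extend(level)
    return out[:wanted]


cloud = [(i, i, i) for i in range(100_000)]
levels = build_levels(cloud)
one_percent = overview(levels, 0.01, len(cloud))
```

Since every level is a random sample of the whole extent, a coarse overview never has to touch the bulk of the data, which is what a purely spatial index like LAX cannot offer.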

Yeah, for organizations it's indeed common to have read-only files on a network drive and we would need a fallback strategy as you outline. However, for such a use case in an organization, it would make more sense to get the point cloud dataset pre-indexed with e.g. Entwine/PotreeConverter so that QGIS could then use the data immediately without creating extra index files.

jedfrechette commented 3 years ago

With regard to indexes and acceleration structures, NanoVDB might be another one to take a look at. It is a new open-source library, developed by Nvidia, that focuses on a subset of OpenVDB's functionality for volumes (and point clouds), namely real-time rendering and collision detection. I don't have a deep enough understanding of the problem domain to know if it would be a good fit, but thought I would leave it here in case it is of interest and could be useful.

High-level intro video presented at SIGGRAPH 2020 last week (discussion of NanoVDB starts at 9:00):

https://youtu.be/VJBv9lh5kqg?t=540

NanoVDB feature branch in the OpenVDB repository:

https://github.com/AcademySoftwareFoundation/openvdb/tree/feature/nanovdb

wonder-sk commented 3 years ago

In case anyone has missed the announcement, there is an ongoing crowd funding campaign organized as a joint effort by Lutra Consulting, North Road and Hobu:

https://www.lutraconsulting.co.uk/crowdfunding/pointcloud-qgis/

We would appreciate it if you helped us spread the word!

Jean-Roc commented 3 years ago

PotreeConverter is back to a free licence. By itself, its speed would make it a good candidate to create an auxiliary file for most single-user cases; the only thing lacking is compression. You could have the generated .bin alongside the original file, like we had rrd/aux for rasters.

mriedo commented 3 years ago

You are right, PotreeConverter is back to a free licence and it's a killer. Thanks to Markus Schütz for his commitment to open source and the amazing technology he brought to the Lidar community! It radically changed our way of working with this data.

I definitely agree it's an excellent candidate for Lidar indexing, optimization and visualisation.

I have been extensively using the new converter and the new format, and have converted all kinds of aerial, lidar, terrestrial and photogrammetric point clouds, more than 100 billion points. I was able to convert 3800 LAS files to a single 2 TB indexed file that can be opened in the Potree viewer in a second. You can check some of the data and Potree performance on our 3D Lidar geoportal https://sitn.ne.ch/lidar

I just received a mobile lidar survey of 400 km of roads with 65 billion points. It was converted in 3 hours and will be online soon. Here is a short video: https://sitn.ne.ch/web/diffusion/videos/conv2/conv2.mp4

You mention compression as a lacking part, but it's already financed and should be available this month! Markus also has a clear roadmap to make improvements to this already impressive solution.

The format is also open to LAS attribute extension, which means that you can add any additional attributes to the points: dip, dip direction, normals, segmentation, etc. and they are recognized by the converter. Here is a short video showing the extra attributes support and visualization: https://sitn.ne.ch/web/diffusion/videos/dip/dip.mp4

Here is a description of some of our data : https://sitn.ne.ch/web/plaquettes/lidar2016.pdf https://sitn.ne.ch/web/plaquettes/lidar2018.pdf

mriedo commented 3 years ago

For those interested in more information about Lidar and how it is used, here is a Lidar course I recently prepared (in French, but I am working on its translation into English): https://sitn.ne.ch/web/diffusion/lidar/formation_lidar_sitn_2019.pdf https://sitn.ne.ch/web/diffusion/lidar/formation_lidar_sitn_2019.pptx

gnerred commented 3 years ago

Yes, it would be great to use the Potree format for this QGIS enhancement!

monodo commented 3 years ago

Potree format support would provide the following benefits (not exhaustive):

Excellent indexing performance thanks to PotreeConverter
Limited number of files created by the converter (version 2) for simplified data transfer
Advanced web client supporting the rendering of billions of points. See our Potree instance here: https://mapnv.ch/lidar/
CPotree library for 2D profile extraction with impressive performance: https://github.com/potree/CPotree Check the Lidar profile tool on https://mapnv.ch to see it live.

m-schuetz commented 3 years ago

Yeah. The idea is that we may support various kinds of point cloud indexes - e.g. Entwine's EPT, Potree format, 3D Tiles and others.

That would be pretty awesome. In Potree, the Potree format and Entwine EPT format both use virtually the same rendering code since they're both just boxes with points inside. It's mainly the loader that differs so it'd be great if there were support for different loaders in QGIS.

EDIT: I'm working on a PotreeLoader for CPotree that should ideally work as a standalone header or a .h and .cpp file, so it is easy to extract and drop in somewhere else.

roya0045 commented 3 years ago

Congrats on getting the minimum funding needed already! 💯

sfkeller commented 3 years ago

@wonder-sk commented https://github.com/qgis/QGIS-Enhancement-Proposals/issues/194#issuecomment-683332810

As this work is planned to make use of PDAL, you should get read access to data in pgpointcloud "for free".

That's correct: given that PDAL has a pgpointcloud reader/writer for the PostgreSQL pointcloud extension, access to PostgreSQL pointcloud is "for free".

But this is a clumsy detour (with unnecessary copying of object structures) as compared to direct access from the PostgreSQL database.

Such a direct access would probably even help to make this Point Cloud reader more generic.

But pls. prove me wrong since I don't know the internals of this QGIS extension.

hobu commented 3 years ago

But this is a clumsy detour (with unnecessary copying of object structures) as compared to direct access from the PostgreSQL database.

I disagree. There isn't tiff-specific or png-specific code in QGIS – it uses GDAL to abstract access to the data. For the same reason, it should use PDAL for access to many formats and grind that data into an organization suitable for rendering and operations of QGIS.

Additionally, the pgpointcloud organization isn't suitable for multiresolution rendering. It is blocks of full density data.

sfkeller commented 3 years ago

I see and agree then. I just want to prevent too specific requirements on the one hand, and to delimit the use cases on the other hand.

So, what about the following use case: what I'm actually most interested in isn't a giffy 3D raster image, but a 3D mesh which can be analyzed. With this extension, are all or some of the following steps possible (or is this beyond the scope of this proposal)?

Reduce density (sub sample) of point clouds
Compute normals
Convert point clouds to a mesh (surface)

mriedo commented 3 years ago

I agree with Stefan that these 3 functions are very useful, but I don't know if they will already be in the first implementation, which is more focused on visualisation. For the moment these 3 functionalities are available in CloudCompare and I use them on a daily basis.

To reduce density, there are many options from simple (random selection of fewer points) to more sophisticated (keypoints). The most useful one is to reduce to keypoints, i.e. keep only characteristic points. In FME the PointCloudThinner offers only simpler options (random) but is still useful: http://docs.safe.com/fme/html/FME_Desktop_Documentation/FME_Transformers/Transformers/pointcloudthinner.htm It's also useful to reduce density by giving a target distance between points: to reduce from 1000 pts/m2 to 100 pts/m2, give a target distance of 10 cm between points. Terrascan software has the most sophisticated options I know of to reduce density.
Computing normals is also very useful. Normals can be used for visualisation and for analysis. Normals can be transformed to dip and dip orientation. Normals, dip and dip orientation are very useful for analysing road data (rut detection) and can be stored in LAS as extra attributes.
Converting point clouds to a mesh is very useful. It's important for creating a DTM (Delaunay) and is found in a lot of software handling lidar data. Recently, with mobile lidar data, I found it also useful to be able to create meshes not only for terrain but for more complex objects like tunnels, trees, ... for which a Delaunay triangulation doesn't work and you need a Poisson reconstruction. This is for instance available in the open source MeshLab or the free CloudCompare.
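The density/distance relation used in the first item can be checked with a couple of lines of Python. The sketch assumes points spread evenly on a square grid, so the numbers are approximations; the function name is hypothetical.

```python
import math


def spacing_for_density(points_per_m2: float) -> float:
    """Approximate point spacing in metres for a given density, assuming a square grid."""
    return 1.0 / math.sqrt(points_per_m2)


target_spacing = spacing_for_density(100)    # target density of 100 pts/m2
source_spacing = spacing_for_density(1000)   # source density of 1000 pts/m2
```

Under that assumption, 100 pts/m2 corresponds to a spacing of 0.1 m (10 cm), while the original 1000 pts/m2 corresponds to roughly 3.2 cm.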

mriedo commented 3 years ago

But you can do incredible analysis directly on point clouds without transforming them to a mesh. I can send many examples if you want, but take a look at https://sitn.ne.ch/web/diffusion/videos/dip/dip.mp4 ... for instance slope and orientation can be visualized and analyzed on the point cloud without needing to transform it to a GRID or MESH.

wonder-sk commented 3 years ago

The functionality mentioned by @sfkeller is already available in PDAL as "filters":

point cloud density reduction - filters.sample and some others
computation of normals - filters.normal
mesh reconstruction - filters.delaunay and some others

In the initial stage we are going to focus entirely on the visualization - that's the work that the crowdfunding campaign is meant to cover.

At some point later (next year?) we would like to move focus on related topics, such as processing of point cloud data (by reusing existing PDAL algorithms and integrating them to QGIS).
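For reference, the three filters listed above could be chained in a single PDAL pipeline. The Python sketch below only builds the pipeline JSON; the file names are placeholders, and the option names (`radius`, `knn`, `faces`) follow my reading of the PDAL stage documentation, so they should be checked against the installed PDAL version.

```python
import json

# Stages run in order: read, thin, compute normals, triangulate, write mesh.
pipeline = [
    {"type": "readers.las", "filename": "input.laz"},            # placeholder input
    {"type": "filters.sample", "radius": 0.1},                   # min. distance between kept points
    {"type": "filters.normal", "knn": 8},                        # normals from 8 nearest neighbours
    {"type": "filters.delaunay"},                                # 2.5D Delaunay triangulation
    {"type": "writers.ply", "filename": "mesh.ply", "faces": True},  # placeholder output
]
pipeline_json = json.dumps(pipeline, indent=2)
```

Saved to a file, the JSON can be executed with `pdal pipeline <file>`.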

mriedo commented 3 years ago

Great that some filters are available in PDAL!

Here are some slides on point cloud density reduction, a very important topic when dealing with point clouds: https://sitn.ne.ch/web/diffusion/lidar/dedensification_lidar.pdf From my point of view, keypoints is a mandatory function.

hobu commented 3 years ago

The capabilities of PointCloudThinner can be replicated by PDAL's filters.splitter, filters.chipper, filters.decimate or filters.sample, depending on how you want to thin data.

filters.poisson is an implementation of Kazhdan's method, with a few performance enhancements over its implementation in other places such as CloudCompare.

filters.normal should do what you expect it to do.

writers.gltf and writers.ply can save mesh and faces from PDAL.

Other important PDAL capabilities:

filters.smrf for ground segmentation
filters.outlier for probabilistic noise point classification
filters.range for expression-defined point filtering
filters.hag_delaunay and filters.hag_dem for normalized height computation (canopy height model)
filters.covariancefeatures for point neighborhood moments
filters.assign for 2D point overlay and attribute assignment
filters.reprojection for PROJ.org-based coordinate system transformation
filters.transformation for application of affine matrices
filters.dem for culling based on relation to a GDAL-readable surface

Lots more filters available at https://pdal.io/stages/filters.html too.

mriedo commented 3 years ago

Thanks @hobu! An impressive number of interesting functionalities are available in PDAL to handle point clouds! Amazing job on PDAL.

It would be awesome to have them exposed some day in a future version of QGIS, in the QGIS graphical modeller! That could be the subject of a future crowdfunding campaign.

For the keypoint decimation, it would be awesome to have this in a future release of PDAL. A lot of people ask me how they can do this, but I can't point them to open-source software with this functionality (I might be wrong).

kikislater commented 3 years ago

It would be awesome to have them exposed some day in a future version of QGIS, in the QGIS graphical modeller! That could be the subject of a future crowdfunding campaign.

Would be nice. At this time, it's already available: among the experimental plugins there is a PDAL plugin, and you can run it as a processing algorithm in the graphical modeller. You have to get familiar with PDAL first, as it requires a pipeline in JSON format. Example pipelines are in the PDAL documentation; you can build your own pipeline and put it in the pipelines subfolder of the plugin. It may not be as good as a native implementation, but it works!

luipir commented 3 years ago

Would be nice. At this time, it's already available

@kikislater well... PDALtools was a kind of proof of concept for packaging PDAL execution alongside other processing backends. Unfortunately I developed it during the change from PDAL 1.8 to 2.x when there wasn't a simple way to use the Python bindings to PDAL with QGIS 3.x/Python 3, so the executor just wraps the command line. A really sub sub sub optimal integration!!! I strongly suggest supporting the PDAL integration that is the subject of this thread instead of promoting the use of PDALtools. BTW, happy that it turned out to be useful to someone.

hobu commented 3 years ago

change from PDAL 1.8 to 2.x when there wasn't a simple way to use the Python bindings to PDAL with QGIS 3.x/Python 3,

PDAL's Python (and Java, C, and MATLAB) bindings are a separate package on PyPI now. It iterates independently from the main PDAL library. Its main purpose at the moment is to get data to and from Numpy for analysis.

luipir commented 3 years ago

change from PDAL 1.8 to 2.x when there wasn't a simple way to use the Python bindings to PDAL with QGIS 3.x/Python 3,

PDAL's Python (and Java, C, and MATLAB) bindings are a separate package on PyPI now. It iterates independently from the main PDAL library. Its main purpose at the moment is to get data to and from Numpy for analysis.

@hobu I know, that's the reason why my plugin can be considered obsolete... it should now be refactored to execute the pipeline directly in a Python script, resulting in a much simpler architecture.

roya0045 commented 3 years ago

Should this be classified as implemented?

luipir commented 3 years ago

FYI and to leave here as a note: At least for the processing part. Just announced: https://github.com/CloudCompare/CloudComPy and https://github.com/tmontaigu/CloudCompare-PythonPlugin

roya0045 commented 3 years ago

I'm curious to know when and if the processing follow-up is planned. Currently the only way seems to be the plugin by @luipir, but it is not user friendly, and I'm not sure if the errors I'm getting are caused by my pipeline or by implementation limitations (no offense Luigi, I just have no clue what I'm doing trying to convert EPT to GDAL with 'min').
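For what it's worth, an EPT-to-raster conversion with the 'min' statistic can be expressed as a two-stage PDAL pipeline. The Python sketch below only builds the JSON; the URL and output file name are placeholders, the 1.0 resolution is an arbitrary choice, and the option names (`resolution`, `output_type`) follow my reading of the `writers.gdal` documentation, so treat them as assumptions to verify.

```python
import json

pipeline = [
    # Placeholder URL: point this at the ept.json of the dataset to read.
    {"type": "readers.ept", "filename": "https://example.com/dataset/ept.json"},
    {
        "type": "writers.gdal",
        "filename": "min_elevation.tif",  # placeholder output raster
        "resolution": 1.0,                # cell size in the units of the data CRS
        "output_type": "min",             # keep the minimum Z per cell
    },
]
pipeline_json = json.dumps(pipeline, indent=2)
```

Running the saved JSON directly with `pdal pipeline <file>` is one way to tell whether an error comes from the pipeline itself or from the plugin wrapping it.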

wonder-sk commented 3 years ago

A follow-up is planned, yes. We have been trying to secure funding with a client - if that doesn't materialize, we would probably try to run another crowdfunding campaign...
