Open nyalldawson opened 3 years ago
Much needed enhancement! I also strongly concur that the current way of dealing with sublayers with magical delimiters is extremely fragile and error prone (a recent breakage in 3.18.0 occurred just because of that), and we definitely need a cleaner approach.
I can imagine that querySublayers() could accept arguments so that the user can specify which details they want to know and/or a mode "get all details that are cheap to retrieve" and/or "approximate values are OK". Getting the feature count, or the geometry type (if a vector layer has mixed geometry type content, we often need to iterate over all its features to figure out the actual geometries it holds), or geometry column name can be rather costly operations on some data sources, compared to just getting the layer name and type (raster, vector, ...).
I was wondering if we should have a provision for a future hierarchical presentation of layers. This would really be more for later usage, as I don't think many datasources can use that. One use case I have in mind is KML, which is naturally hierarchical (WMS capabilities also offer a hierarchical view, and netCDF / HDF5 can also have a hierarchical structure). But at the GDAL / OGR level everything is currently flattened. Possibly, if the GDAL data model was extended and the driver updated, this could flow into QGIS. Maybe add a QList<std::unique_ptr
@rouault
I can imagine that querySublayers() could accept arguments so that the user can specify which details they want to know and/or a mode "get all details that are cheap to retrieve" and/or "approximate values are OK". Getting the feature count, or the geometry type (if a vector layer has mixed geometry type content, we often need to iterate over all its features to figure out the actual geometries it holds), or geometry column name can be rather costly operations on some data sources, compared to just getting the layer name and type (raster, vector, ...).
Great point. I'll add a Flags argument which defaults to no flags, with optional flags available for forcing resolution of the geometry type and for counting features in situations where we know this will potentially be expensive.
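A minimal sketch of the opt-in idea, written in plain Python rather than against the QGIS API. QueryFlag, Sublayer and this query_sublayers() function are hypothetical names chosen for illustration, and the dict of geometry strings is only a stand-in for a real data source:

```python
from dataclasses import dataclass
from enum import Flag, auto
from typing import Dict, List, Optional


class QueryFlag(Flag):
    """Hypothetical opt-in flags; the default (NONE) does only cheap work."""
    NONE = 0
    RESOLVE_GEOMETRY_TYPE = auto()  # may require scanning every feature
    COUNT_FEATURES = auto()         # may require a full scan on some sources


@dataclass
class Sublayer:
    name: str
    geometry_type: Optional[str] = None  # None means "not resolved"
    feature_count: Optional[int] = None  # None means "not counted"


def query_sublayers(layers: Dict[str, List[str]],
                    flags: QueryFlag = QueryFlag.NONE) -> List[Sublayer]:
    """`layers` maps a layer name to its per-feature geometry type strings."""
    results = []
    for name, geometries in layers.items():
        details = Sublayer(name=name)
        if flags & QueryFlag.RESOLVE_GEOMETRY_TYPE:
            kinds = set(geometries)
            if len(kinds) == 1:
                details.geometry_type = next(iter(kinds))
            elif kinds:
                details.geometry_type = "Mixed"
            else:
                details.geometry_type = "Unknown"
        if flags & QueryFlag.COUNT_FEATURES:
            details.feature_count = len(geometries)
        results.append(details)
    return results
```

With no flags, callers get names only and the expensive fields stay None; passing QueryFlag.RESOLVE_GEOMETRY_TYPE | QueryFlag.COUNT_FEATURES fills them in at the cost of touching every feature.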
I was wondering if we should have a provision for a future hierarchical presentation of layers.
Also a good point. I'd suggest we could do this very simply by just adding a QStringList "hierarchy" or "path" member to QgsSublayerDetails, and then leave it up to the caller to decide how they want to present this information (as a single formatted string or via a tree view of the schema directories).
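To illustrate how a caller might consume such a "path" list, here is a small plain-Python sketch. The function names format_path() and build_tree() are hypothetical, and the path is modelled as a list of strings that includes the layer name as its last element, which is an assumption rather than something the proposal specifies:

```python
from typing import Dict, List


def format_path(path: List[str], separator: str = " / ") -> str:
    """Present a sublayer path as a single formatted string."""
    return separator.join(path)


def build_tree(paths: List[List[str]]) -> Dict[str, dict]:
    """Fold many paths into a nested dict suitable for driving a tree view."""
    tree: Dict[str, dict] = {}
    for path in paths:
        node = tree
        for part in path:
            # Reuse the existing branch if present, otherwise create it.
            node = node.setdefault(part, {})
    return tree
```

Both presentations are derived from the same flat list, so the provider only has to report the path and the choice of display stays with the caller.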
Handling qgis projects saved within geopackages would also be great!
QGIS Enhancement: Rework handling of multi-layer datasets
Date 2021/03/19
Author Nyall Dawson (@nyalldawson)
Contact nyall dot dawson at gmail dot com
Version QGIS 3.20 or 3.22
Summary
Many common spatial data formats support the storage of multiple layers of data. Furthermore, many of these formats allow for storing different types of layers within a single dataset, e.g. storing both raster and vector layers in a single file. Commonly encountered formats which support this include:
Currently, QGIS has poor support for these mixed layer-type data formats. Some of the issues in current versions include:
It is important to note that GeoPackage currently has generally quite good support for mixed formats in QGIS, but this is due to many hard-coded workarounds added for the GeoPackage format only, which can't be extended to other data formats.
Furthermore, the situation is complicated because the current QGIS API for handling sublayers inside a dataset is very old and extremely limited. The API is also very inefficient, e.g. it requires a raster or vector layer to be fully constructed before the full list of sublayers can be retrieved, only for this layer to be discarded and the actual desired sublayer opened, resulting in unnecessary work and network/disk usage. The API is also unfriendly for third party scripts and plugins to reuse for their own purposes.
Proposed Solution
This project consists of two components:
Proposed API
A new struct/data class will be created to provide a stable and structured way of storing sublayer details. (The current API uses poor quality hacks like returning a list of strings corresponding to sublayers, where each string consists of a mix of layer name, data type, description and other components, all delimited by a special "!!::!!" separator). E.g.
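A minimal sketch of the contrast between the two approaches, in plain Python. This is not the proposal's actual class definition: the SublayerDetails fields, the helper functions and the field order of the delimited string are all assumptions made for illustration; only the "!!::!!" separator comes from the text above.

```python
from dataclasses import dataclass
from typing import List

SEPARATOR = "!!::!!"  # the delimiter quoted in the proposal


@dataclass(frozen=True)
class SublayerDetails:
    """Hypothetical structured record; the field set is illustrative only."""
    name: str
    layer_type: str
    description: str = ""


def encode_legacy(details: SublayerDetails) -> str:
    """Pack the fields into one delimited string (field order is assumed)."""
    return SEPARATOR.join([details.name, details.layer_type, details.description])


def parse_legacy(encoded: str) -> List[str]:
    """Split a delimited string back into fields, as a legacy caller would."""
    return encoded.split(SEPARATOR)
```

A layer whose name happens to contain the separator shows the fragility: round-tripping SublayerDetails("a!!::!!b", "vector") through encode_legacy() and parse_legacy() yields four fields instead of three, whereas the structured record keeps the name intact.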
The QgsProviderMetadata class will gain a new virtual method allowing the corresponding provider to query a URI and return a list of any valid sublayers contained in the dataset which that provider can handle.
Individual providers will be able to utilise whichever shortcuts apply to that specific provider for determining the list of sublayers they can open (WITHOUT the expense of creating a full QgsMapLayer object in order to do this). The method will initially be implemented for the OGR, GDAL and MDAL (mesh) data providers.
Lastly, the QgsProviderRegistry class will have a similar method for querying a URI against ALL registered data providers and collating a complete list of sublayers which can be handled by any data provider (e.g. OGR, GDAL, MDAL, etc.).
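The division of labour described above can be sketched as follows. This is a plain-Python model rather than the QGIS classes: ProviderMetadata, ProviderRegistry, the two fake providers and their hard-coded results are hypothetical stand-ins, and the ".gpkg" check is only there to simulate a provider deciding whether it can handle a URI.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class SublayerDetails:
    provider_key: str
    name: str
    layer_type: str


class ProviderMetadata:
    """Base class: a provider reports the sublayers it can open for a URI."""
    key = "base"

    def query_sublayers(self, uri: str) -> List[SublayerDetails]:
        return []  # default: this provider handles nothing


class FakeVectorProvider(ProviderMetadata):
    key = "ogr"

    def query_sublayers(self, uri: str) -> List[SublayerDetails]:
        if not uri.endswith(".gpkg"):
            return []
        return [SublayerDetails(self.key, "roads", "vector")]


class FakeRasterProvider(ProviderMetadata):
    key = "gdal"

    def query_sublayers(self, uri: str) -> List[SublayerDetails]:
        if not uri.endswith(".gpkg"):
            return []
        return [SublayerDetails(self.key, "elevation", "raster")]


class ProviderRegistry:
    """Asks every registered provider and collates the answers."""

    def __init__(self) -> None:
        self._providers: Dict[str, ProviderMetadata] = {}

    def register(self, provider: ProviderMetadata) -> None:
        self._providers[provider.key] = provider

    def query_sublayers(self, uri: str) -> List[SublayerDetails]:
        results: List[SublayerDetails] = []
        for provider in self._providers.values():
            results.extend(provider.query_sublayers(uri))
        return results
```

Because the registry only iterates over whatever has been registered, a plugin-supplied provider would take part in the collated result without any provider-specific code in the caller.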
This API will be exposed to PyQGIS, giving third party scripts and plugins an easy-to-use, stable API for querying all valid sublayers in a dataset.
Proposed UI changes
The current separate dialogs which are used to prompt users for raster and vector sublayers to add from a dataset will be reworked into a single unified sublayer selection dialog, which uses the newly added APIs to show users a complete list of ALL valid sublayers in the file, regardless of the data provider or layer type.
The Browser panel code will be significantly reworked so that any file which contains multiple sublayers automatically shows as an expandable tree item, containing ALL the valid sublayers regardless of the data provider. This will be handled directly via the new API, and consequently will automatically apply for all data providers which can handle a particular file without hardcoded, provider-specific workarounds. (Furthermore, it will also work correctly with any plugin-based data providers, providing them with a first-class integrated appearance!) This change will mean that each file appears only a single time in the browser panel, with users able to expand the file to see ALL valid sublayers and then drag and drop these to add them as vector, raster, mesh, etc. layers. Ultimately all data types will see the same first-rate browser user experience as GeoPackage files have in current QGIS versions.