OSGeo / gdal

GDAL is an open source MIT licensed translator library for raster and vector geospatial data formats.
https://gdal.org

Feature Request for OGRLayer: Add functions considering SetAttributeFilter() in GetExtent() #5372

Open tschmetzer opened 2 years ago

tschmetzer commented 2 years ago

This is a request to discuss adding filter-aware variants of GetExtent() and GetFeatureCount().

I suggest calling them GetFilteredExtent() and GetFilteredFeatureCount().

GetFilteredExtent() in particular would be very valuable to have. Handling large GPKG layer files, from several to dozens of GB, would benefit from a dedicated function.

Background: Currently, the only way to get an extent or feature count for a subset is to select the subset with OGRLayer::SetAttributeFilter() and then iterate over the whole subset with repeated GetNextFeature() calls. The proposed functions will of course only make sense if an existing spatial index on the layer's features, or some other optimization, can be used to greatly speed up the extent calculation and feature count.
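For reference, the workaround described above might look roughly like the following in Python. This is only a sketch: `filtered_extent` is a hypothetical helper name, not a GDAL function, and it assumes `layer` is an `osgeo.ogr.Layer` (or anything exposing the same methods). It touches every matching feature, which is exactly the cost this request wants to avoid.

```python
def filtered_extent(layer, where):
    """Hypothetical helper: extent of the features matching `where`.

    Returns (minx, maxx, miny, maxy), the same ordering as
    OGRLayer GetExtent() in the Python bindings, or None if nothing matches.
    """
    # SetAttributeFilter() returns 0 (OGRERR_NONE) on success.
    if layer.SetAttributeFilter(where) != 0:
        raise ValueError(f"invalid attribute filter: {where!r}")
    try:
        layer.ResetReading()
        extent = None
        feature = layer.GetNextFeature()
        while feature is not None:
            geom = feature.GetGeometryRef()
            if geom is not None and not geom.IsEmpty():
                minx, maxx, miny, maxy = geom.GetEnvelope()
                if extent is None:
                    extent = [minx, maxx, miny, maxy]
                else:
                    extent[0] = min(extent[0], minx)
                    extent[1] = max(extent[1], maxx)
                    extent[2] = min(extent[2], miny)
                    extent[3] = max(extent[3], maxy)
            feature = layer.GetNextFeature()
        return tuple(extent) if extent is not None else None
    finally:
        # Clear the filter so later reads see the whole layer again.
        layer.SetAttributeFilter(None)
```

With a real dataset this would be called on a layer obtained from `ogr.Open(...)`, for example `filtered_extent(ds.GetLayer(0), "sijaintitarkkuus = 3000")`.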

Issues like this one could be sped up: https://github.com/qgis/QGIS/issues/46393

jratike80 commented 2 years ago

For GeoPackage and SpatiaLite, the extent can already be obtained very quickly from the rtree index.

select min(minx) from rtree_kunta_geom
where id in
(select id from kunta limit 100)

Example using ogrinfo:

ogrinfo -sql "select min(minx) from rtree_rakennus_geom where id in (select fid from rakennus limit 10000)" mtkmaasto.gpkg

Similarly, for the count:

ogrinfo -sql "select count(*) from rakennus where sijaintitarkkuus=3000" mtkmaasto.gpkg
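The same idea can be sketched with Python's standard sqlite3 module, extending the query above to all four bounds. The helper names `quote_ident` and `rtree_filtered_extent` are made up for illustration, and the sketch assumes the GeoPackage naming convention for the index table (`rtree_<table>_<geometry column>` with columns id, minx, maxx, miny, maxy) and a primary key column called `fid`. The `where` text is pasted into the SQL unchanged, so it must come from a trusted source.

```python
import sqlite3


def quote_ident(name):
    """Quote an SQL identifier (hypothetical helper)."""
    return '"' + name.replace('"', '""') + '"'


def rtree_filtered_extent(conn, table, geom_column, where, fid_column="fid"):
    """Hypothetical helper: extent of the rows matching `where`, read from the rtree table.

    Returns (minx, maxx, miny, maxy), or None if no row matches.
    """
    rtree = quote_ident(f"rtree_{table}_{geom_column}")
    sql = (
        f"SELECT min(minx), max(maxx), min(miny), max(maxy) FROM {rtree} "
        f"WHERE id IN (SELECT {quote_ident(fid_column)} "
        f"FROM {quote_ident(table)} WHERE {where})"
    )
    row = conn.execute(sql).fetchone()
    # Aggregates over an empty set yield NULLs.
    return None if row[0] is None else tuple(row)
```

Usage would be along the lines of `rtree_filtered_extent(sqlite3.connect("mtkmaasto.gpkg"), "rakennus", "geom", "sijaintitarkkuus = 3000")`. Note that the SQLite R*Tree module stores coordinates as 32-bit floats, so the result can be marginally larger than the exact extent.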

rouault commented 2 years ago

The GetFeatureCount() method already takes into account attribute and spatial filters in the GPKG driver. If an attribute filter is set, the efficiency of GetFeatureCount() will depend on the existence of an index on the filtered column(s). It is true that GetExtent() doesn't take into account attribute or spatial filters.
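In the Python bindings, the filtered count described here could be obtained roughly as follows. `filtered_count` is a hypothetical wrapper name, and `layer` is assumed to be an `osgeo.ogr.Layer`. Whether the call is fast still depends on the driver and on an index over the filtered column(s), as noted above.

```python
def filtered_count(layer, where):
    """Hypothetical helper: number of features matching `where`."""
    # SetAttributeFilter() returns 0 (OGRERR_NONE) on success.
    if layer.SetAttributeFilter(where) != 0:
        raise ValueError(f"invalid attribute filter: {where!r}")
    try:
        # GetFeatureCount() honours the filter that is currently set.
        return layer.GetFeatureCount()
    finally:
        # Clear the filter so later reads see the whole layer again.
        layer.SetAttributeFilter(None)
```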