qgis / QGIS

QGIS is a free, open source, cross platform (lin/win/mac) geographical information system (GIS)
https://qgis.org
GNU General Public License v2.0

Memory layer with "fid" column can cause fid collision when writing to GeoPackage #55896

Open jfbourdon opened 9 months ago

jfbourdon commented 9 months ago

What is the bug or the crash?

Merging multiple GeoPackages using native:mergevectorlayers and saving the result to a memory layer can cause FID collisions later on, when this layer is passed to another processing tool that then saves to a new GeoPackage. The issue is that the fid column may then contain duplicate values, and processing tools use this column by default as the unique ID when saving to the GeoPackage format.
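To illustrate why duplicate fid values cannot survive the write, here is a minimal sketch using only Python's sqlite3 module. It assumes a simplified stand-in table (the name lyr_merged and the source column are hypothetical) rather than a real GeoPackage, and relies only on the fact that a GeoPackage is an SQLite database whose feature tables have an integer primary key.

```python
import sqlite3

# Simplified stand-in for a GeoPackage feature table: a GeoPackage is an
# SQLite database and its feature tables carry an integer primary key
# (named "fid" by default when written through GDAL/OGR).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE lyr_merged (fid INTEGER PRIMARY KEY, source TEXT)")

# fid values carried over from the two source layers after the merge.
rows = [(3, "lyr1"), (80, "lyr1"), (4, "lyr2"), (80, "lyr2")]

rejected = []
for fid, source in rows:
    try:
        conn.execute(
            "INSERT INTO lyr_merged (fid, source) VALUES (?, ?)", (fid, source)
        )
    except sqlite3.IntegrityError:
        # The second 80 violates the primary key and cannot be stored.
        rejected.append((fid, source))

stored = [row[0] for row in conn.execute("SELECT fid FROM lyr_merged ORDER BY fid")]
print(stored)    # [3, 4, 80]
print(rejected)  # [(80, 'lyr2')]
```

The three stored values match the IDs left in the output files of the reproduction script, which suggests the lost entity is the one whose fid was already taken.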

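It must be noted that not all processing tools react the same way when saving a memory layer with a non-unique fid column to GeoPackage: QgsVectorFileWriter loses one entity and returns code 7 (ErrFeatureWriteFailed) with an OGR error message about the ID collision, native:reprojectlayer loses one entity without any message to the user, and native:buffer fails with an error ("Could not write feature into OUTPUT").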

For all these cases, I think the correct behavior would be to create a new GPKG with the fid column renumbered from 1 and to emit a warning saying that IDs have been reset due to collisions. This would apply only when an actual fid collision is detected. When there is no collision, the original IDs would be kept, as is already the case.
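The proposed behavior can be sketched in plain Python. The helper resolve_fids below is hypothetical and not part of QGIS; it only illustrates the policy described above (keep the IDs when they are unique, otherwise renumber from 1 and warn).

```python
import warnings


def resolve_fids(fids):
    """Return (fids_to_write, renumbered) for a sequence of candidate fid values.

    Hypothetical helper, not part of QGIS: keeps the original values when they
    are all unique, otherwise renumbers from 1 and emits a warning.
    """
    fids = list(fids)
    if len(set(fids)) == len(fids):
        # No collision: the original IDs are kept untouched.
        return fids, False
    warnings.warn("Duplicate fid values detected: IDs have been reset starting from 1.")
    return list(range(1, len(fids) + 1)), True


print(resolve_fids([3, 80, 4, 7]))   # ([3, 80, 4, 7], False)
print(resolve_fids([3, 80, 4, 80]))  # ([1, 2, 3, 4], True), plus a UserWarning
```

Renumbering every feature, rather than only the colliding ones, keeps the result predictable and matches what the direct merge to GeoPackage already produces in the reproduction script.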

Steps to reproduce the issue

  1. Download this test dataset of two GPKG
  2. Run the following code in the Python console
import os
import tempfile
tempdir = tempfile.gettempdir()

# Source layers (download link above)
layer1 = "C:/temp/lyr1.gpkg"  
layer2 = "C:/temp/lyr2.gpkg"

# Display the IDs of the features of the source layers
# Note the upcoming collision: both layers contain a feature with fid 80
[(f.id(), f["fid"]) for f in QgsVectorLayer(layer1).getFeatures()]  # IDs: [(3, 3), (80, 80)]
[(f.id(), f["fid"]) for f in QgsVectorLayer(layer2).getFeatures()]  # IDs: [(4, 4), (80, 80)]

# Merging both source layers directly into a GeoPackage
# See that no entities are lost and that new IDs are generated
outpath_merge = os.path.join(tempdir, "lyr_merge.gpkg")
processing.run("native:mergevectorlayers", {
    'LAYERS':[layer1, layer2],
    'OUTPUT':outpath_merge
    })
[(f.id(), f["fid"]) for f in QgsVectorLayer(outpath_merge).getFeatures()]  # IDs: [(1, 1), (2, 2), (3, 3), (4, 4)]

# Merging both source layers into a memory layer
# No entities are lost and new feature IDs are generated, but the "fid"
# attribute keeps the original values, so 80 now appears twice
lyr_merged = processing.run("native:mergevectorlayers", {
    'LAYERS':[layer1, layer2],
    'OUTPUT':'TEMPORARY_OUTPUT'
    })["OUTPUT"]
[(f.id(), f["fid"]) for f in lyr_merged.getFeatures()]  # IDs: [(1, 3), (2, 80), (3, 4), (4, 80)]

# Saving the merged memory layer into a GeoPackage using QgsVectorFileWriter
# One entity is lost, QgsVectorFileWriter returns code 7 (ErrFeatureWriteFailed)
# with an OGR error message about ID collision
outpath_fileWriter = os.path.join(tempdir, "lyr_fileWriter.gpkg")
save_options = QgsVectorFileWriter.SaveVectorOptions()
QgsVectorFileWriter.writeAsVectorFormatV3(lyr_merged, outpath_fileWriter, QgsCoordinateTransformContext(), save_options)
[(f.id(), f["fid"]) for f in QgsVectorLayer(outpath_fileWriter).getFeatures()]  # IDs: [(3, 3), (4, 4), (80, 80)]

# Saving the merged memory layer into a GeoPackage after running
# another processing tool first (here the Reproject tool)
# One entity is lost, but no message informs the user about the ID collision
outpath_reproject = os.path.join(tempdir, "lyr_reproject.gpkg")
processing.run("native:reprojectlayer", {'INPUT':lyr_merged, 'TARGET_CRS':lyr_merged.crs(), 'OUTPUT':outpath_reproject})
[(f.id(), f["fid"]) for f in QgsVectorLayer(outpath_reproject).getFeatures()]  # IDs: [(3, 3), (4, 4), (80, 80)]

# Saving the merged memory layer into a GeoPackage after running
# another processing tool first (here the Buffer tool)
# The processing crashes with an error ("Could not write feature into OUTPUT")
outpath_buffer = os.path.join(tempdir, "lyr_buffer.gpkg")
processing.run("native:buffer", {'INPUT':lyr_merged,'DISTANCE':10,'OUTPUT':outpath_buffer})
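As a possible pre-flight check before any of the writes above, the following sketch lists the fid values that occur more than once. The function name duplicate_fids is hypothetical. It assumes non-null, sortable values and only requires that each feature supports item access by field name, as QgsFeature does in the script above (f["fid"]); plain dicts are used here so the example runs outside QGIS.

```python
from collections import Counter


def duplicate_fids(features, field="fid"):
    """Return the sorted values of `field` that occur more than once.

    `features` is any iterable whose items support item access by field
    name, which is the case for QgsFeature (f["fid"]) and for plain dicts.
    """
    counts = Counter(f[field] for f in features)
    return sorted(value for value, n in counts.items() if n > 1)


# Plain dicts standing in for the features of lyr_merged.
sample = [{"fid": 3}, {"fid": 80}, {"fid": 4}, {"fid": 80}]
print(duplicate_fids(sample))  # [80]
```

In the QGIS Python console, the equivalent call would presumably be duplicate_fids(lyr_merged.getFeatures()); a non-empty result signals that writing the layer to GeoPackage is likely to hit the collision described in this report.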

Versions

QGIS 3.28.15 and 3.34.3, GDAL 3.8.3, PROJ 9.3.1, GEOS 3.12.1-CAPI-1.18.1

Windows 10 Enterprise 22H2

Supported QGIS version

New profile

Additional context

No response

jfbourdon commented 5 months ago

Partial duplicate of https://github.com/qgis/QGIS/issues/36378, https://github.com/qgis/QGIS/issues/38855, https://github.com/qgis/QGIS/issues/41156.