geopandas / pyogrio

Vectorized vector I/O using OGR
https://pyogrio.readthedocs.io
MIT License

Show append support in list_drivers #375

Open martinfleis opened 3 months ago

martinfleis commented 3 months ago

Fiona shows which drivers support appending in fiona.supported_drivers. Is there a way for pyogrio to do the same?

In [1]: import fiona

In [2]: fiona.supported_drivers
Out[4]: 
{'DXF': 'rw',
 'CSV': 'raw',
 'OpenFileGDB': 'raw',
 'ESRIJSON': 'r',
 'ESRI Shapefile': 'raw',
 'FlatGeobuf': 'raw',
 'GeoJSON': 'raw',
 'GeoJSONSeq': 'raw',
 'GPKG': 'raw',
 'GML': 'rw',
 'OGR_GMT': 'rw',
 'GPX': 'rw',
 'Idrisi': 'r',
 'MapInfo File': 'raw',
 'DGN': 'raw',
 'PCIDSK': 'raw',
 'OGR_PDS': 'r',
 'S57': 'r',
 'SQLite': 'raw',
 'TopoJSON': 'r'}

jorisvandenbossche commented 3 months ago

I am not sure whether GDAL provides a way to query that information. In Fiona, those values are hardcoded in drvsupport.py.

In pyogrio, for listing the drivers that support writing, we use the DCAP_CREATE capability:

https://github.com/geopandas/pyogrio/blob/dff672c7cd4d1a62521d7e9a24d4b65000b6b26a/pyogrio/_ogr.pyx#L103-L129

But that doesn't say anything about append support.
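As a rough illustration of the hardcoded approach, here is a minimal sketch that upgrades 'rw' entries to 'raw' for drivers assumed to support appending. The helper name `with_append_flags` and the `APPEND_DRIVERS` set are hypothetical (the set is simply copied from the 'raw' entries in the Fiona output above), and the input is assumed to be a plain dict mapping driver name to 'r' or 'rw'.

```python
# Hypothetical hardcoded set of drivers assumed to support appending,
# copied from the 'raw' entries in the Fiona output quoted in this issue.
APPEND_DRIVERS = frozenset({
    "CSV", "OpenFileGDB", "ESRI Shapefile", "FlatGeobuf", "GeoJSON",
    "GeoJSONSeq", "GPKG", "MapInfo File", "DGN", "PCIDSK", "SQLite",
})


def with_append_flags(drivers, append_drivers=APPEND_DRIVERS):
    """Return a copy of `drivers` with 'rw' upgraded to 'raw' where append is assumed.

    `drivers` maps driver name -> 'r' or 'rw' (assumed input shape).
    """
    flagged = {}
    for name, mode in drivers.items():
        # Only writable drivers can append; read-only entries stay unchanged.
        if mode == "rw" and name in append_drivers:
            flagged[name] = "raw"
        else:
            flagged[name] = mode
    return flagged


# Illustrative input only; a real listing would come from the library.
example = {"GPKG": "rw", "GML": "rw", "TopoJSON": "r"}
print(with_append_flags(example))
```

The obvious drawback is the same as in Fiona: the set has to be maintained by hand and can drift from what the installed GDAL actually supports.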

Looking at the code of ogr2ogr, it seems they first try to open the data source in append mode. If that fails, they try to open it in normal mode, and if that works, they raise a specific error message about not being able to open an existing data source. So they also only check this while running the write operation, and don't query the information from somewhere.
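The fallback described above can be sketched in plain Python as follows. This is not ogr2ogr's or pyogrio's actual code: `open_for_append`, the two exception classes, and the `open_update` / `open_readonly` callables are all hypothetical stand-ins, assumed to return a handle on success and `None` on failure. What happens when both attempts fail is not covered by the description, so raising a generic error there is an assumption.

```python
class AppendNotSupportedError(Exception):
    """Hypothetical: the data source exists but cannot be opened for update."""


class DataSourceOpenError(Exception):
    """Hypothetical: the data source could not be opened at all."""


def open_for_append(path, open_update, open_readonly):
    """Try update mode first, then use a read-only open to diagnose the failure.

    `open_update` and `open_readonly` are hypothetical callables that take a
    path and return a handle, or None if the open fails.
    """
    handle = open_update(path)
    if handle is not None:
        return handle

    # Update mode failed. If a read-only open works, the file is readable,
    # so the problem is specifically that it cannot be updated.
    if open_readonly(path) is not None:
        raise AppendNotSupportedError(
            f"Unable to open existing data source '{path}' for update; "
            "the driver may not support appending."
        )

    # Assumption: neither mode works, so report a generic failure.
    raise DataSourceOpenError(f"Unable to open '{path}'.")
```

The point of the second open is purely diagnostic: it distinguishes "this file cannot be appended to" from "this file cannot be read at all".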

This made me realize that our current error message for a failed append is not very good. This is the part that updates the error message from GDAL:

https://github.com/geopandas/pyogrio/blob/dff672c7cd4d1a62521d7e9a24d4b65000b6b26a/pyogrio/_io.pyx#L165-L171

But that message is not very helpful when the failure is caused by append not being supported.

In [4]: df = geopandas.GeoDataFrame({"col": np.random.randn(10), "geometry": geopandas.points_from_xy(range(10), range(10))})

In [5]: pyogrio.write_dataframe(df, "test.gml")   # works fine

In [6]: pyogrio.write_dataframe(df, "test.gml", append=True)
...
File pyogrio/_io.pyx:169, in pyogrio._io.ogr_open()

DataSourceError: 'test.gml' not recognized as a supported file format. It might help to specify the correct driver explicitly by prefixing the file path with '<DRIVER>:', e.g. 'CSV:path'.
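One possible direction for a clearer message, sketched under assumptions: if the caller asked for append and the path already exists, the failure is more likely about update support than about an unrecognized format. The helper `describe_open_failure` is hypothetical and is not part of pyogrio; it only checks for existence on disk, which is a heuristic and would not apply to non-file data sources.

```python
import os


def describe_open_failure(path, append):
    """Pick an error message for a failed open (hypothetical helper).

    Heuristic: an existing path that fails to open in append mode most
    likely points at a driver that cannot update existing data sources.
    """
    if append and os.path.exists(path):
        return (
            f"'{path}' exists but could not be opened for update; "
            "the driver may not support appending."
        )
    # Otherwise fall back to a generic 'not recognized' style message.
    return f"'{path}' not recognized as a supported file format."
```

For the `test.gml` case above, this would report that the existing file could not be opened for update rather than suggesting a driver prefix.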