duckdb / duckdb_spatial

MIT License
475 stars 35 forks source link

Include option for `preserve_fid` #213

Open scottstanie opened 10 months ago

scottstanie commented 10 months ago

I'd like to read in layers which had an id and use that as a column in the table. ogr2ogr has this option as preserve_fid

I'm seeing now that there's always INCLUDE_FID=NO https://github.com/duckdb/duckdb_spatial/blob/5cde7a2abead24e8fc81a33e8ee321188f4df2de/spatial/src/spatial/gdal/functions/st_read.cpp#L277

Is it possible to add this as an option to ST_Read?

Maxxen commented 10 months ago

Hi! Thanks for filing this issue! If I recall correctly I disabled it due to difficulties with projection pushdown as the column order changes whether or not it's present. AFAIK the FID is always sequential (?), and DuckDB preserves insertion order so if you would import the data into a table could maybe use the rowid pseudocolumn instead.

I want to refactor st_read at some point in the future anyway so I'll try to revisit this issue at that point.

scottstanie commented 10 months ago

Thanks for the explanations! Unfortunately the FID doesn't always increment up from 1 sequentially- I came across this because I was converting a shapefile to GeoJSON while also running a SQL filter:

ogr2ogr -f GeoJSON -preserve_fid -dialect sqlite -sql "SELECT f.fid .... WHERE ..."

Since relying on the FIDs have caused annoyance and confusion elsewhere in my processing, I should be able to work around this for my current use case (I'll change the FID to an attribute so it shows up normally in the DuckDB table).