duckdb / duckdb_spatial

MIT License

Understanding duckdb WKB geometry type (not readable by shapely) #188

Open ncclementi opened 8 months ago

ncclementi commented 8 months ago

On Ibis we are working on supporting the DuckDB spatial extension, and we would like to better understand what comes out of geometry-typed columns when we select from them.

The docs say it's WKB, but when we try to convert the value to a shapely object we get an error. Calling st_aswkb first makes it work, but it doesn’t seem like this should be necessary.

Here is a reproducer that @cpcloud put together that shows this:

Maxxen commented 8 months ago


DuckDB's internal representation of the geometry type is not actually WKB. In fact, it is very similar to PostGIS's geometry representation, although with a different "header". However, I can't promise that our geometry representation is fully stable yet. I still want to extend it (soon!) so that we can store, for example, multi-dimensional geometries (Z/M), SRID, and/or other properties such as whether a geometry is valid/solid/geodetic. In theory we should be able to do this in a backwards-compatible way but, again, no promises.
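For contrast, here is a minimal sketch of what a WKB reader such as shapely expects at the start of a blob: a 1-byte byte-order flag (0 = big-endian, 1 = little-endian) followed by a 4-byte geometry type code. The helper names `make_wkb_point` and `read_wkb_prefix` are hypothetical and exist only for this illustration, and the rejected blob is an arbitrary byte string, not actual DuckDB output. The check is necessary but not sufficient: a blob that fails it cannot be WKB, while a blob that passes it may still be something else.

```python
import struct

# Standard WKB geometry type code for a 2D Point.
WKB_POINT = 1


def make_wkb_point(x: float, y: float) -> bytes:
    """Encode a 2D point as little-endian WKB (hypothetical helper)."""
    # Layout: byte-order flag (1 byte), type code (uint32), x and y (float64 each).
    return struct.pack("<BIdd", 1, WKB_POINT, x, y)


def read_wkb_prefix(blob: bytes) -> tuple[str, int]:
    """Return (byte order, geometry type code), or raise ValueError.

    Only the first 5 bytes are inspected, so this is a sanity check,
    not a full WKB parser.
    """
    if len(blob) < 5:
        raise ValueError("blob is too short to hold a WKB prefix")
    order_flag = blob[0]
    if order_flag not in (0, 1):
        raise ValueError(f"invalid WKB byte-order flag: {order_flag}")
    fmt = "<I" if order_flag == 1 else ">I"
    (type_code,) = struct.unpack_from(fmt, blob, 1)
    return ("little" if order_flag == 1 else "big", type_code)


if __name__ == "__main__":
    wkb = make_wkb_point(1.0, 2.0)
    print(len(wkb), read_wkb_prefix(wkb))

    # Arbitrary non-WKB bytes (NOT real DuckDB output) are rejected.
    try:
        read_wkb_prefix(b"\x07\x00\x00\x00\x00\x00\x00\x00")
    except ValueError as exc:
        print("rejected:", exc)
```

A blob in DuckDB's internal layout starts with its own header rather than this prefix, which is consistent with shapely failing on the raw column value and succeeding once st_aswkb has converted it.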

Here are some excerpts from the code that hopefully provide some insight into how the format works.

The header: (4 bytes)

The "properties" (1 byte)

The full geometry (always a multiple of 8 bytes). The comments here are slightly outdated: if the geometry is non-empty and not a point, there is an 8xF32 bounding box serialized after the "header" and the 4-byte padding.

Full Deserialization code