duckdb / duckdb_spatial


Core dump when creating a geometry column from WKB #305

Closed florentfgrs closed 4 months ago

florentfgrs commented 6 months ago

DuckDB crashes when I try to create a geometry column from a binary (WKB) column in Overture Maps data. I'm working with DuckDB 0.10.2 (same bug with 0.10.1).

With Python:

import duckdb 

con = duckdb.connect("test1.db")
con.sql(" INSTALL spatial ; INSTALL httpfs  ; LOAD spatial ; LOAD httpfs ; ")

sql_query = ("SELECT id, "
"geometry "
"FROM read_parquet('s3://overturemaps-us-west-2/release/2024-04-16-beta.0/theme=buildings/type=*/*', "
"filename=true, "
"hive_partitioning=1) "
"LIMIT 5; ")

con.sql(sql_query).show()

No problem here, we get a result: the geometry is a binary object. As indicated in the Overture Maps documentation, ST_GeomFromWKB() should be used to create the geometry column.

Result:

[image: screenshot of the query result]
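To make the "binary object" above more concrete, here is a minimal sketch of what decoding WKB involves. It is not DuckDB's implementation of ST_GeomFromWKB(); the helper name polygon_wkb_to_wkt is hypothetical, and it assumes a plain 2D Polygon (no Z/M or SRID flags). The sample ring reuses coordinates that appear in the output later in this thread.

```python
import struct


def polygon_wkb_to_wkt(wkb: bytes) -> str:
    """Decode a plain 2D WKB Polygon into WKT (illustrative helper only)."""
    # Byte 0 is the byte-order flag: 1 = little-endian, 0 = big-endian.
    endian = "<" if wkb[0] == 1 else ">"
    # Then a uint32 geometry type (3 = Polygon) and a uint32 ring count.
    geom_type, n_rings = struct.unpack_from(endian + "II", wkb, 1)
    if geom_type != 3:
        raise ValueError(f"expected Polygon (type 3), got type {geom_type}")
    offset = 9
    rings = []
    for _ in range(n_rings):
        # Each ring: uint32 point count, then that many (x, y) double pairs.
        (n_points,) = struct.unpack_from(endian + "I", wkb, offset)
        offset += 4
        coords = struct.unpack_from(f"{endian}{2 * n_points}d", wkb, offset)
        offset += 16 * n_points
        pairs = zip(coords[0::2], coords[1::2])
        rings.append("(" + ", ".join(f"{x!r} {y!r}" for x, y in pairs) + ")")
    return "POLYGON (" + ", ".join(rings) + ")"


# Build a little-endian WKB polygon by hand, then decode it again.
ring = [
    (-136.8028948, -74.7669439),
    (-136.8030749, -74.7670273),
    (-136.8020661, -74.7671777),
    (-136.801886, -74.7670942),
    (-136.8028948, -74.7669439),
]
sample_wkb = struct.pack("<BII", 1, 3, 1) + struct.pack("<I", len(ring))
for x, y in ring:
    sample_wkb += struct.pack("<dd", x, y)

print(polygon_wkb_to_wkt(sample_wkb))
```

In DuckDB itself the decoding is done by ST_GeomFromWKB(); the sketch only shows the layout of the bytes that function receives.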

import duckdb 

con = duckdb.connect("test2.db")
con.sql(" INSTALL spatial ; INSTALL httpfs  ; LOAD spatial ; LOAD httpfs ; ")

sql_query = ("SELECT id, "
"ST_GeomFromWKB(geometry) as geom "
"FROM read_parquet('s3://overturemaps-us-west-2/release/2024-04-16-beta.0/theme=buildings/type=*/*', "
"filename=true, "
"hive_partitioning=1) "
"LIMIT 5; ")

con.sql(sql_query).show()

Result:

Segmentation fault (core dumped)
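Because a segmentation fault kills the whole interpreter, one way to keep experimenting is to run the failing query in a child process and inspect its exit status. The sketch below is only an assumption about how one might do that; run_isolated and describe_exit are hypothetical helper names, and the child code here is a harmless placeholder rather than the DuckDB query.

```python
import signal
import subprocess
import sys


def run_isolated(code: str, timeout: float = 600.0) -> subprocess.CompletedProcess:
    """Run a Python snippet in a child interpreter so a crash cannot kill the caller."""
    return subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        timeout=timeout,
    )


def describe_exit(returncode: int) -> str:
    """Turn a child's return code into a short human-readable description."""
    if returncode == 0:
        return "ok"
    if returncode < 0:
        # On POSIX, a negative return code means the child died from that signal.
        try:
            return f"killed by {signal.Signals(-returncode).name}"
        except ValueError:
            return f"killed by signal {-returncode}"
    return f"exited with status {returncode}"


# Placeholder child code; replace it with the script that runs the query.
result = run_isolated("print('hello from the child')")
print(describe_exit(result.returncode), result.stdout.strip())
```

On Linux, a child that segfaults would be reported as "killed by SIGSEGV", while the parent session stays alive.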


Maxxen commented 6 months ago

Hi! Thanks for reporting this issue. I can't reproduce this from the CLI, but I'll take a look with Python tomorrow! My guess is that something goes wrong in the string conversion happening in the Python .show() code. Could you share some more information about your system, e.g. OS, CPU architecture, Python version?

florentfgrs commented 6 months ago

I also get the bug with the CLI.

System info:
OS: Kubuntu 22.04
CPU: Intel i7-11800H
Python version: 3.10.12 (main, Nov 20 2023, 15:14:05) [GCC 11.4.0]
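The details requested above (OS, CPU architecture, Python version) can also be collected with the standard library. This is a small sketch, not something used in the thread; system_report is a hypothetical helper, and the DuckDB version is added only if the duckdb package happens to be importable.

```python
import platform
import sys


def system_report() -> dict:
    """Collect the environment details a maintainer typically asks for."""
    report = {
        "os": platform.platform(),
        "cpu_architecture": platform.machine(),
        "python_version": sys.version.replace("\n", " "),
    }
    try:
        # Optional: include the DuckDB version when the package is installed.
        import duckdb

        report["duckdb_version"] = duckdb.__version__
    except ImportError:
        pass
    return report


for key, value in system_report().items():
    print(f"{key}: {value}")
```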

florentfgrs commented 6 months ago

I tested with another computer and I don't have the problem...

OS: Windows 11
CPU: Intel i9-11900K
Python version: 3.8.5

Maxxen commented 6 months ago

So I have a theory of what's going wrong here, but I need to do some refactoring to fix it. I'll keep you posted once I get it merged.

florentfgrs commented 6 months ago

Thanks

florentfgrs commented 5 months ago

Hello @Maxxen

Has the fix been made?

Maxxen commented 5 months ago

@florentfgrs Hi! Sorry, yes, I think this should be fixed on the nightly build. You can get it for v0.10.2 by executing:

FORCE INSTALL spatial FROM 'http://nightly-extensions.duckdb.org' ;

florentfgrs commented 5 months ago

@Maxxen Oh thank you very much, it solves my problem!

In [2]: import duckdb

In [3]: con = duckdb.connect("test3.db")

In [4]: con.sql("FORCE INSTALL spatial FROM 'http://nightly-extensions.duckdb.org' ; INSTALL httpfs  ; LOAD spatial ; LOAD httpfs ; ")
100% ▕████████████████████████████████████████████████████████████▏

In [5]: sql_query = ("SELECT id, "
   ...: "ST_GeomFromWKB(geometry) as geom "
   ...: "FROM read_parquet('s3://overturemaps-us-west-2/release/2024-04-16-beta.0/theme=buildings/type=*/*', "
   ...: "filename=true, "
   ...: "hive_partitioning=1) "
   ...: "LIMIT 5; ")

In [6]: con.sql(sql_query).show()
100% ▕████████████████████████████████████████████████████████████▏
┌──────────────────────┬─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│          id          │                                                                                                      geom                                                                                                       │
│       varchar        │                                                                                                    geometry                                                                                                     │
├──────────────────────┼─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ 08bf2a40219b0fff02…  │ POLYGON ((-167.3999539 -83.6500135, -167.3999133 -83.6500116, -167.3998804 -83.6500083, -167.3998582 -83.6500038, -167.3998506 -83.6499988, -167.3998584 -83.6499938, -167.3998807 -83.6499893, -167.3999152 …  │
│ 08bf35ad6a05afff02…  │ POLYGON ((-136.8028948 -74.7669439, -136.8030749 -74.7670273, -136.8020661 -74.7671777, -136.801886 -74.7670942, -136.8028948 -74.7669439))                                                                     │
│ 08bf35ad6a058fff02…  │ POLYGON ((-136.8033409 -74.7667304, -136.802516 -74.7668605, -136.8023483 -74.7667871, -136.8031732 -74.766657, -136.8033409 -74.7667304))                                                                      │
│ 08bf35ad6a04efff02…  │ POLYGON ((-136.8030898 -74.7660076, -136.8025614 -74.7659821, -136.8026186 -74.7659004, -136.8031471 -74.765926, -136.8030898 -74.7660076))                                                                     │
│ 08bf35ad6a04afff02…  │ POLYGON ((-136.8020881 -74.7661436, -136.8014847 -74.7660121, -136.8017343 -74.765933, -136.8023377 -74.7660645, -136.8020881 -74.7661436))                                                                     │
└──────────────────────┴─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘

Once version 0.10.3 is released, can I just do INSTALL spatial again?

Maxxen commented 5 months ago

Great! Yes, once v0.10.3 is released (scheduled in two weeks) you won't need the nightly channel anymore.
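The switch between the nightly channel and the regular install can be sketched as a small version check. This is only an illustration under the assumptions stated in this thread: the nightly build is confirmed for v0.10.2 only, and v0.10.3 is expected, not yet confirmed, to ship the fix. The helper name spatial_install_sql is hypothetical and it assumes a plain "X.Y.Z" version string.

```python
def spatial_install_sql(duckdb_version: str) -> str:
    """Pick the install statement for the spatial extension (illustrative only)."""
    # Assumes a plain "X.Y.Z" version, optionally prefixed with "v".
    parts = tuple(int(p) for p in duckdb_version.lstrip("v").split(".")[:3])
    if parts == (0, 10, 2):
        # Per the thread, the fix for v0.10.2 is available from the nightly channel.
        return "FORCE INSTALL spatial FROM 'http://nightly-extensions.duckdb.org';"
    # Any other version: regular install. This does not guarantee the fix is present.
    return "INSTALL spatial;"


print(spatial_install_sql("0.10.2"))
print(spatial_install_sql("0.10.3"))
```

The returned string could then be passed to con.sql() in the same way as the install statements shown earlier in the thread.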

danabauer commented 5 months ago

Excellent! Thank you for fixing this!