duckdb / duckdb_spatial

MIT License
475 stars 35 forks source link

Serialization Error: Trying to read past end of buffer #353

Closed jmontrose closed 4 months ago

jmontrose commented 4 months ago

I get this when I try to get too fancy and use a nested query, so forgive me if this is user error. I have a two-state workaround that's fine, but I'm curious about the error.

I'm in love with DuckDB, thanks!

This fails:

copy (
    select 
        f.id, f.type, f.subtype, f.class, ST_GeomFromWkb(f.geometry) AS geometry
    from  
        (select 
            ST_GeomFromWkb(geometry) AS geometry
        from 
            read_parquet('s3://overturemaps-us-west-2/release/2024-06-13-beta.1/theme=divisions/type=division_area/*', hive_partitioning=1) 
        where 
            country = 'US' and 
            region = 'US-OR' and 
            subtype = 'region'
        limit 1) as r,
        read_parquet('s3://overturemaps-us-west-2/release/2024-06-13-beta.0/theme=base/type=water/*', hive_partitioning=1) as f
    where 
        bbox.xmin >= -125.48978996503888 and
        bbox.xmax <= -115.8454962385643 and
        bbox.ymin >= 41.310142574181384 and
        bbox.ymax <= 46.75110084463407 and
        ST_Intersects(
            ST_GeomFromWkb(f.geometry), 
            ST_GeomFromWkb(r.geometry))
) to 'broken.fgb'
with (format gdal, driver 'FlatGeobuf');

This succeeds (previously saved the output):

copy (
    select 
        f.id, f.type, f.subtype, f.class, ST_GeomFromWkb(f.geometry) AS geometry
    from read_parquet('oregon_region.parquet') as r,
        read_parquet('s3://overturemaps-us-west-2/release/2024-06-13-beta.0/theme=base/type=water/*', hive_partitioning=1) as f
    where 
        bbox.xmin >= -125.48978996503888 and
        bbox.xmax <= -115.8454962385643 and
        bbox.ymin >= 41.310142574181384 and
        bbox.ymax <= 46.75110084463407 and
        ST_Intersects(
            ST_GeomFromWkb(f.geometry), 
            ST_GeomFromWkb(r.geometry))
) to 'intermediate.fgb'
with (format gdal, driver 'FlatGeobuf');

Both queries start with:

INSTALL spatial;
INSTALL httpfs;
LOAD spatial;
LOAD httpfs;
SET s3_region='us-west-2';
Maxxen commented 4 months ago

I think the issue is here

    select 
        f.id, f.type, f.subtype, f.class, ST_GeomFromWkb(f.geometry) AS geometry
    from  
        (select 
            ST_GeomFromWkb(geometry) AS geometry

You're converting from WKB twice, as GEOMETRY is implicitly castable from BLOB. I think this has been changed in the latest dev build so that geometry won't implicitly cast to blobs anymore and this will thus result in a binder exception.

jmontrose commented 4 months ago

Ouch! That did it, thank you for the quick response 🙏