duckdb / duckdb_spatial

Floating point exception with local PQ Overture files #368

Closed: marklit closed this issue 3 months ago

marklit commented 4 months ago

Querying Overture's July release directly from S3 works without any issues:

COPY (
    SELECT h3_cell_to_boundary_wkt(
                h3_latlng_to_cell(bbox.ymax, bbox.xmax, 5))::geometry geom,
            COUNT(*)
    FROM read_parquet('s3://overturemaps-us-west-2/release/2024-07-22.0/theme=addresses/type=address/*.parquet')
    WHERE country = 'CA'
    group by 1
) TO 'addresses.ca.gpkg'
    WITH (FORMAT GDAL,
          DRIVER 'GPKG',
          LAYER_CREATION_OPTIONS 'WRITE_BBOX=YES');

But running the same query against a local copy of those files raises a floating point exception:

$ aws s3 --no-sign-request sync \
    s3://overturemaps-us-west-2/release/2024-07-22.0/theme=addresses/type=address \
    ./addresses
$ cd addresses
$ ~/duckdb # v1.0.0 1f98600c2c
COPY (
    SELECT h3_cell_to_boundary_wkt(
                h3_latlng_to_cell(bbox.ymax, bbox.xmax, 5))::geometry geom,
            COUNT(*)
    FROM read_parquet('*.parquet')
    WHERE country = 'CA'
    group by 1
) TO 'addresses.ca.2.gpkg'
    WITH (FORMAT GDAL,
          DRIVER 'GPKG',
          LAYER_CREATION_OPTIONS 'WRITE_BBOX=YES');
Floating point exception

marklit commented 4 months ago

This might also be related to another JSON decoding issue that behaves differently over S3 than locally: https://gist.github.com/marklit/3a0d57a0558a80cd387e5da7be4dce96#gistcomment-5129094

marklit commented 3 months ago

I'm now seeing the issue on S3 with August's data. I'm using v1.0.0 1f98600c2c

COUNTRY=US

echo "COPY (
          SELECT h3_cell_to_boundary_wkt(
                      h3_latlng_to_cell(bbox.ymax, bbox.xmax, 5))::geometry geom,
                  COUNT(*)
          FROM read_parquet('s3://overturemaps-us-west-2/release/2024-08-20.0/theme=addresses/type=address/*.parquet')
          WHERE country = '$COUNTRY'
          group by 1
      ) TO 'addresses.2024.08.$COUNTRY.gpkg'
          WITH (FORMAT GDAL,
                DRIVER 'GPKG',
                LAYER_CREATION_OPTIONS 'WRITE_BBOX=YES')" | ~/duckdb
 13% ▕███████▊                                                    ▏ Floating point exception

marklit commented 3 months ago

The issue is probably happening outside of the Spatial extension, so I raised this instead: https://github.com/duckdb/duckdb/issues/13504
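
A minimal sketch of how the crash could be narrowed down without involving the Spatial extension: run the same aggregation per file against the local copy, dropping the ::geometry cast and the GDAL writer, and flag any file whose DuckDB process dies with SIGFPE. The helper name died_with_sigfpe is hypothetical, the ./addresses path and ~/duckdb binary are taken from the commands above, and the exit status 136 (128 + 8) assumes a Linux shell such as bash or dash. It also assumes the H3 extension is available in a fresh session, as it was in the original runs.

```shell
#!/bin/sh
# Hypothetical helper: run a command and succeed only if it was killed by
# SIGFPE. Shells report death-by-signal as 128 + signal number, and SIGFPE
# is signal 8 on Linux, hence 136.
died_with_sigfpe() {
    "$@" > /dev/null 2>&1
    [ "$?" -eq 136 ]
}

# Run the aggregation one file at a time, without the ::geometry cast or the
# GDAL COPY, so the Spatial extension is not part of the query.
for f in ./addresses/*.parquet; do
    [ -e "$f" ] || continue   # skip when the glob matches nothing

    if echo "SELECT h3_latlng_to_cell(bbox.ymax, bbox.xmax, 5),
                    COUNT(*)
             FROM read_parquet('$f')
             WHERE country = 'CA'
             GROUP BY 1;" | died_with_sigfpe ~/duckdb; then
        echo "Floating point exception on: $f"
    fi
done
```

If a file is reported here, the crash reproduces without the Spatial extension, which would support filing it against DuckDB itself. If nothing is reported, the cast or the GDAL writer is still a candidate.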