duckdb / duckdb_spatial

Progress bar missing when reading 46 GB of GDB data #454

Open marklit opened 1 week ago

The ZIP is 8 GB and decompresses into ~50 files totalling 46 GB. No progress bar is shown at all, even when just counting records.

$ wget -c https://nationaladdressdata.s3.amazonaws.com/NAD_r17.zip
$ unzip NAD_r17.zip
$ ~/duckdb # v1.1.3 19864453f7
COPY (
    SELECT   * EXCLUDE(AddrPoint,
                       DateUpdate,
                       Longitude,
                       Latitude,
                       Shape),
             DateUpdate::TIMESTAMP         AS DateUpdate,
             ST_POINT(Longitude, Latitude) AS geom
    FROM     ST_READ('NAD_r17.gdb/a00000009.gdbtable')
    ORDER BY HILBERT_ENCODE([Longitude,
                             Latitude]::DOUBLE[2])
) TO 'NAD_r17.pq' (
  FORMAT            'PARQUET',
  CODEC             'ZSTD',
  COMPRESSION_LEVEL 22,
  ROW_GROUP_SIZE    15000);
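
The record count mentioned above could be reproduced with something like the following. This is a minimal sketch, not necessarily the exact query that was run: it assumes the spatial extension is available for ST_READ and reuses the table file from the COPY statement. `enable_progress_bar` is an existing DuckDB setting; it is set explicitly here only to rule out the bar having been switched off.

```sql
-- Assumed setup: the spatial extension provides ST_READ.
INSTALL spatial;
LOAD spatial;

-- Explicitly request the progress bar so a disabled setting can be ruled out.
SET enable_progress_bar = true;

-- Count the records in the same GDB table used in the COPY statement above.
SELECT COUNT(*)
FROM   ST_READ('NAD_r17.gdb/a00000009.gdbtable');
```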