duckdb / duckdb_delta

DuckDB extension for Delta Lake
MIT License

Delta read on GCS seems to point to S3 instead #97

Open junhl opened 2 months ago

junhl commented 2 months ago

I have a Delta table on GCS and tried reading it with DuckDB's Delta extension.

Using the Python library, version 1.1.0:

import duckdb

with duckdb.connect() as con:
    # Register GCS credentials (placeholder values).
    con.execute(
        """
        CREATE SECRET (
          TYPE GCS,
          KEY_ID 'SOME_KEY',
          SECRET 'SOME SECRET'
        )
        """
    )
    # Read the Delta table directly from GCS.
    results = con.execute(
        "select col1, col2 from delta_scan('gs://some-bucket/some-table')"
    ).fetch_df()

This leads to

duckdb.duckdb.IOException: IO Error: Hit DeltaKernel FFI error (from: While trying to read from delta table: 'gs://some-bucket/some-table/'): Hit error: 8 (ObjectStoreError) with message (Error interacting with object store: Generic S3 error: Error after 10 retries in 4.432062125s, max_retries:10, retry_timeout:180s, source:error sending request for url (https://s3..amazonaws.com/some-bucket/some-table/_delta_log/_last_checkpoint))

The URL in the error message (https://s3..amazonaws.com/...) shows that the request was sent to an S3 endpoint instead of GCS, which cannot work.
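For anyone triaging similar reports, here is a minimal sketch of how to check which object-store host an error message refers to. The helper name `endpoint_in_error` and the host suffixes it matches are my own assumptions for illustration; none of this is part of DuckDB or the Delta extension.

```python
import re
from urllib.parse import urlparse


def endpoint_in_error(message: str) -> str:
    """Classify the object-store host mentioned in an error message.

    Hypothetical helper: it only matches on well-known host suffixes.
    """
    # Take the first URL in the message, stopping at whitespace or ')'.
    match = re.search(r"https?://[^\s)]+", message)
    if match is None:
        return "unknown"
    host = urlparse(match.group(0)).hostname or ""
    if host.endswith("amazonaws.com"):
        return "s3"
    if host.endswith("googleapis.com"):
        return "gcs"
    return "unknown"


# Tail of the error message reported above.
msg = (
    "source:error sending request for url "
    "(https://s3..amazonaws.com/some-bucket/some-table/_delta_log/_last_checkpoint))"
)
print(endpoint_in_error(msg))  # prints "s3"
```

Note that the host in the reported URL is `s3..amazonaws.com`, i.e. the region segment is empty.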

stele-and-rivers-001 commented 2 months ago

@junhl I am experiencing this issue as well. It looks like the Delta extension in v1.1.0 is accidentally pointing to S3 instead of GCS.

NOTE: downgrading to version 1.0.0 fixed the problem for me.

junhl commented 2 months ago

Thanks @stele-and-rivers-001. I can confirm it works in 1.0.0, so it looks like there was an accidental change in 1.1.0, as you said.
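As a stopgap, a script could check the installed version before attempting a GCS Delta read. The sketch below is only an assumption-laden illustration: the version sets contain just what has been reported in this thread so far, not an official compatibility list, and `gcs_delta_status` is a hypothetical name.

```python
# Versions as reported in this thread so far; not an official list.
REPORTED_BROKEN = {"1.1.0"}
REPORTED_WORKING = {"1.0.0"}


def gcs_delta_status(version: str) -> str:
    """Return what this thread reports about delta_scan on gs:// paths."""
    if version in REPORTED_BROKEN:
        return "reported broken"
    if version in REPORTED_WORKING:
        return "reported working"
    return "not reported"


print(gcs_delta_status("1.1.0"))  # prints "reported broken"
```

In a real script, the installed version can be passed in as `duckdb.__version__`.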

Marcus-Holanda777 commented 3 weeks ago

I hit the same problem in version 1.1.2. Is there any update on this?