duckdb / duckdb_aws

Can't read from S3 bucket (HTTP 403) while using aws cli with same credentials works fine #42

Open petervalencic opened 5 months ago

petervalencic commented 5 months ago

Hi, I have set up an in-memory DuckDB database in DBeaver and executed:

INSTALL httpfs;
LOAD httpfs;
INSTALL aws;
LOAD aws;
CALL load_aws_credentials('default');

When I try to fetch the data with:

select * from read_parquet('s3://useast1-...../etl_framework/finalparquetfiles/fct_nano_data_parq/CC_Data/data_date=2024-04-01/hour=10/000165_0');

I get:

.....com/etl_framework/finalparquetfiles/fct_nano_data_parq/CC_Data/data_date%3D2024-04-01/hour%3D10/000165_0' (HTTP 403)

The stack trace is:

org.jkiss.dbeaver.model.sql.DBSQLException: SQL Error: java.sql.SQLException: HTTP Error: HTTP GET error on 'https://.....dataload-perc.s3.amazonaws.com/etl_framework/finalparquetfiles/fct_nano_data_parq/CC_Data/data_date%3D2024-04-01/hour%3D10/000165_0' (HTTP 403)
    at org.jkiss.dbeaver.model.impl.jdbc.exec.JDBCStatementImpl.executeStatement(JDBCStatementImpl.java:133)
    at org.jkiss.dbeaver.ui.editors.sql.execute.SQLQueryJob.executeStatement(SQLQueryJob.java:614)
    at org.jkiss.dbeaver.ui.editors.sql.execute.SQLQueryJob.lambda$2(SQLQueryJob.java:505)
    at org.jkiss.dbeaver.model.exec.DBExecUtils.tryExecuteRecover(DBExecUtils.java:194)
    at org.jkiss.dbeaver.ui.editors.sql.execute.SQLQueryJob.executeSingleQuery(SQLQueryJob.java:524)
    at org.jkiss.dbeaver.ui.editors.sql.execute.SQLQueryJob.extractData(SQLQueryJob.java:976)
    at org.jkiss.dbeaver.ui.editors.sql.SQLEditor$QueryResultsContainer.readData(SQLEditor.java:4155)
    at org.jkiss.dbeaver.ui.controls.resultset.ResultSetJobDataRead.lambda$0(ResultSetJobDataRead.java:123)
    at org.jkiss.dbeaver.model.exec.DBExecUtils.tryExecuteRecover(DBExecUtils.java:194)
    at org.jkiss.dbeaver.ui.controls.resultset.ResultSetJobDataRead.run(ResultSetJobDataRead.java:121)
    at org.jkiss.dbeaver.ui.controls.resultset.ResultSetViewer$ResultSetDataPumpJob.run(ResultSetViewer.java:5152)
    at org.jkiss.dbeaver.model.runtime.AbstractJob.run(AbstractJob.java:115)
    at org.eclipse.core.internal.jobs.Worker.run(Worker.java:63)
Caused by: java.sql.SQLException: java.sql.SQLException: HTTP Error: HTTP GET error on 'https://u...dataload-perc.s3.amazonaws.com/etl_framework/finalparquetfiles/fct_nano_data_parq/CC_Data/data_date%3D2024-04-01/hour%3D10/000165_0' (HTTP 403)
    at org.duckdb.DuckDBPreparedStatement.prepare(DuckDBPreparedStatement.java:122)
    at org.duckdb.DuckDBPreparedStatement.execute(DuckDBPreparedStatement.java:196)
    at org.jkiss.dbeaver.model.impl.jdbc.exec.JDBCStatementImpl.execute(JDBCStatementImpl.java:330)
    at org.jkiss.dbeaver.model.impl.jdbc.exec.JDBCStatementImpl.executeStatement(JDBCStatementImpl.java:131)
    ... 12 more
Caused by: java.sql.SQLException: HTTP Error: HTTP GET error on 'https://....dataload-perc.s3.amazonaws.com/etl_framework/finalparquetfiles/fct_nano_data_parq/CC_Data/data_date%3D2024-04-01/hour%3D10/000165_0' (HTTP 403)
    at org.duckdb.DuckDBNative.duckdb_jdbc_prepare(Native Method)
    at org.duckdb.DuckDBPreparedStatement.prepare(DuckDBPreparedStatement.java:116)
    ... 15 more

From what I can see, the s3:// path is converted to an https URL, with each `=` percent-encoded as `%3D`: https://useast1-......ataload-perc.s3.amazonaws.com/etl_framework/finalparquetfiles/fct_nano_data_parq/CC_Data/data_date%3D2024-04-01/hour%3D10/000165_0
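
For reference, `%3D` is only the percent-encoded spelling of `=`, so the encoded URL still names the same object key. Here is a minimal sketch of that round trip, assuming Python's standard `urllib.parse`. The `key` value is a shortened, hypothetical fragment modeled on the path above (the bucket and prefix are elided in this report). It says nothing about how DuckDB builds or signs the request.

```python
from urllib.parse import quote, unquote

# Hypothetical key fragment modeled on the path in this report.
key = "CC_Data/data_date=2024-04-01/hour=10/000165_0"

# Percent-encode everything except "/" so that "=" becomes "%3D",
# matching the spelling seen in the error message.
encoded = quote(key, safe="/")
print(encoded)

# Decoding restores the original key, so both spellings refer to the same object.
decoded = unquote(encoded)
print(decoded == key)
```

If the round trip holds, as it should for this key, the encoding alone would not explain the 403. Credentials, session token, or region may be more likely places to look.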

This is the response I get if I open the URL in a browser: (image)

From the command line I am able to download the same file: (image)

What could be wrong?

BR Peter

editicalu commented 3 months ago

Looks similar to https://github.com/duckdb/duckdb_aws/issues/31