databendlabs / databend

tpch: load amazon-redshift sample data error #12891

Open BohuTANG opened 1 year ago

BohuTANG commented 1 year ago

Summary

CREATE TABLE IF NOT EXISTS nation (
    n_nationkey INTEGER NOT NULL,
    n_name STRING NOT NULL,
    n_regionkey INTEGER NOT NULL,
    n_comment STRING
);

COPY INTO nation FROM 's3://redshift-downloads/TPC-H/2.18/10GB/nation.tbl' file_format =(TYPE = CSV, field_delimiter = '|');

The same statement works on Snowflake, but Databend returns:

copy INTO nation
FROM
  's3://redshift-downloads/TPC-H/2.18/10GB/nation.tbl' file_format =(TYPE = CSV, field_delimiter = '|')

error happens after fetched 0 rows: APIError: PageError with 3002: PermissionDenied (persistent) at stat, context: { service: s3, path: TPC-H/2.18/10GB/nation.tbl } => no valid credential found, please check configuration or try again

The location of the TPC-H raw data is noted here: https://github.com/awslabs/amazon-redshift-utils/blob/master/src/CloudDataWarehouseBenchmark/Cloud-DWB-Derived-from-TPCH/10GB/ddl.sql#L114

Xuanwo commented 1 year ago

Please add allow_anonymous='true' to the CONNECTION options so that the request is sent without credentials.

For example:

copy INTO nation
FROM
  's3://redshift-downloads/TPC-H/2.18/10GB/nation.tbl' CONNECTION = (allow_anonymous='true' region='us-east-1') file_format =(TYPE = CSV field_delimiter = '|');
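A quick sanity check after the load could look like the sketch below. It assumes the COPY above succeeded and that the table was empty beforehand. The TPC-H specification fixes the nation table at 25 rows at every scale factor, so a simple count is a cheap way to confirm that the file was actually read.

```sql
-- Sanity check (assumes the COPY INTO above succeeded and nation was empty before).
-- TPC-H defines nation as a fixed-size table of 25 rows, independent of scale factor.
SELECT COUNT(*) AS nation_rows FROM nation;
```

A count of 0 would suggest the file was still not read.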