aws-samples / amazon-redshift-query-patterns-and-optimizations

In this workshop you will launch an Amazon Redshift cluster in your AWS account and load about 100 GB of sample data from the TPC-H dataset. You will learn which query patterns affect Redshift performance and how to optimize them. The lab also provides a framework to simulate workload management (WLM) queues, run concurrent queries at regular intervals, and measure performance metrics such as query throughput and query duration. It also covers some use cases for Redshift Spectrum, which queries data stored in S3 in columnar formats such as Parquet.
MIT No Attribution
25 stars, 22 forks

TPCH dataset #1

Open mashda1 opened 3 years ago

mashda1 commented 3 years ago

Hello, where can I get the ~100GB TPCH dataset? Thanks.

saunakc commented 3 years ago

The script to load the 100 GB of data is at https://github.com/aws-samples/amazon-redshift-query-patterns-and-optimizations/blob/master/sql_scripts/table_load.sql

rajeshkbabbar commented 6 months ago

Hey, is this still working? I am not able to load the files from S3. Thanks.