astronomer / astro-sdk

Astro SDK allows rapid and clean development of {Extract, Load, Transform} workflows using Python and SQL, powered by Apache Airflow.
https://astro-sdk-python.rtfd.io/
Apache License 2.0

Investigate - 10MB file transfer from gcs to redshift is taking too much time #799

Open utkarsharma2 opened 2 years ago

utkarsharma2 commented 2 years ago

Describe the bug

Data transfer from GCS to Redshift currently takes far too long. A 100 KB file took 6 minutes to transfer, and for a 10 MB file the process was killed before it finished.
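For a rough sense of scale, the reported numbers can be turned into a throughput figure. This is only a minimal sketch: it assumes transfer time scales linearly with file size (which may not hold, since the 10 MB run never completed) and that 1 MB = 1024 KB. The function names are illustrative and are not part of the Astro SDK.

```python
def throughput_kb_per_s(size_kb: float, seconds: float) -> float:
    """Average transfer rate in KB/s."""
    return size_kb / seconds


def projected_seconds(size_kb: float, rate_kb_per_s: float) -> float:
    """Time to move size_kb at a constant rate (linear-scaling assumption)."""
    return size_kb / rate_kb_per_s


# Reported in this issue: a 100 KB file took 6 minutes.
rate = throughput_kb_per_s(100, 6 * 60)

# Hypothetical projection for the 10 MB file, assuming 1 MB = 1024 KB.
estimate_s = projected_seconds(10 * 1024, rate)

print(f"rate: {rate:.2f} KB/s")
print(f"projected 10 MB transfer: {estimate_s / 3600:.1f} hours")
```

Under that assumption the observed rate is well below 1 KB/s, and a 10 MB load would need on the order of ten hours. The projection says nothing about why the process was killed; it only shows how far the observed rate is from a reasonable one.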

Version

To Reproduce

Steps to reproduce the behavior:

  1. Update the conf.json file to use the 10 MB and 100 KB datasets, and set the database to Redshift.
  2. Run run.sh in a terminal.

Expected behavior

Data should load in a reasonable time, and the process shouldn't be killed.

kaxil commented 2 years ago

Is this still the case, @utkarsharma2?