Closed by enr1c091 4 years ago
You need to upload the sales sample data to aws-etl-orchestrator-demo-raw-data/sales and the marketing sample data to aws-etl-orchestrator-demo-raw-data/marketing.
For example:

aws s3 ls s3://aws-etl-orchestrator-demo-raw-data --region ap-northeast-1 --profile us-east-1 --recursive
2019-12-26 17:39:42          0 marketing/
2019-12-26 17:43:36     151746 marketing/MarketingData_QuickSightSample.csv
2019-12-26 17:42:55          0 sales/
2019-12-26 17:43:51    2002910 sales/SalesPipeline_QuickSightSample.csv
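If you would rather script this check than read the listing by eye, here is a minimal sketch in Python. The helper names (missing_prefixes, list_bucket) are hypothetical and not part of the sample project, and the boto3 part assumes you have boto3 installed and credentials configured. It only verifies that each expected prefix contains at least one non-empty object, which is a likely reason for a zero count but not necessarily the only one.

```python
# Prefixes the sample data is expected under, per the comment above.
EXPECTED_PREFIXES = ("sales/", "marketing/")


def missing_prefixes(objects, prefixes=EXPECTED_PREFIXES):
    """Return the prefixes that have no non-empty object beneath them.

    `objects` is an iterable of (key, size) pairs, e.g. from an S3 listing.
    Zero-byte "folder" markers such as 'sales/' do not count as data.
    """
    objects = list(objects)  # allow generators; we scan once per prefix
    missing = []
    for prefix in prefixes:
        has_data = any(
            key.startswith(prefix) and size > 0 for key, size in objects
        )
        if not has_data:
            missing.append(prefix)
    return missing


def list_bucket(bucket, profile=None, region=None):
    """Yield (key, size) for every object in `bucket` (hypothetical helper).

    boto3 is imported lazily so missing_prefixes() can be used without it.
    """
    import boto3

    session = boto3.Session(profile_name=profile, region_name=region)
    paginator = session.client("s3").get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket):
        for obj in page.get("Contents", []):
            yield obj["Key"], obj["Size"]


if __name__ == "__main__":
    # The listing shown above, as (key, size) pairs.
    listing = [
        ("marketing/", 0),
        ("marketing/MarketingData_QuickSightSample.csv", 151746),
        ("sales/", 0),
        ("sales/SalesPipeline_QuickSightSample.csv", 2002910),
    ]
    print(missing_prefixes(listing))
```

To run it against the real bucket, pass list_bucket("aws-etl-orchestrator-demo-raw-data") to missing_prefixes; any prefix it returns still needs its CSV uploaded.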
As @liangruibupt pointed out, the sample datasets have to be uploaded first. The project README has been updated with instructions for copying the datasets.
Hi,
I am running this sample and, for a reason I can't figure out, process_marketing_data.py isn't writing the output file to S3, and the Count: log in CWL returns 0. As a result, the Join step fails because it can't infer the schema of the Parquet file.