Project to learn Data Engineering from: https://github.com/DataTalksClub/data-engineering-zoomcamp
DATAENG-1: Get a successful run of the Populate Tripdata Pipeline for the yellow_cab tripdata sets #1
Closed
nimbly-dev closed 6 days ago
Currently, a pipeline run with the parameters below fails. Fix the pipeline so that it populates the yellow_cab datasets.
Endpoint: http://localhost:6789/api/pipeline_schedules/5/pipeline_runs/51372ce952da4ce4bc70cbe37eda0ff2
Parameters:
{
  "pipeline_run": {
    "variables": {
      "dev_limit_rows": -1,
      "end_month": 12,
      "end_year": 2022,
      "start_month": 1,
      "start_year": 2021,
      "pipeline_run_name": "populate_yellowtripdata_2021_2022",
      "spark_mode": "cluster",
      "tripdata_type": "yellow_cab_tripdata",
      "data_loss_threshold": "very_strict",
      "overwrite_enabled": true
    }
  }
}
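To reproduce the failing run, the request could be sent from a small script. This is a minimal sketch, not the project's own tooling: it assumes the endpoint accepts an HTTP POST with a JSON body, and the helper names `build_trigger_request` and `trigger` are hypothetical. The endpoint and variables are copied from this issue.

```python
import json
import urllib.request

# Endpoint and variables copied from the issue description.
ENDPOINT = (
    "http://localhost:6789/api/pipeline_schedules/5/"
    "pipeline_runs/51372ce952da4ce4bc70cbe37eda0ff2"
)

VARIABLES = {
    "dev_limit_rows": -1,
    "end_month": 12,
    "end_year": 2022,
    "start_month": 1,
    "start_year": 2021,
    "pipeline_run_name": "populate_yellowtripdata_2021_2022",
    "spark_mode": "cluster",
    "tripdata_type": "yellow_cab_tripdata",
    "data_loss_threshold": "very_strict",
    "overwrite_enabled": True,
}


def build_trigger_request(endpoint, variables):
    """Build the request without sending it (hypothetical helper).

    Assumption: the trigger endpoint expects POST with a JSON body
    shaped as {"pipeline_run": {"variables": {...}}}.
    """
    body = json.dumps({"pipeline_run": {"variables": variables}}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def trigger(endpoint, variables, timeout=30):
    """Send the request and return the decoded JSON response (hypothetical helper)."""
    request = build_trigger_request(endpoint, variables)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `trigger(ENDPOINT, VARIABLES)` against a running local instance would send the same parameters listed above. Building the request separately from sending it makes it possible to inspect the payload without touching the server.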
When done, attach the populated lakehouse, stage, and production DB, together with the successful pipeline build.