snowplow / dataflow-runner

Run templatable playbooks of Hadoop/Spark/et al jobs on Amazon EMR
http://snowplowanalytics.com

Reduce repetition in jobflow step logging #39

Status: Closed (alexanderdean closed 6 years ago)

alexanderdean commented 6 years ago

The logs repeat the state of the jobflow steps over and over, which makes it extremely hard to see what is going on. In the excerpt below, the same two steps are reported as completed four times each in about a minute:

time="2018-01-05T17:40:28Z" level=info msg="Step 'Hadoop Shred: shred enriched events for Redshift' with id 's-xxx' completed successfully"
time="2018-01-05T17:40:43Z" level=info msg="Step 'S3DistCp Step: Enriched events -> staging S3' with id 's-xxx' completed successfully"
time="2018-01-05T17:40:44Z" level=info msg="Step 'Hadoop Shred: shred enriched events for Redshift' with id 's-xxx' completed successfully"
time="2018-01-05T17:40:59Z" level=info msg="Step 'S3DistCp Step: Enriched events -> staging S3' with id 's-xxx' completed successfully"
time="2018-01-05T17:40:59Z" level=info msg="Step 'Hadoop Shred: shred enriched events for Redshift' with id 's-xxx' completed successfully"
time="2018-01-05T17:41:15Z" level=info msg="Step 'S3DistCp Step: Enriched events -> staging S3' with id 's-xxx' completed successfully"
time="2018-01-05T17:41:15Z" level=info msg="Step 'Hadoop Shred: shred enriched events for Redshift' with id 's-xxx' completed successfully"
time="2018-01-05T17:41:31Z" level=info msg="Step 'S3DistCp Step: Enriched events -> staging S3' with id 's-xxx' completed successfully"

alexanderdean commented 6 years ago

Let's keep this in for debugging.
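
One way to read the two comments together is to report a step's state at info level only when it changes, and to demote repeats of an unchanged state to debug level. The Go sketch below illustrates that idea under stated assumptions: `stepTracker`, `levelFor`, and the step IDs `s-1` and `s-2` are hypothetical names made up for illustration, and nothing here is taken from the dataflow-runner code base.

```go
package main

import "fmt"

// stepTracker remembers the last state reported for each step ID.
// Hypothetical helper for illustration; not part of dataflow-runner.
type stepTracker struct {
	last map[string]string
}

func newStepTracker() *stepTracker {
	return &stepTracker{last: make(map[string]string)}
}

// levelFor returns "info" the first time a step is observed in a given
// state and "debug" for every consecutive repeat of that same state.
func (t *stepTracker) levelFor(stepID, state string) string {
	if prev, ok := t.last[stepID]; ok && prev == state {
		return "debug"
	}
	t.last[stepID] = state
	return "info"
}

func main() {
	t := newStepTracker()
	// Simulated polling results; the IDs are placeholders.
	polls := []struct{ id, state string }{
		{"s-1", "RUNNING"},
		{"s-1", "RUNNING"},
		{"s-1", "COMPLETED"},
		{"s-1", "COMPLETED"},
		{"s-2", "COMPLETED"},
	}
	for _, p := range polls {
		fmt.Printf("level=%s step=%s state=%s\n", t.levelFor(p.id, p.state), p.id, p.state)
	}
}
```

With a tracker like this, the repeated "completed successfully" lines would still be available when running at debug level, while the default info output would show each transition once.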