Shopify / camus

Kafka->HDFS pipeline from LinkedIn. It is a MapReduce job that does distributed data loads out of Kafka.

Better upload sanity check for gcs upload #118

Status: Closed (olessia closed 6 years ago)

olessia commented 6 years ago

List the files before uploading, then make sure all of those files are in GCS afterwards. This should avoid the false-positive failures that occur when Camus is writing to the folder at the time of upload.
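A minimal sketch of the check described above, not the code from this PR. The function name `missing_after_upload` and its three callables are hypothetical stand-ins for whatever lists the source folder, performs the upload, and lists the GCS destination. The idea is to snapshot the listing before the upload and compare only that snapshot against the destination, so a file Camus writes mid-upload is not reported as missing.

```python
from typing import Callable, Iterable, Set


def missing_after_upload(
    list_source: Callable[[], Iterable[str]],
    upload: Callable[[], None],
    list_destination: Callable[[], Iterable[str]],
) -> Set[str]:
    """Return the files from the pre-upload snapshot that are absent from the destination.

    All three callables are hypothetical hooks, injected so the check
    can be exercised without a real HDFS or GCS client.
    """
    # Snapshot the source listing BEFORE the upload starts.
    snapshot = set(list_source())

    # Run the upload; files may still be appearing in the source folder.
    upload()

    # Compare only the snapshot against the destination. Files written
    # after the snapshot are ignored and left for the next run.
    return snapshot - set(list_destination())


if __name__ == "__main__":
    source = ["part-0.gz", "part-1.gz"]
    destination = []

    def fake_upload() -> None:
        # Copy what exists right now, then simulate a concurrent write.
        destination.extend(source)
        source.append("part-2.gz")

    missing = missing_after_upload(
        lambda: list(source), fake_upload, lambda: list(destination)
    )
    print(missing)  # prints "set()"
```

Listing the source again after the upload and comparing that against the destination would flag `part-2.gz` as missing in this simulation, which is the false positive the snapshot avoids.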