Shopify / camus

Kafka->HDFS pipeline from LinkedIn. It is a MapReduce job that does distributed data loads out of Kafka.
Deduplication task #111

Closed by olessia 6 years ago

olessia commented 6 years ago

Create a deduplication task in Camus that runs the speedboat deduplicator.