cloudera-labs / envelope

Build configuration-driven ETL pipelines on Apache Spark
Apache License 2.0
157 stars 89 forks

Aside from Kudu, what makes this project require CDH? #16

Closed. DonDebonair closed this issue 7 years ago

DonDebonair commented 7 years ago

I think this project is a great idea! There are many simple, common ETL-like tasks that you wouldn't want to write a complete Spark job for, when you could just configure what you want instead. Envelope seems to do just that. It seems, however, that this should not be limited to clusters running CDH. I can imagine it running just fine on Hortonworks or MapR clusters, except for the Kudu parts, that is.

Care to comment?

jeremybeard commented 7 years ago

I'm glad you think this project could be useful!

I don't see any fundamental reason why Envelope wouldn't work on other distributions, as long as the versions of the dependencies are compatible with each other and with the cluster. The presence of the Kudu integration shouldn't break pipelines on clusters that don't have Kudu either, provided the pipeline doesn't try to access Kudu, of course!

jeremybeard commented 7 years ago

If you've got any more questions, please drop a line on the Cloudera Community forum: http://community.cloudera.com/t5/Cloudera-Labs/bd-p/ClouderaLabs

Thanks!