snowplow / dataflow-runner

Run templatable playbooks of Hadoop/Spark/et al jobs on Amazon EMR
http://snowplowanalytics.com

Add support for specifying applications to install on cluster #6

Closed alexanderdean closed 7 years ago

jbeemster commented 7 years ago

Are these applications that are usually installed using Jobflow Steps?

BenFradet commented 7 years ago

FYI, we're using Elasticity's applications in EmrEtlRunner:

which seem to be added to the payload here:

https://github.com/rslifka/elasticity/blob/master/lib/elasticity/job_flow.rb#L221
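For illustration only, here is a minimal sketch of what "adding applications to the payload" could look like. It assumes the Amazon EMR `RunJobFlow` request shape, where `Applications` is a list of objects that each carry a `Name`. The helper names `build_applications` and `build_run_job_flow_payload` are hypothetical; they are not taken from Elasticity, EmrEtlRunner, or dataflow-runner, and the sketch is in Python rather than either project's own language.

```python
from typing import Dict, List


def build_applications(names: List[str]) -> List[Dict[str, str]]:
    """Convert plain application names into the list-of-objects shape
    used by the Applications field of an EMR RunJobFlow request."""
    # Trim whitespace and skip empty entries so that a sloppy config
    # does not produce an application with an empty name.
    return [{"Name": name.strip()} for name in names if name.strip()]


def build_run_job_flow_payload(
    cluster_name: str, release_label: str, applications: List[str]
) -> dict:
    """Assemble a partial RunJobFlow payload (hypothetical helper).

    Only Name, ReleaseLabel and Applications are set here; a real
    request needs further fields such as Instances.
    """
    payload: dict = {"Name": cluster_name, "ReleaseLabel": release_label}
    apps = build_applications(applications)
    if apps:
        # Leave the key out entirely when no applications are requested.
        payload["Applications"] = apps
    return payload


if __name__ == "__main__":
    print(build_run_job_flow_payload("example", "emr-5.0.0", ["Hadoop", "Spark"]))
```

Omitting the `Applications` key when the list is empty is a design choice of this sketch, so that a playbook which names no applications leaves the request unchanged.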

alexanderdean commented 7 years ago

Assigning to Ben 😄