snowplow / dataflow-runner

Run templatable playbooks of Hadoop/Spark/et al jobs on Amazon EMR
http://snowplowanalytics.com

Consider supporting spot instances #46

Open BenFradet opened 6 years ago

BenFradet commented 6 years ago

The situation for Spark jobs running on spot instances in EMR has been improving recently (https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-spark-configure.html), so it might be worth supporting them.

BenFradet commented 6 years ago

see snowplow/snowplow#3634

VolatileBit commented 1 year ago

I'd like to bump this request.

Recently we've run into out-of-capacity errors from EMR, which means we need to make our EMR clusters more flexible about instance provisioning. That in turn means using instance fleets in EMR, which the dataflow-runner cluster config schema doesn't currently support.

Supporting instance fleets would bring a couple of benefits:

Is it possible to support this?
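
To make the request concrete, here is a rough sketch of what an instance fleet definition looks like on the EMR side. It only builds the `InstanceFleets` structure in the shape the EMR RunJobFlow API accepts; it is not dataflow-runner's cluster config schema, and the helper name `build_instance_fleets`, its parameters, and the instance types are all made up for illustration.

```python
# Sketch only: builds the "InstanceFleets" structure in the shape used by the
# EMR RunJobFlow API. This is NOT dataflow-runner's cluster config schema.
# The function name, parameters and instance types are hypothetical.

def build_instance_fleets(core_types, spot_units, on_demand_units=0,
                          timeout_minutes=20):
    """Return a master fleet and a core fleet that prefers spot capacity."""
    if not core_types:
        raise ValueError("at least one core instance type is required")

    # Single on-demand master node, using the first listed type.
    master = {
        "Name": "master",
        "InstanceFleetType": "MASTER",
        "TargetOnDemandCapacity": 1,
        "InstanceTypeConfigs": [{"InstanceType": core_types[0]}],
    }

    # Core fleet: EMR may pick any of the listed types to reach the target
    # capacity, which is what helps when one type is out of capacity.
    core = {
        "Name": "core",
        "InstanceFleetType": "CORE",
        "TargetOnDemandCapacity": on_demand_units,
        "TargetSpotCapacity": spot_units,
        "InstanceTypeConfigs": [
            {"InstanceType": t, "WeightedCapacity": 1} for t in core_types
        ],
        "LaunchSpecifications": {
            "SpotSpecification": {
                # If spot capacity is not fulfilled within the timeout,
                # fall back to on-demand instead of terminating the cluster.
                "TimeoutDurationMinutes": timeout_minutes,
                "TimeoutAction": "SWITCH_TO_ON_DEMAND",
            }
        },
    }
    return [master, core]


fleets = build_instance_fleets(
    ["m5.xlarge", "m5a.xlarge", "m4.xlarge"], spot_units=4
)
```

With boto3, a list like this would be passed as `Instances={"InstanceFleets": fleets, ...}` to the EMR client's `run_job_flow`; how (or whether) dataflow-runner would expose the same fields in its own schema is an open question for this issue.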