Closed douglaz closed 7 years ago
What's the motivation for adding this?
I am wary about adding more options for Spark, since there is potentially a lot of stuff we could end up duplicating into Flintrock. Typically, I'd just say "Use `run-command` to deploy whatever config changes you want to the cluster." It's clumsy but generic.
Regarding executor instances in particular, why not vary your instance size or count if you want to control how many executors you have?
@nchammas I'm sorry, but I'm so used to this feature that I don't even know how people do without it. I mean, how can you run the same job on an x1.32xlarge and an r3.xlarge without using this option or, alternatively, tweaking other JVM/Spark parameters?
Our solution is to break any big machine (a big machine being any machine with more than 60 GB of RAM) into smaller machines (JVMs) of ~30 GB each. So in this example, we would set the x1.32xlarge to 64 executor instances. This way we always have a homogeneous cluster from the perspective of the executor JVM, letting us use the same GC defaults for any configuration. Also, even in cases where no GC tuning is needed, such as an r3.2xlarge, multiple smaller JVMs seem to run faster than a single big JVM (e.g., networking/data transfer is improved).
So this single, simple option lets us avoid many other configuration settings and the need to handle different machines in different ways.
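The sizing rule described here can be sketched as a small shell calculation. The 1952 GB figure for an x1.32xlarge and the 30 GB target are assumed values for illustration only; integer division yields 65 here, which one might round down to 64 in practice:

```shell
# Hypothetical sizing helper for the rule above: split a big machine's
# RAM into executor JVMs of roughly 30 GB each.
total_ram_gb=1952        # assumed RAM of an x1.32xlarge, for illustration
target_heap_gb=30        # desired per-executor JVM size
executor_instances=$(( total_ram_gb / target_heap_gb ))
echo "$executor_instances"   # 65 (round down to 64 in practice)
```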
@nchammas rewritten to conform with current master
Thanks for sharing the additional information. Makes sense to me. I'm curious if @BenFradet, @serialx, or any other users using Flintrock with large instance types have felt the need to use this setting before.
I'll try to code review this PR next, hopefully by next weekend (though, as with most open source, I can't make any promises).
The code seems okay to me. `export SPARK_EXECUTOR_CORES="$(($(nproc) / {spark_executor_instances}))"` could be a reasonable default.
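As a sketch of how that suggested default would evaluate, here is a toy shell example. The 64-core count and the instance count of 4 are assumed values, and `nproc` is stubbed so the snippet is self-contained:

```shell
# Toy evaluation of the suggested default; nproc is stubbed with an
# assumed 64-core machine so the example runs anywhere.
nproc() { echo 64; }          # stand-in for the real nproc(1)
spark_executor_instances=4    # hypothetical configured value
export SPARK_EXECUTOR_CORES="$(($(nproc) / spark_executor_instances))"
echo "$SPARK_EXECUTOR_CORES"  # 16
```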
It is recommended to set SPARK_EXECUTOR_INSTANCES when the instance has many cores. I also used to set SPARK_EXECUTOR_INSTANCES greater than 2 when we used the spark-ec2 script, but resorted to SPARK_EXECUTOR_INSTANCES=1 after we migrated to Flintrock. It would be good to support this feature in Flintrock. This setting also makes sense when doing benchmarks.
Sorry for dragging my feet on this. Thank you for contributing this feature and persisting through the review process @douglaz!
Thanks also to @serialx for providing his thoughts, which helped me decide what to do here.
This PR makes SPARK_EXECUTOR_INSTANCES configurable.
I tested this PR by manually launching a cluster and running a job.