Open AlJohri opened 4 years ago
Spark on EMR has a maximizeResourceAllocation option that makes it easy to run a Spark job with all of the cluster's resources. It takes the approach of creating "fat" workers (fewer workers, each with more resources).
An older version of this implementation can be found here: https://github.com/aws-samples/emr-bootstrap-actions/blob/6065841329f81d00e402df909bd69fff9af6e62e/spark/maximize-spark-default-config
It would be great to have a similar option when using Dask on EMR so the user has to do less custom configuration.
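To make the request concrete, here is a rough sketch of the kind of calculation such an option might do for Dask: one "fat" worker per core node, using all of that node's vcores and most of its memory. Everything here is an assumption for illustration; `plan_fat_workers`, `FatWorkerPlan`, and the `reserved_fraction` headroom are hypothetical names and values, not an existing Dask or EMR API, and the real node sizes would have to come from the cluster itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FatWorkerPlan:
    """Hypothetical container for the derived worker settings."""
    n_workers: int
    threads_per_worker: int
    memory_per_worker_mib: int


def plan_fat_workers(core_nodes, vcores_per_node, memory_mib_per_node,
                     reserved_fraction=0.1):
    """Derive a "fat worker" layout: one worker per core node.

    reserved_fraction is an assumed share of node memory left for the
    OS and other daemons; it is a guess, not a value taken from EMR.
    """
    if core_nodes < 1 or vcores_per_node < 1 or memory_mib_per_node < 1:
        raise ValueError("node count, vcores and memory must be positive")
    if not 0 <= reserved_fraction < 1:
        raise ValueError("reserved_fraction must be in [0, 1)")

    # Leave some headroom on each node, then give the rest to one worker.
    usable_mib = int(memory_mib_per_node * (1 - reserved_fraction))
    return FatWorkerPlan(
        n_workers=core_nodes,
        threads_per_worker=vcores_per_node,
        memory_per_worker_mib=usable_mib,
    )


if __name__ == "__main__":
    # Example: 4 core nodes, each with 16 vcores and 64 GiB of memory.
    print(plan_fat_workers(4, 16, 65536, reserved_fraction=0.25))
```

The resulting numbers could then be passed to whichever Dask cluster manager is in use on EMR, so the user would not need to work them out by hand for each instance type.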