spoddutur / spark-notes

https://spoddutur.github.io/spark-notes/

Heap memory miscalculation? #11

Open stochastic1 opened 4 years ago

stochastic1 commented 4 years ago

From: https://github.com/spoddutur/spark-notes/blob/master/distribution_of_executors_cores_and_memory_for_spark_application.md

> Memory per executor = 64GB/3 = 21GB
> Counting off heap overhead = 7% of 21GB = 3GB. So, actual --executor-memory = 21 - 3 = 18GB

3GB is ~14% of 21GB, not 7%. Since 7% of 21GB is only ~1.5GB, either we should be setting 19GB or 20GB as executor-memory, or I'm misunderstanding the 7% of heap space. This is a good guide and I'm actively using it to optimize Spark settings in Dataproc; please clarify this one point, if you don't mind.
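For what it's worth, here is a quick sketch of the arithmetic in question. Note that Spark's documented default for `spark.executor.memoryOverhead` is `max(384 MiB, 10% of executor memory)`, not 7%; the helper below is just an illustration, and `executor_memory_split` is a made-up name, not a Spark API.

```python
# Sketch of the overhead split (assumption: overhead = max(384 MiB, factor * total),
# matching Spark's documented default formula with factor = 0.10).
def executor_memory_split(per_executor_gb, overhead_factor=0.10):
    """Split per-executor memory into heap (--executor-memory) and overhead."""
    overhead_gb = max(384 / 1024, overhead_factor * per_executor_gb)
    heap_gb = per_executor_gb - overhead_gb
    return heap_gb, overhead_gb

# 64 GB node, 3 executors -> 21 GB each
heap, overhead = executor_memory_split(21)
print(round(overhead, 2))  # 2.1 GB overhead at Spark's default 10%
print(round(heap, 2))      # 18.9 GB left for the heap

# At the guide's stated 7%, the overhead would be ~1.47 GB, not 3 GB:
heap7, overhead7 = executor_memory_split(21, overhead_factor=0.07)
print(round(overhead7, 2))  # 1.47
```

Under either factor, subtracting 3GB looks closer to a ~14% reservation than 7%, which is the inconsistency being asked about.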

datasherlock commented 1 year ago

+1