apache / datafusion-comet

Apache DataFusion Comet Spark Accelerator
https://datafusion.apache.org/comet
Apache License 2.0
823 stars, 163 forks

docs: Update tuning guide #995

Closed andygrove closed 1 month ago

andygrove commented 1 month ago

Which issue does this PR close?

Part of https://github.com/apache/datafusion-comet/issues/949

Rationale for this change

Provide better documentation for tuning memory usage.

Rendered version of this PR

What changes are included in this PR?

How are these changes tested?

andygrove commented 1 month ago

@Kontinuation Could you review?

andygrove commented 1 month ago

Thanks for the reviews @Kontinuation @comphead @viirya.

sunchao commented 1 month ago

I don't remember any issue related to off-heap memory mode itself, just that all the memory-related configurations need to be carefully tuned. For instance, we may still need to reserve some JVM memory for certain operations (like broadcast?).

One thing I was trying to do is hide all these configuration changes behind the Comet driver plugin, so that when a user enables Comet, the existing job configurations, including spark.executor.memory, spark.executor.memoryOverhead, etc., would be converted to off-heap memory settings transparently. This would require some Spark-side changes, such as https://issues.apache.org/jira/browse/SPARK-46947. There is one more change I made internally that uses Java reflection to overwrite certain configs in the Spark memory manager, because of early initialization in Spark.
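To make the idea of converting existing settings concrete, here is a minimal sketch in Python of what such a translation might look like. It is not the plugin's actual logic: the helper names `parse_bytes` and `to_off_heap`, the `jvm_fraction` parameter, and the rule of splitting spark.executor.memory between JVM heap and off-heap are all assumptions made for illustration. Only the config keys spark.executor.memory, spark.memory.offHeap.enabled, and spark.memory.offHeap.size are real Spark settings.

```python
import re

# Binary multipliers for the size suffixes handled by this simplified parser.
_UNITS = {"k": 1024, "m": 1024**2, "g": 1024**3, "t": 1024**4}


def parse_bytes(value: str) -> int:
    """Parse a size string such as '16g' or '512m' into bytes (simplified)."""
    match = re.fullmatch(r"(\d+)([kmgt])b?", value.strip().lower())
    if not match:
        raise ValueError(f"unsupported size string: {value!r}")
    return int(match.group(1)) * _UNITS[match.group(2)]


def to_off_heap(conf: dict, jvm_fraction: float = 0.3) -> dict:
    """Hypothetical translation: keep `jvm_fraction` of the executor memory
    on the JVM heap (for example, for broadcast) and move the rest off-heap.
    The split rule is an assumption, not what Comet or Spark actually does."""
    heap = parse_bytes(conf["spark.executor.memory"])
    jvm_bytes = int(heap * jvm_fraction)
    off_heap_bytes = heap - jvm_bytes

    out = dict(conf)  # leave the caller's config untouched
    out["spark.executor.memory"] = f"{jvm_bytes // 1024**2}m"
    out["spark.memory.offHeap.enabled"] = "true"
    out["spark.memory.offHeap.size"] = f"{off_heap_bytes // 1024**2}m"
    return out


if __name__ == "__main__":
    print(to_off_heap({"spark.executor.memory": "16g"}, jvm_fraction=0.25))
```

The sketch leaves spark.executor.memoryOverhead and any other keys unchanged. A real implementation would also have to apply the result before Spark's memory manager is initialized, which is the part that needs the Spark-side changes mentioned above.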