NVIDIA / spark-rapids-tools

User tools for Spark RAPIDS
Apache License 2.0

user-tools should add xms argument to java cmd #1391

Closed amahussein closed 1 month ago

amahussein commented 1 month ago

Signed-off-by: Ahmed Hussein ahussein@nvidia.com

Fixes #1382

Investigation showed that the minimum heap size can significantly affect the runtime (see the linked issue for details of the performance impact). This change sets the -Xms Java argument to 50% of the max heap size.
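As a rough illustration of the 50% rule described above, the heap flags could be derived as in the sketch below. This is not the code from this PR: the function name `build_jvm_heap_args`, the `min_heap_fraction` parameter, the 1 GB floor, and the jar/class names in the command are all hypothetical and chosen only for the example.

```python
def build_jvm_heap_args(max_heap_gb: int, min_heap_fraction: float = 0.5) -> list:
    """Return -Xms/-Xmx flags, with the initial heap set to a fraction of the max.

    Hypothetical helper for illustration; not the actual user-tools implementation.
    """
    if max_heap_gb <= 0:
        raise ValueError("max_heap_gb must be positive")
    # Assumption: keep at least 1 GB so the flag is never "-Xms0g".
    min_heap_gb = max(1, int(max_heap_gb * min_heap_fraction))
    return [f"-Xms{min_heap_gb}g", f"-Xmx{max_heap_gb}g"]


# Hypothetical command assembly; the jar name and main class are placeholders.
java_cmd = ["java", *build_jvm_heap_args(100), "-cp", "tools.jar", "example.Main"]
```

With a 100 GB max heap this yields -Xms50g and -Xmx100g, which matches the heap.min and heap.max values in the sample runtime.properties file in this description.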

Sample runtime.properties file. Lines generated by the changes in this PR are marked with >

#RAPIDS Accelerator for Apache Spark's Build/Runtime Information
#Wed Oct 23 20:26:21 UTC 2024
build.scala.version=2.12.15
build.hadoop.version=3.3.6
build.spark.version=3.5.0
> runtime.os.version=6.8.0-39-generic
> runtime.jvm.version=1.8.0_422
build.version=24.08.3-SNAPSHOT
runtime.spark.version=3.4.2
> runtime.jvm.arg.heap.min=50g
> runtime.jvm.name=OpenJDK 64-Bit Server VM
> runtime.jvm.arg.gc.UseG1GC=
> runtime.os.name=Linux
> runtime.jvm.arg.heap.max=100g
build.java.version=1.8.0_422
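One way to sanity-check a generated file like the sample above is to read the two heap entries back and confirm that the minimum is half of the maximum. The sketch below is a minimal, assumed approach: `parse_properties` is a hypothetical helper rather than part of the tool, it handles only simple key=value lines, and it assumes the heap values always carry a "g" suffix as in the sample. The > markers in the sample are annotations for this description and are not part of the file, so they are omitted here.

```python
def parse_properties(text: str) -> dict:
    """Parse simple key=value lines, skipping blanks and '#' comments.

    Hypothetical helper; covers only the subset of the format used in the sample.
    """
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first '=' only; an empty value (e.g. a bare GC flag) is allowed.
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


sample = """\
#RAPIDS Accelerator for Apache Spark's Build/Runtime Information
runtime.jvm.arg.heap.min=50g
runtime.jvm.arg.gc.UseG1GC=
runtime.jvm.arg.heap.max=100g
"""

props = parse_properties(sample)
# Assumption: sizes are expressed in whole gigabytes with a trailing 'g'.
heap_min_gb = int(props["runtime.jvm.arg.heap.min"].rstrip("g"))
heap_max_gb = int(props["runtime.jvm.arg.heap.max"].rstrip("g"))
min_is_half_of_max = heap_min_gb * 2 == heap_max_gb
```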

Details

This pull request updates the handling of JVM heap arguments in the Spark RAPIDS tool.

Enhancements to RuntimeUtil:

Updates to JVM heap arguments handling:

Related and followups:

nartal1 commented 1 month ago

Thanks @amahussein for investigating and putting up the fix for this. Just a nit.

amahussein commented 1 month ago

> Thanks @amahussein for investigating and putting up the fix for this. Just a nit.

Thanks @nartal1 ! Removed the debugging code.