Thanks @amahussein for investigating and putting up the fix for this. Just a nit.
Thanks @nartal1 ! Removed the debugging code.
Signed-off-by: Ahmed Hussein <ahussein@nvidia.com>
Fixes #1382
Investigation showed that the minimum heap size can significantly affect the runtime (see the linked issue for details of the performance impact). This change sets the `-Xms` Java argument to 50% of the maximum heap size.
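For illustration only, the 50% rule can be sketched as below. This is not the code from this PR: `heap_args`, its parameters, and the 1 GB floor are hypothetical choices made for the example.

```python
from typing import List


def heap_args(max_heap_gb: int, min_ratio: float = 0.5) -> List[str]:
    """Hypothetical helper: build JVM heap flags with -Xms as a fraction of -Xmx."""
    if max_heap_gb < 1:
        raise ValueError("max_heap_gb must be at least 1")
    # Assumption for this sketch: round down, but never go below 1 GB.
    min_heap_gb = max(1, int(max_heap_gb * min_ratio))
    return [f"-Xms{min_heap_gb}g", f"-Xmx{max_heap_gb}g"]


if __name__ == "__main__":
    # Flags that would be placed before -jar/-cp on the java command line.
    print(heap_args(32))
```

Passing both flags makes the JVM start with half of the maximum heap already reserved, so it does not have to grow the heap from a small default.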
Properties added to the runtime.properties file:
- `runtime.jvm.*`
- `runtime.jvm.arg*`

Sample runtime.properties file. The lines that are generated by the changes in this PR are marked with `>`.
This pull request updates the handling of JVM heap arguments in the Spark RAPIDS tool, setting `Xms` for the jar CLI. It also enhances the `RuntimeUtil` class by adding JVM and OS information to the runtime properties, extracting JVM heap arguments, and ensuring these arguments are correctly set in the user tools.

Enhancements to `RuntimeUtil`:
- `core/src/main/scala/org/apache/spark/sql/rapids/tool/util/RuntimeUtil.scala`: Added imports for `ManagementFactory` and Scala collection conversions.
- `core/src/main/scala/org/apache/spark/sql/rapids/tool/util/RuntimeUtil.scala`: Added methods to include JVM and OS information in the runtime properties and to extract JVM heap arguments.

Updates to JVM heap arguments handling:
- `user_tools/src/spark_rapids_pytools/rapids/rapids_tool.py`: Updated the `_re_evaluate_platform_args` method to set both minimum and maximum heap size arguments for the JVM.
- `user_tools/src/spark_rapids_tools/utils/util.py`: Added logic to calculate and include the minimum heap size in the tool resources.

Related and followups: