NVIDIA / spark-rapids-tools

User tools for Spark RAPIDS
Apache License 2.0

Update the Qual tool AutoTuner Heuristics against CPU event logs #1069

Closed tgravescs closed 4 weeks ago

tgravescs commented 1 month ago

fixes https://github.com/NVIDIA/spark-rapids-tools/issues/1068

This enhances the heuristics around spark.executor.memory and handles cases where the memory-to-core ratio is too small. If the ratio is too small, the tool throws an exception and does not emit tunings. In the future we should just tag this and recommend node sizes instead; a rough sketch of the check is below.
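A minimal sketch of that ratio guard, assuming a hypothetical threshold and exception type (the AutoTuner's actual constant and error handling may differ):

object MemoryPerCoreCheck {
  // Illustrative minimum; not the AutoTuner's actual constant.
  val MinMemoryPerCoreMB: Long = 1024L

  def checkRatio(workerMemoryMB: Long, numCores: Int): Unit = {
    val memPerCoreMB = workerMemoryMB / numCores
    if (memPerCoreMB < MinMemoryPerCoreMB) {
      // The PR throws and skips emitting tunings in this case; a future
      // change may instead tag the app and recommend node sizes.
      throw new IllegalArgumentException(
        s"memory per core (${memPerCoreMB}MB) is too small to generate tunings")
    }
  }
}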

This also adds extra overhead, since in the worst case we need space for both pinned memory and spill memory. It gets a little complicated because spill will use pinned memory first, but once the pinned pool is exhausted it falls back to regular off-heap memory. So here we budget for the worst case, which is needing both.
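In pseudocode, that worst-case budgeting amounts to something like the following; the flat sum and the parameter names are my simplification, not the AutoTuner's exact formula:

// Worst-case off-heap budget: spill uses the pinned pool first, but once
// it is exhausted spill falls back to regular off-heap, so we reserve
// room for both. Parameter names and the flat sum are illustrative.
def worstCaseOverheadMB(baseOverheadMB: Long,
    pinnedPoolMB: Long,
    spillStorageMB: Long): Long =
  baseOverheadMB + pinnedPoolMB + spillStorageMB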

I also added heuristics for configuring the multithreaded readers (number of threads and some sizes) and the shuffle reader/writer thread pools based on the number of cores; a rough sketch follows.
Most of the heuristics are based on what we saw from real customer workloads and NDS results.
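For illustration, the cores-based pool sizing could look like this; the 2.5x multiplier is an assumption I chose so that the 8 cores in the example below map to the 20 threads shown in the output, not a value taken from the code:

// Sizes the multithreaded reader and shuffle reader/writer pools from
// the executor core count. The multiplier is an assumed placeholder.
def recommendedThreads(executorCores: Int, multiplier: Double = 2.5): Int =
  math.ceil(executorCores * multiplier).toInt

// recommendedThreads(8) == 20, matching the sample output below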

Most of this testing was on CSPs; I will try to apply more of it to on-prem clusters later.

Note that most of this functionality needs the worker information passed in via --worker-info ./worker_info-demo-gpu-cluster.yaml.
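An invocation would look something like the line below; the qualification subcommand and the event-log path are placeholders, and only the --worker-info flag comes from this PR:

spark_rapids qualification --eventlogs /path/to/cpu_eventlogs --worker-info ./worker_info-demo-gpu-cluster.yaml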

Example worker info file:

system:
  numCores: 8 
  memory: 15360MiB
  numWorkers: 4
softwareProperties:
  spark.scheduler.mode: FAIR

With the worker info:

Spark Properties:
--conf spark.executor.cores=8
--conf spark.executor.instances=4
--conf spark.executor.memory=8192m
--conf spark.rapids.filecache.enabled=true
--conf spark.rapids.memory.pinnedPool.size=3584m
--conf spark.rapids.shuffle.multiThreaded.reader.threads=20
--conf spark.rapids.shuffle.multiThreaded.writer.threads=20
--conf spark.rapids.sql.batchSizeBytes=2147483647
--conf spark.rapids.sql.concurrentGpuTasks=3
--conf spark.rapids.sql.multiThreadedRead.numThreads=20
--conf spark.shuffle.manager=com.nvidia.spark.rapids.spark321db.RapidsShuffleManager
--conf spark.sql.adaptive.coalescePartitions.minPartitionSize=4m
--conf spark.sql.adaptive.coalescePartitions.parallelismFirst=false
--conf spark.sql.shuffle.partitions=200
--conf spark.task.resource.gpu.amount=0.125
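A few relationships can be read directly off this output; the sketch below just restates that arithmetic from the worker info above and is not the AutoTuner's internal formula:

// Relationships visible in the sample output; the AutoTuner's internal
// formulas are more involved than this.
val numCores = 8            // from the worker info
val numWorkers = 4
val workerMemoryMB = 15360L

val executorInstances = numWorkers   // spark.executor.instances=4
val taskGpuAmount = 1.0 / numCores   // spark.task.resource.gpu.amount=0.125
val executorMemoryMB = 8192L         // spark.executor.memory and the
val pinnedPoolMB = 3584L             // pinned pool both fit under worker memory
assert(executorMemoryMB + pinnedPoolMB < workerMemoryMB)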

Without the worker info:

Spark Properties:
--conf spark.rapids.filecache.enabled=true
--conf spark.shuffle.manager=com.nvidia.spark.rapids.spark321db.RapidsShuffleManager
--conf spark.sql.shuffle.partitions=200
amahussein commented 4 weeks ago

--worker-info ./worker_info-demo-gpu-cluster.yaml

Should we add that file to the repo? Perhaps inside tests/resources?

tgravescs commented 4 weeks ago

Should we add that file to the repo? Perhaps inside tests/resources?

Sure, I can add it.

I also realized I wanted to add a few more tests to the Suite, so I'll do that and push some updates shortly.