NVIDIA / spark-rapids-tools

User tools for Spark RAPIDS
Apache License 2.0

[FEA] Autotuner should recommend tuned settings (batchSizeBytes, concurrentGpuTasks) for OOM conditions #249

Open mattahrens opened 1 year ago

mattahrens commented 1 year ago

I would like the profiler tool's recommendations to account for OOM scenarios. When the event log contains stats from OOM retry metrics or OOM-failed tasks, the Autotuner should use them to adjust Spark RAPIDS settings (such as batchSizeBytes and concurrentGpuTasks) to reduce OOMs and improve performance.
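
To make the request concrete, here is a minimal sketch of one possible heuristic. It is not the Autotuner's actual logic. The `OomStats` type, the `recommend_for_oom` function, the 128 MiB floor, and the "lower concurrency first, then halve the batch size" policy are all assumptions for illustration. Only the two config keys, `spark.rapids.sql.batchSizeBytes` and `spark.rapids.sql.concurrentGpuTasks`, are real Spark RAPIDS settings.

```python
from dataclasses import dataclass

# Real Spark RAPIDS config keys named in this issue's title.
BATCH_SIZE_KEY = "spark.rapids.sql.batchSizeBytes"
CONCURRENT_TASKS_KEY = "spark.rapids.sql.concurrentGpuTasks"

# Hypothetical lower bound so the batch size is never tuned down to nothing.
MIN_BATCH_SIZE_BYTES = 128 * 1024 * 1024


@dataclass
class OomStats:
    """Hypothetical summary of OOM signals extracted from a GPU event log."""
    oom_retry_count: int   # GPU OOM retries observed across tasks
    oom_failed_tasks: int  # tasks that failed outright with a GPU OOM


def recommend_for_oom(stats, batch_size_bytes, concurrent_gpu_tasks):
    """Return suggested settings as a dict; empty when there is no OOM signal
    or nothing is left to tune down."""
    if stats.oom_retry_count == 0 and stats.oom_failed_tasks == 0:
        return {}

    recs = {}
    # Step 1 (assumed policy): reduce how many tasks share the GPU at once.
    if concurrent_gpu_tasks > 1:
        recs[CONCURRENT_TASKS_KEY] = concurrent_gpu_tasks - 1

    # Step 2 (assumed policy): if tasks failed outright, or concurrency
    # cannot go any lower, also halve the target batch size.
    if stats.oom_failed_tasks > 0 or concurrent_gpu_tasks <= 1:
        new_size = max(batch_size_bytes // 2, MIN_BATCH_SIZE_BYTES)
        if new_size < batch_size_bytes:
            recs[BATCH_SIZE_KEY] = new_size

    return recs
```

The current values are passed in rather than hard-coded because the defaults for both settings have changed between Spark RAPIDS releases. A real implementation would also need to weigh the performance cost of lower concurrency against the cost of OOM retries.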

tgravescs commented 1 month ago

Is this about GPU OOMs? That would imply the event log here came from a GPU run.

mattahrens commented 1 month ago

Yes, this was originally about GPU event logs, from when the Autotuner was only run with the profiler on GPU runs.