NVIDIA / spark-rapids

Spark RAPIDS plugin - accelerate Apache Spark with GPUs
https://nvidia.github.io/spark-rapids
Apache License 2.0

Support profiling for specific stages on a limited number of tasks #11708

Closed thirtiseven closed 2 days ago

thirtiseven commented 2 weeks ago

Closes https://github.com/NVIDIA/spark-rapids/issues/11666

In a customer use case, we found that the profiling files were sometimes very large even with the stage or time range limit configs enabled on the job.

This PR adds support for profiling only a limited number of tasks in specific stages.
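To illustrate the idea, here is a minimal sketch of per-stage task limiting. It is not the plugin's actual code: `StageTaskLimiter`, `should_profile`, and the parameter names are hypothetical, and it assumes that each task asks once, at start, whether it should be profiled.

```python
import threading
from collections import defaultdict


class StageTaskLimiter:
    """Hypothetical helper: admit at most `task_limit` tasks per profiled stage."""

    def __init__(self, profiled_stages, task_limit):
        self._stages = set(profiled_stages)   # stage IDs selected for profiling
        self._limit = task_limit              # max profiled tasks per stage
        self._admitted = defaultdict(int)     # stage ID -> tasks admitted so far
        self._lock = threading.Lock()         # tasks may start concurrently

    def should_profile(self, stage_id):
        # Stages that were not selected are never profiled.
        if stage_id not in self._stages:
            return False
        with self._lock:
            # Once the per-stage budget is used up, skip further tasks.
            if self._admitted[stage_id] >= self._limit:
                return False
            self._admitted[stage_id] += 1
            return True


# Usage: stage 3 is selected with a limit of 5; eight tasks start.
limiter = StageTaskLimiter(profiled_stages=[3], task_limit=5)
decisions = [limiter.should_profile(3) for _ in range(8)]
profiled_count = sum(decisions)
```

With a limit of 5, only the first five tasks of the selected stage are admitted, so the amount of profile data per stage stays bounded regardless of how many tasks the stage runs.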

Before this PR:

Screenshot 2024-11-07 at 15 11 24

After this PR, with task limit = 5:

Screenshot 2024-11-15 at 21 36 06

The nsys-rep file size dropped from 10.4 MB to 474 KB, making it about 22x smaller.

thirtiseven commented 2 weeks ago

build

thirtiseven commented 2 days ago

build