NVIDIA / spark-rapids

Spark RAPIDS plugin - accelerate Apache Spark with GPUs
https://nvidia.github.io/spark-rapids
Apache License 2.0

Investigate long GPU semaphore wait time #11188

Open jihoonson opened 1 month ago

jihoonson commented 1 month ago

[Screenshot: nsys trace, captured 2024-07-15 at 10:54:36 AM]

The screenshot above shows an nsys trace of a task during NDS-H Q21 execution. As seen, this task was blocked on the GPU semaphore for most of its execution time. The NVTX Push/Pop Range Summary shows that this pattern holds not only for this particular task, but for all tasks in the same time range on this node. Here are the settings I used for my testing.

With these settings, up to 16 tasks were trying to acquire the semaphore at the same time. Given that the semaphore size was set to 4, the wait time in the above screenshot seems suspiciously long. It seems worth looking into this to see whether we can avoid it or work around it.

revans2 commented 1 month ago

It might be a little long, but it is not an order of magnitude off. The hash join gather map range, a little below the Acquire GPU range in the table, is 2.1% of the time (96.5 s), while Acquire GPU is 11.8% (530.5 s). We have a ratio of 1 task on the GPU to 4 tasks total, so I would expect the wait time for the GPU to be about 3x to 4x the computation time on the GPU. That gives us 133 to 177 seconds of computation on the GPU, which means the join gather alone accounts for roughly 50% to 70% (back-of-the-envelope calculation) of what we expect. I agree that looking into it would be good, but I don't see any really large red flags in the screenshot.
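For anyone who wants to redo the back-of-the-envelope arithmetic, here is a minimal Python sketch. The two timings are the ones quoted from the NVTX summary; the helper name `implied_gpu_seconds` and the assumption that wait time is a fixed 3x or 4x multiple of GPU compute time are illustrative only, not something measured in the trace.

```python
def implied_gpu_seconds(wait_s: float, wait_to_compute_ratio: float) -> float:
    """Estimate GPU compute time, assuming wait time = ratio * compute time."""
    return wait_s / wait_to_compute_ratio


# Timings quoted from the NVTX Push/Pop Range Summary in this thread.
acquire_gpu_s = 530.5   # total time spent in the "Acquire GPU" range
join_gather_s = 96.5    # total time spent in the hash join gather map range

# Assumed wait-to-compute ratios (3x to 4x), per the estimate above.
for ratio in (3, 4):
    gpu_s = implied_gpu_seconds(acquire_gpu_s, ratio)
    share = join_gather_s / gpu_s
    print(f"ratio {ratio}x: ~{gpu_s:.0f} s of GPU compute, "
          f"join gather is ~{share:.0%} of that")
```

Under these assumptions the implied GPU compute time comes out between about 133 s and 177 s, and the join gather share lands a little above half at 3x and a little above 70% at 4x, which is in line with the rough 50% to 70% range.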

jihoonson commented 1 month ago

@revans2 thanks for the input! Your analysis sounds reasonable. There is one thing I did not describe above, though: per the nsys trace, it seems like we are holding the semaphore longer than we need to while iterating over inputs. I did not mention it originally because I need to gather more information about it. Perhaps this is not a big deal, as you described, but I would like to double-check it. I will post my findings here. Thanks!