NVIDIA / spark-rapids

Spark RAPIDS plugin - accelerate Apache Spark with GPUs
https://nvidia.github.io/spark-rapids
Apache License 2.0

Registering and unregistering host buffers with the spill framework is costly #11322

Open jlowe opened 1 month ago

jlowe commented 1 month ago

While working on #11234 I noticed a significant amount of time being spent in HostHostJoinSizer.setupForJoin. It takes the batches already fetched from shuffle, which are collected as spillable buffers in a queue, and iterates through them to prepare for concatenation. That iteration makes the data unspillable and unregisters the buffers from the spill framework. For 200 buffers, this was taking many tens of milliseconds. When I hacked SpillableHostConcatResult to remove the use of SpillableHostBuffer, the time disappeared.

abellina commented 1 month ago

I believe, though it is yet to be confirmed, that the TableMeta operations we are doing are very time consuming. We have talked about removing them in https://github.com/NVIDIA/spark-rapids/issues/7668, so if it's confirmed that this is the issue, that is more motivation to get it done. That issue, though, talked about pushing it to the Spillable objects, which means we would still need to do it.

mattahrens commented 1 month ago

Initial scope is a short-term spike: profile to confirm the root cause and identify a path for optimization.
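
For the profiling spike, a rough way to isolate the cost would be to time the same pass over the buffers with and without the register/unregister step. The sketch below is only a toy model in Python, not spark-rapids code: `ToySpillRegistry`, `time_ms`, and `compare` are hypothetical names, and the assumption that registration builds per-buffer metadata under a global lock is a guess at the shape of the overhead, not a confirmed root cause. The absolute numbers it produces say nothing about the JVM; it only shows the shape of the comparison.

```python
import threading
import time


class ToySpillRegistry:
    """Hypothetical stand-in for a spill framework's buffer catalog."""

    def __init__(self):
        self._lock = threading.Lock()
        self._buffers = {}
        self._next_id = 0

    def register(self, payload):
        # Assumption: registering builds per-buffer metadata and takes a global lock.
        meta = {"size": len(payload), "spillable": True}
        with self._lock:
            buf_id = self._next_id
            self._next_id += 1
            self._buffers[buf_id] = (payload, meta)
        return buf_id

    def unregister(self, buf_id):
        # Removing the entry makes the payload "unspillable" again in this toy model.
        with self._lock:
            payload, _meta = self._buffers.pop(buf_id)
        return payload

    def registered_count(self):
        with self._lock:
            return len(self._buffers)


def time_ms(fn):
    """Run fn once and return (result, elapsed milliseconds)."""
    start = time.perf_counter()
    result = fn()
    return result, (time.perf_counter() - start) * 1000.0


def compare(num_buffers=200, buffer_size=1024):
    """Time one pass over the buffers with and without registry bookkeeping."""
    payloads = [bytes(buffer_size) for _ in range(num_buffers)]
    registry = ToySpillRegistry()

    def with_registry():
        ids = [registry.register(p) for p in payloads]
        return [registry.unregister(i) for i in ids]

    def without_registry():
        return list(payloads)

    tracked, tracked_ms = time_ms(with_registry)
    plain, plain_ms = time_ms(without_registry)
    return {
        "tracked_ms": tracked_ms,
        "plain_ms": plain_ms,
        "same_data": tracked == plain,
        "left_registered": registry.registered_count(),
    }


if __name__ == "__main__":
    print(compare())
```

If the tracked path dominates in a real profile of the plugin, the next step would be to break it down further (metadata construction versus catalog locking) before deciding where to optimize.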