NVIDIA / spark-rapids

Spark RAPIDS plugin - accelerate Apache Spark with GPUs
https://nvidia.github.io/spark-rapids
Apache License 2.0
822 stars, 235 forks

[FEA] Record GPU time and Fetch time separately, instead of recording Total Time #2334

Closed (andygrove closed this issue 3 years ago)

andygrove commented 3 years ago

Is your feature request related to a problem? Please describe.

The totalTime metric can be confusing because it often (but not always) includes the execution time of the child operator(s). It would be better to record fetchTime and gpuOpTime separately, although this may be difficult or impossible in some cases.

Making this change will also provide better metrics that we can use when profiling queries or developing a better cost model for CBO.
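To illustrate the distinction, here is a minimal sketch of an operator wrapper that accounts for time spent pulling batches from its child separately from time spent on its own work. It is written in Python for brevity and is not the plugin's implementation: `OpMetrics`, `timed_map`, and the `clock` parameter are hypothetical names, and `op` stands in for whatever GPU work a real operator would do. Only the metric names fetchTime and gpuOpTime come from this issue.

```python
import time
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator, TypeVar

T = TypeVar("T")
U = TypeVar("U")


@dataclass
class OpMetrics:
    """Hypothetical per-operator metrics, in nanoseconds."""
    fetch_time_ns: int = 0   # time spent waiting on the child operator
    gpu_op_time_ns: int = 0  # time spent in this operator's own work

    @property
    def total_time_ns(self) -> int:
        # What a single combined metric would report when the child's
        # execution time is folded into the parent's.
        return self.fetch_time_ns + self.gpu_op_time_ns


def timed_map(
    child: Iterable[T],
    op: Callable[[T], U],
    metrics: OpMetrics,
    clock: Callable[[], int] = time.perf_counter_ns,
) -> Iterator[U]:
    """Apply `op` to each batch from `child`, timing fetch and op separately."""
    it = iter(child)
    while True:
        start = clock()
        try:
            batch = next(it)  # runs the child operator(s)
        except StopIteration:
            # The final, empty fetch still counts as fetch time.
            metrics.fetch_time_ns += clock() - start
            return
        fetched = clock()
        metrics.fetch_time_ns += fetched - start

        result = op(batch)  # this operator's own work
        metrics.gpu_op_time_ns += clock() - fetched

        # Time the consumer spends between `yield` and the next resume
        # is not attributed to either metric.
        yield result
```

Because operators are chained, a parent's fetch time covers the full elapsed time of everything below it, so only the op-time values can be summed across a plan without double counting.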

Describe the solution you'd like

Review the following operators and determine whether we can improve the metrics to record the GPU time and fetch time separately. File a new issue for each operator we plan to change.

Describe alternatives you've considered

None

Additional context

None

andygrove commented 3 years ago

@nartal1 fyi - this may have an impact on the profiling tools