NVIDIA / spark-rapids-tools

User tools for Spark RAPIDS
Apache License 2.0
54 stars 37 forks source link

[BUG] Profiler output shows inconsistent shuffleWriteTime results #1408

Open cindyyuanjiang opened 2 weeks ago

cindyyuanjiang commented 2 weeks ago

Describe the bug Profiler output shows inconsistent shuffle write write time results in profiler.log.

Under Stage Level All Metrics, the total is 93 ms.

+--------+-------+-------------+---------------------------------------------------+---------+---------+---------+----------+
|appIndex|stageId|accumulatorId|name                                               |min      |median   |max      |total     |
+--------+-------+-------------+---------------------------------------------------+---------+---------+---------+----------+
|1       |2      |201          |internal.metrics.shuffle.write.writeTime           |139875   |230038   |9747416  |93193331  |

Under Stage level aggregated task metrics, the total is 42ms.

+--------+-------+--------+--------+--------------------+------------+------------+------------+------------+-------------------+------------------------------+---------------------------+-------------------+-------------------+---------------------+-------------+----------------------+-----------------------+-------------------------+-----------------------+---------------------------+--------------+--------------------+-------------------------+---------------------+--------------------------+----------------------+----------------------------+---------------------+-------------------+---------------------+----------------+
|appIndex|stageId|numTasks|Duration|diskBytesSpilled_sum|duration_sum|duration_max|duration_min|duration_avg|executorCPUTime_sum|executorDeserializeCPUTime_sum|executorDeserializeTime_sum|executorRunTime_sum|input_bytesRead_sum|input_recordsRead_sum|jvmGCTime_sum|memoryBytesSpilled_sum|output_bytesWritten_sum|output_recordsWritten_sum|peakExecutionMemory_max|resultSerializationTime_sum|resultSize_max|sr_fetchWaitTime_sum|sr_localBlocksFetched_sum|sr_localBytesRead_sum|sr_remoteBlocksFetched_sum|sr_remoteBytesRead_sum|sr_remoteBytesReadToDisk_sum|sr_totalBytesRead_sum|sw_bytesWritten_sum|sw_recordsWritten_sum|sw_writeTime_sum|
+--------+-------+--------+--------+--------------------+------------+------------+------------+------------+-------------------+------------------------------+---------------------------+-------------------+-------------------+---------------------+-------------+----------------------+-----------------------+-------------------------+-----------------------+---------------------------+--------------+--------------------+-------------------------+---------------------+--------------------------+----------------------+----------------------------+---------------------+-------------------+---------------------+----------------+
|1       |2      |200     |688     |0                   |7705        |237         |10          |38.5        |1556               |940                           |1474                       |5337               |0                  |0                    |0            |0                     |0                      |0                        |0                      |0                          |7359          |0                   |2400                     |80264508             |0                         |0                     |0                           |80264508             |15400              |200                  |42              |

We suspect this is due to precision lost from converting nanoseconds to milliseconds. For Stage Level All Metrics, the results are in nanoseconds, and for Stage level aggregated task metrics, the results are in milliseconds. Most tasks in this stage has < 1,000,000 nanoseconds = 1 ms shuffle write time, which will rounds to 0 ms.

Steps/Code to reproduce bug spark_rapids profiling -v -e <my_tools_repo>/core/src/test/resources/spark-events-profiling/rapids_join_eventlog.zstd

cindyyuanjiang commented 1 week ago

Found conversion from nanoseconds to ms in TaskModel construction: https://github.com/NVIDIA/spark-rapids-tools/blob/019ede2fdc87f109f895ce67161c506dd377d80a/core/src/main/scala/org/apache/spark/sql/rapids/tool/store/TaskModel.scala#L112