NVIDIA / spark-rapids-tools

User tools for Spark RAPIDS
Apache License 2.0

[FEA] Qualification tool: Add operators stats output csv file #1157

Open nartal1 opened 1 week ago

nartal1 commented 1 week ago

Is your feature request related to a problem? Please describe.

Currently we have rapids_4_spark_qualification_output_unsupportedOperators.csv, which has details about unsupported operators, and rapids_4_spark_qualification_output_execs.csv, which has details about the execs. It would be good to add another CSV file with additional stats on the operators. This would help determine, for example, how frequently each operator appears in an application and the task durations associated with each operator.

Sample output format: operator_stats.csv

AppId, SQLID, Operator_Name, Count, Total Task Exec Duration(Seconds), Supported(Boolean)

where Total Task Exec Duration is the sum of the execution durations of the tasks that the operator is part of.
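
To make the proposed columns concrete, here is a rough Python sketch of how the rows could be aggregated. It is only an illustration, not the tool's implementation: `OperatorOccurrence`, `operator_stats_rows`, and `to_csv` are hypothetical names, the input shape is assumed, and treating an operator as supported only when every occurrence is supported is an assumption as well.

```python
import csv
import io
from collections import defaultdict
from typing import Iterable, List, NamedTuple


class OperatorOccurrence(NamedTuple):
    """One appearance of an operator in a SQL plan (hypothetical input shape)."""
    app_id: str
    sql_id: int
    operator_name: str
    task_exec_duration_s: float  # duration of the tasks this occurrence is part of
    supported: bool


# Column names taken from the sample output format in this issue.
HEADER = ["AppId", "SQLID", "Operator_Name", "Count",
          "Total Task Exec Duration(Seconds)", "Supported(Boolean)"]


def operator_stats_rows(occurrences: Iterable[OperatorOccurrence]) -> List[list]:
    """Group occurrences by (AppId, SQLID, Operator_Name) and aggregate them."""
    counts = defaultdict(int)
    durations = defaultdict(float)
    supported = {}
    for occ in occurrences:
        key = (occ.app_id, occ.sql_id, occ.operator_name)
        counts[key] += 1
        durations[key] += occ.task_exec_duration_s
        # Assumption: supported only if every occurrence is supported.
        supported[key] = supported.get(key, True) and occ.supported
    return [[*key, counts[key], durations[key], supported[key]]
            for key in sorted(counts)]


def to_csv(rows: List[list]) -> str:
    """Render the header and the aggregated rows as CSV text."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(HEADER)
    writer.writerows(rows)
    return buf.getvalue()
```

For example, two `Filter` occurrences in the same SQL ID with durations of 12 and 8 seconds would collapse into a single row with a Count of 2 and a Total Task Exec Duration of 20 seconds.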