NVIDIA / spark-rapids-tools

User tools for Spark RAPIDS
Apache License 2.0

[BUG] `rapids_4_spark_qualification_output_status.csv` file does not report apps with zero SQL time #1306

Closed kuhushukla closed 1 month ago

kuhushukla commented 2 months ago

Describe the bug: Take an app that uses only RDDs and no DataFrames. The app is processed and its SQL duration shows up as zero in the verbose output, which is expected. However, the qual tool's `rapids_4_spark_qualification_output_status.csv` does not contain an entry explaining why the qualification excluded the app from the final list of qualified apps. The status file should state that the SQL duration is zero or that the APIs the app calls are unsupported.

Steps/Code to reproduce bug: Described above. The output says the app was processed successfully.

Expected behavior: Such a run should report the root cause in the status CSV output.
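To make the gap concrete, here is a minimal sketch of a cross-check between the status file and the final list of qualified apps: it lists apps that the status file marks as successful but that never appear in the final list. The column names (`App ID`, `Status`), the `SUCCESS` value, and the sample rows are assumptions for illustration only, not the tool's confirmed schema; `unexplained_apps` is a hypothetical helper, not part of spark-rapids-tools.

```python
import csv
import io

# Hypothetical sample data. Column names and values are assumptions,
# not the confirmed layout of the qualification tool's output files.
STATUS_TEXT = (
    "Event Log,Status,App ID,Description\n"
    "log1,SUCCESS,app-1,\n"
    "log2,SUCCESS,app-2,\n"
    "log3,FAILURE,app-3,parse error\n"
)
SUMMARY_TEXT = (
    "App Name,App ID\n"
    "job-a,app-1\n"
)


def unexplained_apps(status_text, summary_text, id_col="App ID", status_col="Status"):
    """Return IDs of apps marked SUCCESS in the status CSV but absent from the summary CSV."""
    # Collect every app ID that made it into the final summary.
    summary_ids = {row[id_col] for row in csv.DictReader(io.StringIO(summary_text))}

    missing = []
    for row in csv.DictReader(io.StringIO(status_text)):
        # A successful app that is not in the summary was dropped without a stated reason.
        if row[status_col] == "SUCCESS" and row[id_col] not in summary_ids:
            missing.append(row[id_col])
    return missing


if __name__ == "__main__":
    print(unexplained_apps(STATUS_TEXT, SUMMARY_TEXT))
```

For real output files, read each CSV from disk and pass its text in, adjusting `id_col` and `status_col` to whatever headers the files actually use.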

Environment details: on-prem HDFS

amahussein commented 2 months ago

I wonder if this was a symptom of writing the core output to HDFS. I am pretty sure that we had eventlogs generated by QA that do nothing, and I cannot recall them disappearing completely.

amahussein commented 1 month ago

Closing as invalid