NVIDIA / spark-rapids-tools

User tools for Spark RAPIDS
Apache License 2.0

[BUG] Qualification tool may show negative numbers in GPU estimates #1165

Open amahussein opened 3 months ago

amahussein commented 3 months ago

Describe the bug

While investigating an eventlog, I found that an incomplete eventlog produces negative values in the GPU estimates. Also, App Duration Estimated was false when it should have been set to true.

App Name,App ID,Recommendation,Estimated GPU Speedup,Estimated GPU Duration,Estimated GPU Time Saved,SQL DF Duration,SQL Dataframe Task Duration,App Duration,GPU Opportunity,Executor CPU Time Percent,SQL Ids with Failures,Unsupported Read File Formats and Types,Unsupported Write Data Format,Complex Types,Nested Complex Types,Potential Problems,Longest SQL Duration,SQL Stage Durations Sum,NONSQL Task Duration Plus Overhead,Unsupported Task Duration,Supported SQL DF Task Duration,Task Speedup Factor,App Duration Estimated
"app-name-cyz","app-id-0123456","Not Recommended",0.98,590902.95,-6876.96,349271,494279631,584026,-75646,43.61,"","","HIVEPARQUET","","","",355133,744690,234755,601332705,-107053074,1.1,false

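To make the negative fields easier to spot, here is a minimal Python sketch that parses the header and row quoted above and lists every numeric column with a value below zero. The helper name `negative_columns` is hypothetical and is not part of the qualification tool; it only inspects the CSV text pasted in this issue.

```python
import csv
import io

# Header and row copied verbatim from the qualification output in this issue.
HEADER = (
    "App Name,App ID,Recommendation,Estimated GPU Speedup,Estimated GPU Duration,"
    "Estimated GPU Time Saved,SQL DF Duration,SQL Dataframe Task Duration,App Duration,"
    "GPU Opportunity,Executor CPU Time Percent,SQL Ids with Failures,"
    "Unsupported Read File Formats and Types,Unsupported Write Data Format,Complex Types,"
    "Nested Complex Types,Potential Problems,Longest SQL Duration,SQL Stage Durations Sum,"
    "NONSQL Task Duration Plus Overhead,Unsupported Task Duration,"
    "Supported SQL DF Task Duration,Task Speedup Factor,App Duration Estimated"
)
ROW = (
    '"app-name-cyz","app-id-0123456","Not Recommended",0.98,590902.95,-6876.96,349271,'
    '494279631,584026,-75646,43.61,"","","HIVEPARQUET","","","",355133,744690,234755,'
    '601332705,-107053074,1.1,false'
)


def negative_columns(header_line, row_line):
    """Hypothetical helper: map column name -> value for every negative numeric field."""
    header = next(csv.reader(io.StringIO(header_line)))
    values = next(csv.reader(io.StringIO(row_line)))
    negatives = {}
    for name, raw in zip(header, values):
        try:
            number = float(raw)
        except ValueError:
            # Non-numeric fields (names, empty strings, booleans) are skipped.
            continue
        if number < 0:
            negatives[name] = number
    return negatives


if __name__ == "__main__":
    for column, value in negative_columns(HEADER, ROW).items():
        print(f"{column}: {value}")
```

For this row the sketch flags Estimated GPU Time Saved, GPU Opportunity, and Supported SQL DF Task Duration. Note that SQL Dataframe Task Duration minus Unsupported Task Duration (494279631 - 601332705) equals the reported Supported SQL DF Task Duration of -107053074. This suggests the unsupported duration exceeds the total task duration for this eventlog, although whether the tool derives the column this way is an assumption.
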
Another interesting observation: the same eventlog has multiple stageAttempts in sql_to_stage_information.csv.