NVIDIA / spark-rapids-tools

User tools for Spark RAPIDS
Apache License 2.0

[BUG] Qualification unsupported_ops report does not match speedup calculations #793

Closed amahussein closed 3 months ago

amahussein commented 7 months ago

Describe the bug

Since we rely on AccumulableIDs, it is known that we cannot bind execs that do not have metrics to stages. The Q tool applies some heuristics to assign execs to stages on a best-effort basis during the speedup calculations. However, this is done as an intermediate step and is not reported anywhere. For example, the latest unsupported-operators report, rapids_4_spark_qualification_output_unsupportedOperators.csv, lists all Project execs and other execs as having stageID = -1.

This affects anyone trying to verify the speedup calculations or do aggregations based on unsupported_ops per stages.

Expected behavior

The stageIDs that the heuristics assign to execs should be part of the final generated report of the execs. If there is a concern that we would be mixing facts with estimations, we can add another column stating which heuristic was used to assign an exec to a stage.
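To make the proposal concrete, here is a minimal Python sketch of what such a report could look like. Everything in it is an assumption for illustration: the `ExecRow` class, the "nearest preceding exec" heuristic, and the `Stage Assignment` column name are hypothetical and are not taken from the tool's code or its actual heuristics.

```python
from dataclasses import dataclass, field


@dataclass
class ExecRow:
    """Hypothetical stand-in for one exec in a plan (not the tool's class)."""
    name: str
    stages: set = field(default_factory=set)
    assigned_by: str = "metrics"  # how the stage IDs were obtained


def assign_missing_stages(execs):
    """Assumed heuristic: an exec without stages inherits the stages of the
    closest preceding exec that has any. The real heuristics may differ."""
    last_known = set()
    for e in execs:
        if e.stages:
            last_known = set(e.stages)
        elif last_known:
            e.stages = set(last_known)
            e.assigned_by = "nearest_neighbor"
        else:
            e.assigned_by = "unassigned"
    return execs


def to_report_rows(execs):
    """One row per (exec, stage); -1 is kept only when nothing could be assigned."""
    rows = []
    for e in execs:
        for sid in sorted(e.stages) or [-1]:
            rows.append({"Exec": e.name,
                         "Stage ID": sid,
                         "Stage Assignment": e.assigned_by})
    return rows


plan = [ExecRow("Scan parquet", {0}), ExecRow("Project"),
        ExecRow("Filter"), ExecRow("HashAggregate", {1})]
rows = to_report_rows(assign_missing_stages(plan))
for row in rows:
    print(row)
```

With the extra column, anyone aggregating unsupported ops per stage can choose to include or exclude the estimated rows.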

amahussein commented 6 months ago

I found one of the problems: getStageToExec does not update the stageSet written to the ExecInfo. That is why there is a gap between what the stageMap tells us and what is in PlanInfos.execInfo.
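A simplified Python sketch of the kind of gap described above. The names mirror the identifiers mentioned in this thread, but the classes, the `guess_stage` callback, and the `write_back` flag are hypothetical stand-ins and do not reproduce the tool's actual implementation.

```python
from dataclasses import dataclass, field


@dataclass
class ExecInfo:
    """Hypothetical stand-in; only models a name and a set of stage IDs."""
    name: str
    stages: set = field(default_factory=set)


def get_stage_to_exec(execs, guess_stage, write_back):
    """Build a stage -> exec-names map, guessing a stage for stage-less execs.

    With write_back=False the guess lives only in the returned map, so the
    exec records (and any report built from them) still show no stage.
    """
    stage_map = {}
    for e in execs:
        stages = set(e.stages) or {guess_stage(e)}
        if write_back:
            e.stages = stages  # keep the exec record in sync with the map
        for sid in stages:
            stage_map.setdefault(sid, []).append(e.name)
    return stage_map


execs = [ExecInfo("Scan", {0}), ExecInfo("Project")]

# The map knows about Project, but the exec record does not.
stage_map = get_stage_to_exec(execs, lambda e: 0, write_back=False)
gap = execs[1].stages == set() and "Project" in stage_map[0]

# Writing the guess back removes the mismatch.
get_stage_to_exec(execs, lambda e: 0, write_back=True)
synced = execs[1].stages == {0}
print(gap, synced)
```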