NVIDIA / spark-rapids-tools

User tools for Spark RAPIDS
Apache License 2.0

[BUG] Qualification tool labels app run on Databricks as Not Applicable because of failed stages but they were cancelled #1032

Closed tgravescs closed 4 months ago

tgravescs commented 4 months ago

Describe the bug

I ran an event log through the qualification tool and the app was labelled Not Applicable because it had failed stages. Those stages had not genuinely failed, though: they were cancelled by AQE (Adaptive Query Execution) runs.

We should take this into account in the qual tool.

The reasons in task show up as: Stage cancelled...
The stage failure reason shows: Job 243 cancelled

Tool output:
24/05/23 10:00:26 WARN QualificationEventProcessor: SQL execution id 47 had failures, skipping
24/05/23 10:00:26 WARN QualificationEventProcessor: SQL execution id 125 had failures, skipping
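As a rough illustration of the idea only (this is not the qualification tool's code, and the function names and regular expression are hypothetical), a check could treat a stage as cancelled when its failure reason looks like the two messages quoted above, and skip a SQL execution only when at least one failed stage has some other reason. The sketch is in Python for brevity and assumes the failure reasons are available as plain strings.

```python
import re

# Hypothetical pattern, based only on the two messages seen in this issue:
# "Stage cancelled..." and "Job 243 cancelled". Real event logs may word
# cancellations differently.
_CANCELLED = re.compile(r"\b(job \d+|stage) cancelled\b", re.IGNORECASE)


def is_cancellation(reason):
    """Return True if a failure reason string looks like a cancellation."""
    return bool(reason) and _CANCELLED.search(reason) is not None


def sql_has_real_failures(stage_failure_reasons):
    """Return True if any failed stage was something other than a cancellation.

    A missing or empty reason is counted as a real failure, to stay
    conservative.
    """
    return any(not is_cancellation(r) for r in stage_failure_reasons)
```

With a check like this, an execution whose only failed stages report "Job 243 cancelled" would no longer be skipped, while one that also has an unrelated failure reason still would be.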

amahussein commented 4 months ago

I noticed the same behavior too. I think this is an important fix: it would clean up and improve the way stages are reported in the tools.