Closed tgravescs closed 1 month ago
They should be separate. When I looked briefly at the profiling tool, I saw that it's outputting failed jobs to files. We still want to do that, as that is how Spark shows them. I didn't look at all the rollups, though, to see whether they are affected. Again, that's a separate issue, which I don't think is as important.
Fixes https://github.com/NVIDIA/spark-rapids-tools/issues/1032
I ran an event log through the qualification tool and it got labelled as not applicable because it had failed stages. Those failed stages though were cancelled by AQE runs.
We should take this into account in the qual tool.
The failure reasons in the tasks show up as: Stage cancelled...
The stage failure reason shows: Job 243 cancelled
Tool output:
24/05/23 10:00:26 WARN QualificationEventProcessor: SQL execution id 47 had failures, skipping
24/05/23 10:00:26 WARN QualificationEventProcessor: SQL execution id 125 had failures, skipping
This PR fixes that by looking for "cancelled" in the failure messages and no longer counting those cases as failures.
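To illustrate the idea, here is a minimal Python sketch of that kind of check. It is not the code from this PR (the tool itself is not written in Python), and the names `CANCELLED_MARKER`, `is_cancellation`, and `has_real_failures` are hypothetical. It assumes each stage or job exposes an optional failure-reason string like the ones quoted above.

```python
from typing import Iterable, Optional

# Hypothetical marker: the PR description only says it looks for
# "cancelled" in the failure messages.
CANCELLED_MARKER = "cancelled"


def is_cancellation(failure_reason: Optional[str]) -> bool:
    """Return True when the failure reason reads like a cancellation."""
    if failure_reason is None:
        return False
    return CANCELLED_MARKER in failure_reason.lower()


def has_real_failures(failure_reasons: Iterable[Optional[str]]) -> bool:
    """Return True if any reason is a failure that is not a cancellation.

    A reason of None means the stage or job did not fail at all.
    """
    return any(
        reason is not None and not is_cancellation(reason)
        for reason in failure_reasons
    )


# Reasons like the ones in the event log above are all cancellations,
# so this SQL execution would no longer be skipped.
aqe_only = ["Stage cancelled", "Job 243 cancelled"]
print(has_real_failures(aqe_only))
```

A plain substring match like this is deliberately loose: it would also treat any other message containing "cancelled" as a non-failure, and it would miss the spelling "canceled", so a real implementation may want a narrower match.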
I tested this on a customer event log and it is working. We still need to add that event log to our integration tests.