NVIDIA / spark-rapids-tools

User tools for Spark RAPIDS
Apache License 2.0

[BUG] Profiling/Qualification Tool does not contain status info for failed event log #1164

Closed. cindyyuanjiang closed this issue 2 months ago

cindyyuanjiang commented 3 months ago

Describe the bug
The Profiling tool hits an exception on the Java tool side when processing an ABFS event log. Unless the user turns on --verbose, the failure is not reported anywhere: not in the console, the output folder, or the output log. This can be confusing to the user.

Steps/Code to reproduce bug
spark_rapids profiling -p databricks-azure --eventlogs <my-event-log> --cluster <my-cluster>

Expected behavior
It would be nice if the profiling_status.csv file (or qualification_output_status.csv for the Qualification tool) captured this exception.
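
To illustrate what capturing the failure in the status file would enable, here is a minimal sketch of how a user or the Python wrapper could list failed event logs from such a file. The column names (Event Log, Status, App ID, Description), the SUCCESS/FAILURE values, and the sample rows are assumptions made for illustration only; they are not taken from the actual tool output.

```python
import csv
import io


def failed_event_logs(status_csv_text, status_col="Status", ok_values=("SUCCESS",)):
    """Return the rows of a status CSV whose status is not a success value.

    The column name and success values are hypothetical defaults.
    """
    reader = csv.DictReader(io.StringIO(status_csv_text))
    return [
        row for row in reader
        if row.get(status_col, "").strip().upper() not in ok_values
    ]


# Hypothetical file content; real status files may use a different layout.
sample = (
    "Event Log,Status,App ID,Description\n"
    "abfss://example/log1,SUCCESS,app-1,ok\n"
    "abfss://example/log2,FAILURE,,Failed authentication\n"
)

failures = failed_event_logs(sample)
for row in failures:
    print(f"{row['Event Log']}: {row['Status']} ({row['Description']})")
```

With a row like the second one present, the failure would be visible without --verbose.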

amahussein commented 3 months ago

The root cause is that a failed authentication is tolerated by the Scala code but is not recorded in the status.csv file. The Python code cannot tell that an error occurred unless the Java process exits with a non-zero value. The fix for this issue is twofold: