NVIDIA / spark-rapids-tools

User tools for Spark RAPIDS
Apache License 2.0

[FEA] Improve Logging in Tools #1038

Open parthosa opened 1 month ago

parthosa commented 1 month ago

For every tools run using the python CMD, log files are stored at $HOME/.spark_rapids_tools/logs/qual_2024xxx.logs.

However, there are some areas where we can improve the overall logging:

### Tasks
- [ ] JAR logs are not captured when running from user tools (python CMD), even when the user passes `--verbose`. #735
- [ ] Log files do not store logs at the ERROR level. We should store ERROR level logs in log files.
- [ ] For all ERROR logs in the except part of a `try-except` block, we should include the [filename:line number] or a traceback of at least the last two method calls for better debugging.
- [ ] In Python, caught exceptions shown as ERROR logs should include line numbers or trimmed traceback.
- [ ] Console output from the Python Qualification CMD shows the directory structure of all apps in the `raw_metrics` folder. This makes the output difficult to scroll through, especially when the CMD is run on many event logs at once.
- [ ] https://github.com/NVIDIA/spark-rapids-tools/issues/1070
- [ ] https://github.com/NVIDIA/spark-rapids-tools/issues/735
- [ ] https://github.com/NVIDIA/spark-rapids-tools/issues/1057
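
For the ERROR-level and traceback items above, here is a minimal sketch using only the standard `logging` and `traceback` modules. It is an illustration under assumptions, not the current tools code: the logger name, `build_logger`, `trimmed_traceback`, and the log file name are all hypothetical.

```python
import logging
import tempfile
import traceback
from pathlib import Path


def build_logger(log_file: Path) -> logging.Logger:
    """Return a logger whose file handler keeps ERROR records and source locations."""
    logger = logging.getLogger("tools_logging_sketch")  # hypothetical logger name
    logger.setLevel(logging.DEBUG)
    logger.handlers.clear()  # avoid duplicate handlers if called more than once
    logger.propagate = False
    handler = logging.FileHandler(log_file, encoding="utf-8")
    handler.setLevel(logging.DEBUG)  # DEBUG and above, so ERROR reaches the file
    handler.setFormatter(logging.Formatter(
        "%(asctime)s %(levelname)s [%(filename)s:%(lineno)d] %(message)s"))
    logger.addHandler(handler)
    return logger


def trimmed_traceback(exc: BaseException, frames: int = 2) -> str:
    """Format only the last `frames` stack entries of the exception's traceback."""
    # A negative limit keeps the innermost abs(limit) entries.
    return "".join(traceback.format_exception(
        type(exc), exc, exc.__traceback__, limit=-frames))


# Hypothetical call chain standing in for real tool code.
def inner():
    raise ValueError("bad event log")


def middle():
    inner()


def outer():
    middle()


log_path = Path(tempfile.mkdtemp()) / "qual_sketch.log"  # hypothetical file name
logger = build_logger(log_path)
try:
    outer()
except ValueError as err:
    # The record carries [filename:line] of this call plus the last two frames.
    logger.error("Qualification failed:\n%s", trimmed_traceback(err))

log_text = log_path.read_text(encoding="utf-8")
```

With `frames=2`, the log entry keeps the `middle` and `inner` frames and drops the outer ones, which roughly matches the "last two method calls" idea in the task list.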

amahussein commented 1 month ago

There is an open issue to generate log files from core tools. We should look into that first, because then we won't need to pipe the entire stdout/stderr into python memory before dumping it.
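
To illustrate the buffering concern, here is a rough sketch of forwarding a child process's output to a logger line by line instead of collecting it all in memory first. This is not what the open issue proposes (that would have the core tools write their own log files); it only shows the alternative for comparison. `stream_to_logger`, `ListHandler`, the logger name, and the stand-in command are hypothetical.

```python
import logging
import subprocess
import sys


def stream_to_logger(cmd, logger):
    """Run cmd and forward each output line to the logger as it arrives."""
    with subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT,
                          text=True, bufsize=1) as proc:
        for line in proc.stdout:  # yields lines as the child writes them
            logger.info("%s", line.rstrip("\n"))
    # Leaving the with-block waits for the process, so returncode is set.
    return proc.returncode


class ListHandler(logging.Handler):
    """Collects messages in a list, only so the sketch can be inspected."""

    def __init__(self):
        super().__init__()
        self.lines = []

    def emit(self, record):
        self.lines.append(record.getMessage())


jar_logger = logging.getLogger("jar_output_sketch")  # hypothetical logger name
jar_logger.setLevel(logging.INFO)
jar_logger.propagate = False
capture = ListHandler()
jar_logger.addHandler(capture)

# Stand-in for the real JAR invocation: a child Python process printing two lines.
exit_code = stream_to_logger(
    [sys.executable, "-c", "print('stage 1'); print('stage 2')"], jar_logger)
```

In a real setup the `ListHandler` would be replaced by a file handler, so each line goes to disk as it is produced.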

parthosa commented 1 month ago

Thanks @amahussein. Linked it in the description.