The CLI throws an error when the java cmd fails to complete
ERROR rapids.tools.qualification: Failed to download dependencies Error invoking CMD <java -XX:+UseG1GC -Xmx8g -cp...
2024-06-04 22:31:23,970 ERROR root: Qualification. Raised an error in phase [Execution]
Traceback (most recent call last):
File "~/rapids-tools/user_tools/src/spark_rapids_pytools/rapids/rapids_tool.py", line 114, in wrapper
func_cb(self, *args, **kwargs) # pylint: disable=not-callable
File "~/rapids-tools/user_tools/src/spark_rapids_pytools/rapids/rapids_tool.py", line 188, in _execute
self._run_rapids_tool()
File "~/rapids-tools/user_tools/src/spark_rapids_pytools/rapids/rapids_tool.py", line 643, in _run_rapids_tool
self._submit_jobs()
File "~/rapids-tools/user_tools/src/spark_rapids_pytools/rapids/rapids_tool.py", line 913, in _submit_jobs
raise ex
File "~/rapids-tools/user_tools/src/spark_rapids_pytools/rapids/rapids_tool.py", line 909, in _submit_jobs
result = future.result()
File "~/.pyenv/versions/3.8-dev/lib/python3.8/concurrent/futures/_base.py", line 437, in result
return self.__get_result()
File "~/.pyenv/versions/3.8-dev/lib/python3.8/concurrent/futures/_base.py", line 389, in __get_result
raise self._exception
File "~/.pyenv/versions/3.8-dev/lib/python3.8/concurrent/futures/thread.py", line 57, in run
result = self.fn(*self.args, **self.kwargs)
File "~/rapids-tools/user_tools/src/spark_rapids_pytools/rapids/rapids_job.py", line 105, in run_job
job_output = self._submit_job(cmd_args)
File "~/rapids-tools/user_tools/src/spark_rapids_pytools/rapids/rapids_job.py", line 151, in _submit_job
out_std = self.exec_ctxt.platform.cli.run_sys_cmd(cmd=cmd_args,
File "~/rapids-tools/user_tools/src/spark_rapids_pytools/cloud_api/sp_types.py", line 473, in run_sys_cmd
return sys_cmd.exec()
File "~/rapids-tools/user_tools/src/spark_rapids_pytools/common/utilities.py", line 333, in exec
raise RuntimeError(f'{cmd_err_msg}')
To reproduce, run the command, then kill the Java process with a kill signal.
The stack trace does not show that the Java process was killed.
The output was confusing and hard to read. Users thought that no error was thrown.
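One possible direction, sketched below, is to make the raised error state explicitly how the child process ended. This is a minimal illustration and not the current `spark_rapids_pytools` code: `describe_exit` and `run_cmd` are hypothetical names introduced here. It relies on the documented behavior of Python's `subprocess` module on POSIX, where a negative `returncode` of `-N` means the child was terminated by signal `N`.

```python
import signal
import subprocess


def describe_exit(returncode: int) -> str:
    """Return a human-readable description of a subprocess exit status."""
    if returncode == 0:
        return 'completed successfully'
    if returncode < 0:
        # On POSIX, subprocess reports -N when the child was killed by signal N.
        try:
            sig_name = signal.Signals(-returncode).name
        except ValueError:
            # Unknown signal number on this platform; fall back to the raw value.
            sig_name = f'signal {-returncode}'
        return f'killed by {sig_name}'
    return f'exited with code {returncode}'


def run_cmd(cmd_args):
    """Hypothetical helper: run a command and raise with an explicit cause."""
    proc = subprocess.run(cmd_args, capture_output=True, text=True, check=False)
    if proc.returncode != 0:
        raise RuntimeError(
            f'Command <{" ".join(cmd_args)}> {describe_exit(proc.returncode)}\n'
            f'stderr: {proc.stderr.strip()}')
    return proc.stdout
```

With a message of this shape, a killed Java process would surface as something like `killed by SIGKILL` in the error text, so the cause is not buried in the stack trace.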
### Tasks
- [ ] https://github.com/NVIDIA/spark-rapids-tools/issues/1088
- [ ] Improve the Python logging. There are issues already open for that purpose.