dti-research / tracker

Tracker is a CLI for easy creation of reproducible Robotics and ML research
https://dti-tracker.readthedocs.io/en/latest/
BSD 3-Clause "New" or "Revised" License

Exception is not caught on SSH timeout #10

Open nily-dti opened 4 years ago

nily-dti commented 4 years ago
$ tracker gpus list -r dti-ai-1-fashion 
ssh: connect to host 10.44.60.128 port 22: Connection timed out
Traceback (most recent call last):
  File "/home/nily/Workspace/ml-template-ws/tracker/tracker/remotes/ssh_util.py", line 77, in ssh_output
    out = subprocess.check_output(cmd)
  File "/usr/lib/python3.6/subprocess.py", line 356, in check_output
    **kwargs).stdout
  File "/usr/lib/python3.6/subprocess.py", line 438, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['ssh', '-oStrictHostKeyChecking=no', '-i', '~/.ssh/id_rsa.pub', 'dti@10.44.60.128', 'which nvidia-smi']' returned non-zero exit status 255.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/bin/tracker", line 11, in <module>
    load_entry_point('tracker', 'console_scripts', 'tracker')()
  File "/home/nily/Workspace/ml-template-ws/tracker/tracker/main.py", line 14, in main
    main_commands.main()
  File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 764, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 717, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 1137, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 1137, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 956, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "/home/nily/Workspace/ml-template-ws/tracker/tracker/utils/click_utils.py", line 41, in fn
    return fn0(*(args + (Args(**kw),)))
  File "/usr/local/lib/python3.6/dist-packages/click/decorators.py", line 17, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/home/nily/Workspace/ml-template-ws/tracker/tracker/commands/gpus_list.py", line 45, in list_gpus
    gpu_handler = gpu.GPU(remote=remote)
  File "/home/nily/Workspace/ml-template-ws/tracker/tracker/utils/gpu.py", line 63, in __init__
    self._stats_cmd = self._run_which_cmd()
  File "/home/nily/Workspace/ml-template-ws/tracker/tracker/utils/gpu.py", line 88, in _run_which_cmd
    nvidia_smi = self._remote.which("nvidia-smi")
  File "/home/nily/Workspace/ml-template-ws/tracker/tracker/remotes/ssh.py", line 82, in which
    port=self.port)
  File "/home/nily/Workspace/ml-template-ws/tracker/tracker/remotes/ssh_util.py", line 79, in ssh_output
    raise remotelib.RemoteProcessError.from_called_process_error(e)
tracker.remote.RemoteProcessError: (255, ['ssh', '-oStrictHostKeyChecking=no', '-i', '~/.ssh/id_rsa.pub', 'dti@10.44.60.128', 'which nvidia-smi'], b'')
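
One possible way to handle this, as a minimal sketch and not the project's actual fix: treat exit status 255 from `ssh` as "remote unreachable" and report it as a one-line error instead of a traceback. The names `RemoteUnreachableError`, `SSH_ERROR_STATUS`, the `runner` parameter, and this version of `list_gpus` are assumptions made for illustration. Only `subprocess.check_output` and the 255 exit status come from the traceback above. Note that a remote command could in principle also exit with 255, so the check is a heuristic.

```python
import subprocess
import sys

# OpenSSH exits with 255 when ssh itself fails (for example a connection
# timeout); otherwise it returns the remote command's exit status.
SSH_ERROR_STATUS = 255


class RemoteUnreachableError(Exception):
    """Hypothetical error type for this sketch; not part of tracker."""


def ssh_output(cmd, runner=subprocess.check_output):
    """Run `cmd` and translate an ssh-level failure into a readable error.

    `runner` is injectable only so the sketch can be exercised without a
    real host.
    """
    try:
        return runner(cmd)
    except subprocess.CalledProcessError as e:
        if e.returncode == SSH_ERROR_STATUS:
            raise RemoteUnreachableError(
                "could not reach remote (ssh exited with status 255)"
            ) from e
        raise  # any other status is left for the caller to handle


def list_gpus(cmd, runner=subprocess.check_output):
    """Hypothetical command entry point: report the failure, no traceback."""
    try:
        out = ssh_output(cmd, runner)
    except RemoteUnreachableError as e:
        print(f"tracker: {e}", file=sys.stderr)
        return 1
    print(out.decode().strip())
    return 0
```

With something like this in place, the command would print a single error line and return a non-zero exit code when the host cannot be reached, while other failures still propagate unchanged.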