mila-iqia / milatools

Tools to connect to and interact with the Mila cluster
MIT License

[v0.1.4-post.9+bf82ca9] Issue running the command `mila init` #128

Open MxMstrmn opened 2 months ago

MxMstrmn commented 2 months ago

Make sure you can reproduce the issue with the latest version available

I am using the latest version of milatools

What command did you run?

mila init

Describe the bug

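`mila init` fails while setting up passwordless SSH access to the Mila cluster: `ssh-copy-id` exits with status 1 after the remote host closes the connection during key exchange, and the failure surfaces as a `subprocess.CalledProcessError` traceback.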

Additional context

/usr/bin/ssh-copy-id: ERROR: kex_exchange_identification: Connection closed by remote host
ERROR: Connection closed by 172.16.2.25 port 2222
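
For anyone trying to narrow this down: `kex_exchange_identification: Connection closed by remote host` generally means the server accepted the TCP connection but closed it before sending its SSH identification line. The sketch below is a rough way to check that independently of `ssh-copy-id`. The helper name `read_ssh_banner` is hypothetical and not part of milatools, and it only inspects the first line the server sends.

```python
import socket
from typing import Optional


def read_ssh_banner(host: str, port: int, timeout: float = 5.0) -> Optional[str]:
    """Return the server's SSH identification line (e.g. "SSH-2.0-..."),
    or None if the server closes the connection before sending one.

    Hypothetical diagnostic helper; only the first line received is checked.
    A timeout is left to propagate so it can be told apart from a close.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        data = b""
        # Read until the first newline, the peer closing, or a size cap.
        while b"\n" not in data and len(data) < 255:
            try:
                chunk = sock.recv(255)
            except ConnectionResetError:
                return None
            if not chunk:  # peer closed the connection
                break
            data += chunk
    line = data.split(b"\n", 1)[0].strip()
    if line.startswith(b"SSH-"):
        return line.decode("ascii", errors="replace")
    return None


# Example call, using the address and port from the error message above:
# print(read_ssh_banner("172.16.2.25", 2222))
```

A `None` result would match the symptom reported here (the connection is closed before identification), which points at the server side or something in between rather than at the local key file.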

Traceback (most recent call last):
  File "/home/mila/a/user/miniforge3/lib/python3.10/site-packages/milatools/cli/commands.py", line 91, in main
    mila()
  File "/home/mila/a/user/miniforge3/lib/python3.10/site-packages/milatools/cli/commands.py", line 137, in mila
    return function(**args_dict)
  File "/home/mila/a/user/miniforge3/lib/python3.10/site-packages/milatools/cli/commands.py", line 523, in init
    success = setup_passwordless_ssh_access(ssh_config=ssh_config)
  File "/home/mila/a/user/miniforge3/lib/python3.10/site-packages/milatools/cli/init_command.py", line 256, in setup_passwordless_ssh_access
    success = setup_passwordless_ssh_access_to_cluster("mila")
  File "/home/mila/a/user/miniforge3/lib/python3.10/site-packages/milatools/cli/init_command.py", line 347, in setup_passwordless_ssh_access_to_cluster
    here.run(
  File "/home/mila/a/user/miniforge3/lib/python3.10/site-packages/milatools/utils/local_v1.py", line 47, in run
    return subprocess.run(
  File "/home/mila/a/user/miniforge3/lib/python3.10/subprocess.py", line 526, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '('ssh-copy-id', '-i', '/home/mila/a/luser/.ssh/id_rsa', '-o', 'StrictHostKeyChecking=no', 'mila')' returned non-zero exit status 1.
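
The traceback shows the `CalledProcessError` from `subprocess.run` propagating all the way up to `main`. As a minimal sketch of how a caller could report this kind of failure without a full traceback, assuming nothing about how milatools is actually structured: the helper name `run_reporting_failure` is hypothetical, and the commented example only reuses the `ssh-copy-id` arguments visible in the error above.

```python
import shlex
import subprocess
import sys
from typing import Sequence


def run_reporting_failure(command: Sequence[str]) -> bool:
    """Run `command` and return True if it exits with status 0.

    On a non-zero exit status, print a one-line message to stderr and
    return False instead of letting CalledProcessError propagate.
    Hypothetical helper, not milatools code.
    """
    try:
        subprocess.run(list(command), check=True)
    except subprocess.CalledProcessError as err:
        print(
            f"Command `{shlex.join(command)}` failed with exit status {err.returncode}.",
            file=sys.stderr,
        )
        return False
    return True


# Example, with the arguments from the traceback (key_path is a placeholder):
# ok = run_reporting_failure(
#     ["ssh-copy-id", "-i", key_path, "-o", "StrictHostKeyChecking=no", "mila"]
# )
```

Returning a boolean fits the `success = ...` pattern visible in the traceback, but whether that is the right place to handle the error is a design question for the maintainers.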