mila-iqia / milatools

Tools to connect to and interact with the Mila cluster
MIT License

Impossible to "ssh-copy-id mila" in the "mila init" process on Windows 11 #63

Closed Basile-Terv closed 11 months ago

Basile-Terv commented 1 year ago

Make sure you can reproduce the issue with the latest version available

pip install milatools --upgrade
[milatools command e.g. mila code ...]

What command did you run?

mila init

I then entered my Mila cluster username when prompted. Next, I was shown the following changes to my SSH config and asked to confirm them:

+Host mila
+  HostName login.server.mila.quebec
+  User basile.terver
+  PreferredAuthentications publickey,keyboard-interactive
+  Port 2222
+  ServerAliveInterval 120
+  ServerAliveCountMax 5
+
+
+Host mila-cpu
+  User basile.terver
+  Port 2222
+  ForwardAgent yes
+  StrictHostKeyChecking no
+  LogLevel ERROR
+  UserKnownHostsFile /dev/null
+  RequestTTY force
+  ConnectTimeout 600
+  ServerAliveInterval 120
+  ProxyCommand ssh mila "/cvmfs/config.mila.quebec/scripts/milatools/slurm-proxy.sh mila-cpu --mem=8G"
+  RemoteCommand /cvmfs/config.mila.quebec/scripts/milatools/entrypoint.sh mila-cpu
+
+
+Host *.server.mila.quebec !*login.server.mila.quebec
+  HostName %h
+  User basile.terver
+  ProxyJump mila
+
? Is this OK?

I then answered yes to the following two prompts:

? You have no public keys. Generate one?
? Your public key does not appear be registered on the cluster. Register it? Yes

Then I got the error below.

Describe the bug

During the "mila init" process on my Windows PC, I get stuck at this point, even though I upgraded the milatools package.

Traceback (most recent call last):
  File "C:\Users\terve\anaconda3\envs\mila\Lib\site-packages\milatools\cli\commands.py", line 43, in main
    auto_cli(milatools)
  File "C:\Users\terve\anaconda3\envs\mila\Lib\site-packages\coleo\cli.py", line 656, in auto_cli
    result = run_cli(entry, args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\terve\anaconda3\envs\mila\Lib\site-packages\coleo\cli.py", line 628, in run_cli
    return call(opts=opts, args=args)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\terve\anaconda3\envs\mila\Lib\site-packages\coleo\cli.py", line 587, in thunk
    result = fn(*args)
             ^^^^^^^^^
  File "C:\Users\terve\anaconda3\envs\mila\Lib\site-packages\milatools\cli\commands.py", line 161, in init
    here.run("ssh-copy-id", "mila")
  File "C:\Users\terve\anaconda3\envs\mila\Lib\site-packages\milatools\cli\local.py", line 28, in run
    return subprocess.run(
           ^^^^^^^^^^^^^^^
  File "C:\Users\terve\anaconda3\envs\mila\Lib\subprocess.py", line 548, in run
    with Popen(*popenargs, **kwargs) as process:
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\terve\anaconda3\envs\mila\Lib\subprocess.py", line 1026, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "C:\Users\terve\anaconda3\envs\mila\Lib\subprocess.py", line 1538, in _execute_child
    hp, ht, pid, tid = _winapi.CreateProcess(executable, args,
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
FileNotFoundError: [WinError 2] The system cannot find the file specified

Additional context

I started by following Victor Schmidt's guide (https://vsch.notion.site/YAMSS-5471da23464e41d4bad5e3517d273dea#0742904b6e384aba94b29a24b69e7b0e) in WSL. I then decided to try milatools because I am on Windows, which is a problem if I want to open VS Code on a compute node following Victor's guide.

ZHANG-GuiGui commented 1 year ago

Hi, I have the same problem. I found an issue that may be helpful: https://github.com/mila-iqia/milatools/issues/19
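
For anyone hitting the same traceback: the FileNotFoundError is raised by subprocess when the "ssh-copy-id" executable cannot be found, and that script is usually not available on a stock Windows install. Below is a minimal sketch of a possible workaround, not the actual milatools fix. It assumes the OpenSSH "ssh" client is on PATH and that a public key already exists. The names build_copy_key_command and copy_public_key are hypothetical, and the key path in the usage note is only an example.

```python
import shutil
import subprocess
from pathlib import Path
from typing import List, Optional


def build_copy_key_command(host: str, ssh_copy_id_path: Optional[str]) -> List[str]:
    """Return the command used to register a public key on `host`.

    Hypothetical helper: uses ssh-copy-id when a path to it is given,
    otherwise falls back to appending the key over a plain ssh session.
    """
    if ssh_copy_id_path is not None:
        return [ssh_copy_id_path, host]
    # Fallback: the remote shell reads the key from stdin and appends it
    # to authorized_keys, fixing permissions along the way.
    remote = (
        "mkdir -p ~/.ssh && chmod 700 ~/.ssh && "
        "cat >> ~/.ssh/authorized_keys && chmod 600 ~/.ssh/authorized_keys"
    )
    return ["ssh", host, remote]


def copy_public_key(host: str, public_key: Path) -> None:
    """Register `public_key` on `host`, with or without ssh-copy-id."""
    # shutil.which returns None when the executable is not on PATH,
    # which is the situation that produces [WinError 2] above.
    command = build_copy_key_command(host, shutil.which("ssh-copy-id"))
    if command[0] == "ssh":
        # Feed the public key file to the remote `cat` through stdin.
        with public_key.open("rb") as key_file:
            subprocess.run(command, stdin=key_file, check=True)
    else:
        subprocess.run(command, check=True)
```

As a usage example, copy_public_key("mila", Path.home() / ".ssh" / "id_rsa.pub") would register the key for the "mila" host from the SSH config above, assuming the key was generated at that path. The fallback appends the key without checking whether it is already present, so running it twice would add a duplicate line to authorized_keys.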