$ mila code /network/scratch/n/normandf/imagenet_template --alloc --cpus-per-task=4 --gres=gpu:1 --mem=16G --nodes 2
(local) $ ssh mila -fNMS /home/fabrice/.ssh/sockets/milatools.mila
(mila) $ salloc --cpus-per-task=4 --gres=gpu:1 --mem=16G --nodes 2
# Control socket connect(/home/fabrice/.ssh/sockets/milatools.mila): Connection refused
# salloc: --------------------------------------------------------------------------------------------------
# salloc: # Using default long partition
# salloc: --------------------------------------------------------------------------------------------------
# salloc: Pending job allocation 2149062
# salloc: job 2149062 queued and waiting for resources
# salloc: job 2149062 has been allocated resources
# salloc: Granted job allocation 2149062
# salloc: Waiting for resource configuration
# salloc: Nodes cn-c[007,035] are ready for job
(local) $ code --remote 'ssh-remote+cn-c[007,035].server.mila.quebec' /network/scratch/n/normandf/imagenet_template
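VS Code isn't able to connect to the host. This is probably due to how the node name is retrieved inside the mila code command: with --nodes 2, salloc reports the node list cn-c[007,035], and that whole string is passed to code --remote as if it were a single hostname.
Here is some of the output from the VS Code "Remote - SSH" log window: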
[10:46:47.518] ------
[10:46:47.518] SSH Resolver called for "ssh-remote+cn-c[007,035].server.mila.quebec", attempt 5, (Reconnection)
[10:46:47.519] SSH Resolver called for host: cn-c[007,035].server.mila.quebec
[10:46:47.519] Setting up SSH remote "cn-c[007,035].server.mila.quebec"
[10:46:47.520] Acquiring local install lock: /tmp/vscode-remote-ssh-855227e0-install.lock
[10:46:47.520] Looking for existing server data file at /home/fabrice/.config/Code/User/globalStorage/ms-vscode-remote.remote-ssh/vscode-ssh-host-855227e0-6d9b74a70ca9c7733b29f0456fd8195364076dda-0.84.0/data.json
[10:46:47.520] Using commit id "6d9b74a70ca9c7733b29f0456fd8195364076dda" and quality "stable" for server
[10:46:47.522] Install and start server if needed
[10:46:47.527] askpass server listening on /run/user/1001/vscode-ssh-askpass-e7e86db49903dbff72edcece1901eb8c786021c3.sock
[10:46:47.528] Spawning local server with {"serverId":5,"ipcHandlePath":"/run/user/1001/vscode-ssh-askpass-bace1ed1c6ac07b4dc3b702a1e5f6f1d66ccc9e5.sock","sshCommand":"ssh","sshArgs":["-v","-T","-D","36461","-o","ConnectTimeout=15","cn-c[007,035].server.mila.quebec"],"serverDataFolderName":".vscode-server","dataFilePath":"/home/fabrice/.config/Code/User/globalStorage/ms-vscode-remote.remote-ssh/vscode-ssh-host-855227e0-6d9b74a70ca9c7733b29f0456fd8195364076dda-0.84.0/data.json"}
[10:46:47.528] Local server env: {"SSH_AUTH_SOCK":"/run/user/1001/keyring/ssh","SHELL":"/bin/bash","DISPLAY":":1","ELECTRON_RUN_AS_NODE":"1","SSH_ASKPASS":"/home/fabrice/.vscode/extensions/ms-vscode-remote.remote-ssh-0.84.0/out/local-server/askpass.sh","VSCODE_SSH_ASKPASS_NODE":"/usr/share/code/code","VSCODE_SSH_ASKPASS_EXTRA_ARGS":"--ms-enable-electron-run-as-node","VSCODE_SSH_ASKPASS_MAIN":"/home/fabrice/.vscode/extensions/ms-vscode-remote.remote-ssh-0.84.0/out/askpass-main.js","VSCODE_SSH_ASKPASS_HANDLE":"/run/user/1001/vscode-ssh-askpass-e7e86db49903dbff72edcece1901eb8c786021c3.sock"}
[10:46:47.533] Spawned 6736
[10:46:47.630] > local-server-5> Spawned ssh, pid=6744
[10:46:47.634] stderr> OpenSSH_8.2p1 Ubuntu-4ubuntu0.5, OpenSSL 1.1.1f 31 Mar 2020
[10:46:47.635] stderr> Bad stdio forwarding specification '[cn-c[007,035].server.mila.quebec]:22'
[10:46:47.635] stderr> kex_exchange_identification: Connection closed by remote host
[10:46:47.635] > local-server-5> ssh child died, shutting down
[10:46:47.641] Local server exit: 0
[10:46:47.642] Received install output: local-server-5> Spawned ssh, pid=6744
OpenSSH_8.2p1 Ubuntu-4ubuntu0.5, OpenSSL 1.1.1f 31 Mar 2020
Bad stdio forwarding specification '[cn-c[007,035].server.mila.quebec]:22'
kex_exchange_identification: Connection closed by remote host
local-server-5> ssh child died, shutting down
[10:46:47.642] Failed to parse remote port from server output
[10:46:47.643] Resolver error: Error:
at Function.Create (/home/fabrice/.vscode/extensions/ms-vscode-remote.remote-ssh-0.84.0/out/extension.js:1:585222)
at Object.t.handleInstallOutput (/home/fabrice/.vscode/extensions/ms-vscode-remote.remote-ssh-0.84.0/out/extension.js:1:583874)
at Object.e [as tryInstallWithLocalServer] (/home/fabrice/.vscode/extensions/ms-vscode-remote.remote-ssh-0.84.0/out/extension.js:1:624373)
at processTicksAndRejections (node:internal/process/task_queues:96:5)
at async /home/fabrice/.vscode/extensions/ms-vscode-remote.remote-ssh-0.84.0/out/extension.js:1:643506
at async Object.t.withShowDetailsEvent (/home/fabrice/.vscode/extensions/ms-vscode-remote.remote-ssh-0.84.0/out/extension.js:1:647224)
at async /home/fabrice/.vscode/extensions/ms-vscode-remote.remote-ssh-0.84.0/out/extension.js:1:622845
at async T (/home/fabrice/.vscode/extensions/ms-vscode-remote.remote-ssh-0.84.0/out/extension.js:1:619351)
at async Object.t.resolveWithLocalServer (/home/fabrice/.vscode/extensions/ms-vscode-remote.remote-ssh-0.84.0/out/extension.js:1:622460)
at async Object.t.resolve (/home/fabrice/.vscode/extensions/ms-vscode-remote.remote-ssh-0.84.0/out/extension.js:1:644834)
at async /home/fabrice/.vscode/extensions/ms-vscode-remote.remote-ssh-0.84.0/out/extension.js:1:727082
[10:46:47.644] ------
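For illustration, here is a minimal sketch of how a compressed node list such as cn-c[007,035] could be expanded into individual hostnames before one of them is handed to code --remote. The function name expand_hostlist is hypothetical and is not part of milatools; it only handles a single prefix[...] group with comma-separated values and simple ranges, so it is not a full replacement for what `scontrol show hostnames` does on the cluster.

```python
import re


def expand_hostlist(nodelist: str) -> list[str]:
    """Expand a simple Slurm node list like 'cn-c[007,035]' or 'cn-a[001-003]'.

    Hypothetical helper for illustration only: it supports one bracket group
    and does not handle several comma-separated groups in the same string.
    """
    match = re.fullmatch(r"([^\[\]]+)\[([^\[\]]+)\](.*)", nodelist)
    if match is None:
        # No bracket group: treat the string as a single hostname.
        return [nodelist]

    prefix, body, suffix = match.groups()
    hosts = []
    for part in body.split(","):
        if "-" in part:
            # A range such as 001-003; keep the zero padding of the start value.
            start, end = part.split("-")
            width = len(start)
            for index in range(int(start), int(end) + 1):
                hosts.append(f"{prefix}{index:0{width}d}{suffix}")
        else:
            hosts.append(f"{prefix}{part}{suffix}")
    return hosts


if __name__ == "__main__":
    nodes = expand_hostlist("cn-c[007,035]")
    # Connect to the first allocated node only.
    print(f"ssh-remote+{nodes[0]}.server.mila.quebec")
```

With the node list from the allocation above, this would yield cn-c007 and cn-c035, and the first one could be used to build the ssh-remote target instead of the raw bracketed string.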