Open morikplay opened 3 years ago
Thanks for raising an issue @morikplay. In the current main branch of `distributed`, `"cmd /c ver"` is being passed to `self.connection.run`. In logs you posted, `cmd /c "ver"` is used instead. I'm wondering if this is where the invalid syntax is being introduced. You mentioned "when that gets 'fixed' (by me)", are you using a patched version of `distributed`?
Ahh... thank you for looking into the issue @jrbourbeau. `cmd /c "ver"` is just my typo whilst cleaning up the logs (for posting here). `cmd /c ver` is what is passed to `self.connection.run()`, and that is what is causing result code -1 (which then fails cluster establishment). Changing `cmd /c ver` to `result = await self.connection.run("ver")` fixes the issue but causes other issues with environment imports and such.
I used the `distributed` version that is available via conda. Exporting the env shows the following version:
distributed=2021.9.1=py39hcbf5309_0
Ah, I see -- thanks for clarifying @morikplay. Unfortunately I'm not familiar with Windows and don't have access to a machine to test things out. Perhaps @abduhbm has thoughts on how the current situation might be improved?
As suggested here: https://github.com/PowerShell/Win32-OpenSSH/issues/1373, changing `cmd /c ver` to `cmd.exe /c ver` should fix the issue on Windows Server 2019.
@morikplay Can you please try this change from your side?
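For anyone who wants to try the probe in isolation, here is a minimal sketch of the check being discussed. It assumes only what this thread states: the connection object exposes an awaitable `run()` and a zero exit status means success. `FakeConnection`, `FakeResult`, and `remote_is_windows` are hypothetical names made up for illustration so the snippet runs without an SSH server; they are not part of `distributed` or asyncssh.

```python
import asyncio
from dataclasses import dataclass

# Probe command proposed in this thread for Windows hosts.
WINDOWS_PROBE = "cmd.exe /c ver"


@dataclass
class FakeResult:
    """Hypothetical stand-in for the object returned by connection.run()."""
    exit_status: int


class FakeConnection:
    """Hypothetical stand-in for an SSH connection (no server needed)."""

    def __init__(self, exit_codes):
        self.exit_codes = exit_codes  # maps command string -> exit status
        self.commands = []            # records every command that was run

    async def run(self, command):
        self.commands.append(command)
        # Commands not listed fail with -1, like the result code reported above.
        return FakeResult(self.exit_codes.get(command, -1))


async def remote_is_windows(connection):
    """Return True if the Windows probe command exits with status 0."""
    result = await connection.run(WINDOWS_PROBE)
    return result.exit_status == 0


if __name__ == "__main__":
    conn = FakeConnection({WINDOWS_PROBE: 0})
    print(asyncio.run(remote_is_windows(conn)))
```

With a real connection, the same function could be awaited against the connection object in place of `FakeConnection`, assuming its result exposes `exit_status` in the same way.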
The proposed change to `cmd.exe /c ver` works for both scenarios: ssh'ing locally and ssh'ing remotely!
[INFO] 2021-10-25 09:15:27,016 logging.py:82 [conn=0, chan=1] Requesting new SSH session
[INFO] 2021-10-25 09:15:27,018 logging.py:82 [conn=0, chan=1] Command: cmd.exe /c ver
[INFO] 2021-10-25 09:15:27,037 logging.py:82 [conn=0, chan=1] Received exit status 0
[INFO] 2021-10-25 09:15:27,038 logging.py:82 [conn=0, chan=1] Received channel close
[INFO] 2021-10-25 09:15:27,039 logging.py:82 [conn=0, chan=1] Channel closed
[DEBUG] 2021-10-25 09:15:27,041 logging.py:82 [conn=0, chan=2] Set write buffer limits: low-water=16384, high-water=65536
[INFO] 2021-10-25 09:15:27,042 logging.py:82 [conn=0, chan=2] Requesting new SSH session
[INFO] 2021-10-25 09:15:27,043 logging.py:82 [conn=0, chan=2] Command: set DASK_INTERNAL_INHERIT_CONFIG=<XYZ..>
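Log lines shaped like the excerpt above can be produced with the standard `logging` module. The sketch below is one possible setup, not necessarily the configuration used for this report. It assumes the SSH library logs under a logger named `asyncssh`. The format string is a guess chosen to resemble the excerpt, and `enable_ssh_debug_logging` is a hypothetical helper name.

```python
import logging

# Assumed format, chosen to resemble the excerpt: [LEVEL] timestamp file:line message
LOG_FORMAT = "[%(levelname)s] %(asctime)s %(filename)s:%(lineno)d %(message)s"


def enable_ssh_debug_logging(logger_name="asyncssh"):
    """Attach a DEBUG-level stream handler to the named logger and return it."""
    logger = logging.getLogger(logger_name)
    logger.setLevel(logging.DEBUG)
    handler = logging.StreamHandler()  # writes to stderr by default
    handler.setFormatter(logging.Formatter(LOG_FORMAT))
    logger.addHandler(handler)
    return logger


if __name__ == "__main__":
    log = enable_ssh_debug_logging()
    log.info("Command: cmd.exe /c ver")
```

Calling the helper once before creating the cluster should be enough. Calling it repeatedly would attach duplicate handlers.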
Thanks @morikplay ! I will create a PR for this.
What happened: SSHCluster() launch fails on a Windows Server 2019 system. Turning on debug logs shows it fails at ln#179 (for the scheduler), and when that gets 'fixed' (by me), it fails at ln#98 (for the worker).
What you expected to happen: The indicated scheduler and workers ought to launch via SSHCluster().
Minimal Complete Verifiable Example: Reproduced this error on multiple Windows 2019 server systems.
Anything else we need to know?: However, that 'fix' introduces an additional issue: subsequent conda env changes fail (due to size mismatch), and version-mismatch warnings/errors start popping up because the appropriate env doesn't load properly. Consequently, the dashboard can't be viewed via bokeh and such.
Environment:
- OS: Windows Server 2019 v1809
- Python version: 3.9.7
- Dask Distributed: 2021.9.1
- Asyncssh: 2.7.1
- Install method: conda