ronf / asyncssh

AsyncSSH is a Python package which provides an asynchronous client and server implementation of the SSHv2 protocol on top of the Python asyncio framework.
Eclipse Public License 2.0
1.53k stars 150 forks source link

Parsing breaks ProxyCommand executable path on Windows. #677

Closed chipolux closed 3 weeks ago

chipolux commented 1 month ago

When a config has a ProxyCommand formatted like:

ProxyCommand C:\Windows\System32\OpenSSH\ssh.exe hostname -W %h:%p

this line removes the path separators leaving us with

['ProxyCommand', 'C:WindowsSystem32OpenSSHssh.exe', 'hostname', '-W', '%h:%p']

if changed to shlex.split(line, posix="\\" not in line) we get the correct version:

['ProxyCommand', 'C:\\Windows\\System32\\OpenSSH\\ssh.exe', 'hostname', '-W', '%h:%p']

https://github.com/ronf/asyncssh/blob/68c2df7fce1a096d2a819c1b777f11310dd795bb/asyncssh/config.py#L318

Not sure if that change would break anything else, or if it would be better to explicitly detect the platform and only swap to non-posix mode on Windows, I'm happy to make a PR for this minimal change if you'd like.

ronf commented 1 month ago

While turning off POSIX mode in shlex.split() does leave backslashes intact, my gut tells me this could break escaping that a caller might be doing , and https://stackoverflow.com/questions/33560364/python-windows-parsing-command-lines-with-shlex seems to confirm that both quoting and escaping can be problematic when you disable POSIX mode. Since backslashes are used for escaping in POSIX, the above may trigger disabling of POSIX mode and as a result cause those backslashes to not be interpreted properly.

Unfortunately, I'm not really sure what to suggest instead. To be honest, I wasn't even aware ProxyCommand worked on Windows. I would have expecting the I/O redirection needed to send data over pipes to the externally launched application to potentially not work on Windows.

For the specific case listed here, is there a reason you aren't using ProxyJump, or the something like connect_ssh() or the tunnel argument to connect()?

ronf commented 1 month ago

For what it's worth, you can work around the shlex issue by just doubling your backslashes when entering the ProxyCommand into your .ssh/config file. For instance:

ProxyCommand C:\\Windows\\System32\\OpenSSH\\ssh.exe hostname -W %h:%p

However, when I do this, I get a NotImplementedError related to making a subprocess transport. I think you would need to be using a ProactorEventLoop instead of a SelectorEventLoop to get that to exist on Windows.

chipolux commented 1 month ago

Yeah, it came out of left field for me too, but it's on users machines that I don't control and generated by some tool for them, so they are not comfortable changing their config.

Judging by the parser in use here and here the expected behavior is to only accept \\, \", \', and \, while passing through any other escape sequences. So shlex.split() doesn't quite behave the same when it discards the escape character, and changing out of posix mode isn't right either.

I'm fine using a workaround locally, but would also be happy to submit a PR sometime in the future with a matching parser to drop in as a replacement for shlex.split() if that would be helpful?

ronf commented 1 month ago

Hmm - that parsing code in OpenSSH looks like it has some issues. For instance, I don't think it properly handles something like:

ProxyCommand C:\"Program Files"\OpenSSH\ssh.exe -W %h:%p

It would end up thinking the first quote is escaped, stripping off the backslash. To avoid this, the command would need to be rewritten as something like:

ProxyCommand "C:\Program Files\OpenSSH\ssh.exe" -W %h:%p

Even then, a doubled backslash (even inside quotes) would have one of the backslashes stripped. This seems like it would be a problem if you were trying to pass in a pathname of a command running on a network share, where double backslashes are used in the path. I'm not sure how common that is (compared with mapping a network share to a drive letter with "NET USE"), but it's pretty clear this OpenSSH parser was not written with Windows in mind.

Since this value would generally be coming out of an OpenSSH config file, there's definitely something to be said for having AsyncSSH interpret the line exactly like OpenSSH would, bugs and all, but I wonder if the double backslash inside quotes should be left intact, similar to how backslash followed by space is left intact when inside quotes.

chipolux commented 1 month ago

Oh yeah, what a nightmare of a problem space I've stepped into!

I feel like having it interpret the line exactly like OpenSSH would is probably the least surprising, if your config worked with plain old ssh hostname it'd work here.

But it also feels bad since it seems like this working is just a coincidence of the path to ssh.exe being lucky enough to not include any valid escape sequences...

Just to throw out an idea, what about a configurable solution: leave the shlex based parsing in as default so no one get's broken, but allow selecting an "OpenSSH-like" mode that would behave the same, warts and all?

ronf commented 1 month ago

OK - here's something you can try:

diff --git a/asyncssh/config.py b/asyncssh/config.py
index 18ff3c8..a50888a 100644
--- a/asyncssh/config.py
+++ b/asyncssh/config.py
@@ -425,7 +425,7 @@ class SSHClientConfig(SSHConfig):
     """Settings from an OpenSSH client config file"""

     _conditionals = {'host', 'match'}
-    _no_split = {'remotecommand'}
+    _no_split = {'proxycommand', 'remotecommand'}
     _percent_expand = {'CertificateFile', 'IdentityAgent',
                        'IdentityFile', 'ProxyCommand', 'RemoteCommand'}

@@ -559,7 +559,7 @@ class SSHClientConfig(SSHConfig):
         ('PKCS11Provider',                  SSHConfig._set_string),
         ('PreferredAuthentications',        SSHConfig._set_string),
         ('Port',                            SSHConfig._set_int),
-        ('ProxyCommand',                    SSHConfig._set_string_list),
+        ('ProxyCommand',                    SSHConfig._set_string),
         ('ProxyJump',                       SSHConfig._set_string),
         ('PubkeyAuthentication',            SSHConfig._set_bool),
         ('RekeyLimit',                      SSHConfig._set_rekey_limits),
diff --git a/asyncssh/connection.py b/asyncssh/connection.py
index 9c05c5e..517fce4 100644
--- a/asyncssh/connection.py
+++ b/asyncssh/connection.py
@@ -115,7 +115,7 @@ from .misc import ProtocolNotSupported, ServiceNotAvailable
 from .misc import TermModesArg, TermSizeArg
 from .misc import async_context_manager, construct_disc_error
 from .misc import get_symbol_names, ip_address, map_handler_name
-from .misc import parse_byte_count, parse_time_interval
+from .misc import parse_byte_count, parse_time_interval, split_args

 from .packet import Boolean, Byte, NameList, String, UInt32, PacketDecodeError
 from .packet import SSHPacket, SSHPacketHandler, SSHPacketLogger
@@ -231,7 +231,7 @@ _GlobalRequest = Tuple[Optional[_PacketHandler], SSHPacket, bool]
 _GlobalRequestResult = Tuple[int, SSHPacket]
 _KeyOrCertOptions = Mapping[str, object]
 _ListenerArg = Union[bool, SSHListener]
-_ProxyCommand = Optional[Sequence[str]]
+_ProxyCommand = Optional[Union[str, Sequence[str]]]
 _RequestPTY = Union[bool, str]

 _TCPServerHandlerFactory = Callable[[str, int], SSHSocketSessionFactory]
@@ -7144,11 +7144,13 @@ class SSHConnectionOptions(Options):
         self.tunnel = tunnel if tunnel != () else config.get('ProxyJump')
         self.passphrase = passphrase

+        if proxy_command == ():
+            proxy_command = cast(Optional[str], config.get('ProxyCommand'))
+
         if isinstance(proxy_command, str):
-            proxy_command = shlex.split(proxy_command)
+            proxy_command = split_args(proxy_command)

-        self.proxy_command = proxy_command if proxy_command != () else \
-            cast(Sequence[str], config.get('ProxyCommand'))
+        self.proxy_command = proxy_command

         self.family = cast(int, family if family != () else
             config.get('AddressFamily', socket.AF_UNSPEC))
@@ -9224,7 +9226,7 @@ async def create_server(server_factory: _ServerFactory,
 async def get_server_host_key(
         host = '', port: DefTuple[int] = (), *,
         tunnel: DefTuple[_TunnelConnector] = (),
-        proxy_command: DefTuple[str] = (), family: DefTuple[int] = (),
+        proxy_command: DefTuple[_ProxyCommand] = (), family: DefTuple[int] = (),
         flags: int = 0, local_addr: DefTuple[HostPort] = (),
         sock: Optional[socket.socket] = None,
         client_version: DefTuple[BytesOrStr] = (),
@@ -9368,7 +9370,7 @@ async def get_server_host_key(
 async def get_server_auth_methods(
         host = '', port: DefTuple[int] = (), username: DefTuple[str] = (), *,
         tunnel: DefTuple[_TunnelConnector] = (),
-        proxy_command: DefTuple[str] = (), family: DefTuple[int] = (),
+        proxy_command: DefTuple[_ProxyCommand] = (), family: DefTuple[int] = (),
         flags: int = 0, local_addr: DefTuple[HostPort] = (),
         sock: Optional[socket.socket] = None,
         client_version: DefTuple[BytesOrStr] = (),
diff --git a/asyncssh/misc.py b/asyncssh/misc.py
index d3765f2..518b777 100644
--- a/asyncssh/misc.py
+++ b/asyncssh/misc.py
@@ -23,6 +23,7 @@
 import functools
 import ipaddress
 import re
+import shlex
 import socket
 import sys

@@ -269,6 +270,18 @@ def parse_time_interval(value: str) -> float:
     return _parse_units(value, _time_units, 'time interval')

+def split_args(command: str) -> Sequence[str]:
+    """Split a command string into a list of arguments"""
+
+    lex = shlex.shlex(command, posix=True)
+    lex.whitespace_split = True
+
+    if sys.platform == 'win32': # pragma: no cover
+        lex.escape = ''
+
+    return list(lex)
+
+
 _ACM = TypeVar('_ACM', bound=AsyncContextManager, covariant=True)

 class _ACMWrapper(Generic[_ACM]):

On non-Win32 systems, the behavior of supporting backslash as an escape character remains enabled (for backward compatibility). However, on Win32, backslash escaping is turned off to avoid breaking windows pathnames. You can still embed quotes in a command argument by putting single quotes inside of double quotes or vice-versa. However, attempting to backslash-escape quotes (or backslashes themselves) will be ignored on Windows.

By design, this doesn't match OpenSSH escaping exactly. However, the cases where it doesn't match up are cases where backslashes were consumed as escape characters, which generally breaks things on Windows when it happens.

I'm tempted to disable the backslash escaping feature for ProxyCommand on all platforms rather than just Win32, but I could imagine someone potentially wanting to escape a quote and being surprised if that didn't work. Unfortunately, on Windows, there's really no way to know the caller's intention when hitting a backslash followed by a quote, so this code just disables backslash escaping there and requires using quotes within opposite quotes if you ever needed a literal quote in a command.

As part of this change, I made the SSHConfig code always return a non-split string to the main options processing in the case of ProxyCommand, and any splitting will be done in the options processing, regardless of whether it comes from config or as an option. This lets me continue to support normal POSIX escaping when splitting args for all other config file arguments.

ronf commented 1 month ago

This is now available as commit 4f3de9e in the "develop" branch. Let me know if you run into any problems with it.

ronf commented 3 weeks ago

This change is now available in AsyncSSH 2.16.0.