ronf / asyncssh

AsyncSSH is a Python package which provides an asynchronous client and server implementation of the SSHv2 protocol on top of the Python asyncio framework.
Eclipse Public License 2.0

Jumphost end target host failing after publickey auth, never tries password auth #301

Closed carlmontanari closed 4 years ago

carlmontanari commented 4 years ago

Hi Ron,

Not 100% sure if this is a bug or just a bad ssh implementation on a router or me doing something wrong, but figured I'd raise an issue just to let ya know about this.

The scenario is that I am running a script from my local machine through a jump host. The connection to the jumphost works as expected, and the subsequent connection to the target host begins as normal; however, if I use password auth the connection fails. If I use public-key auth, the connection succeeds. I spent some time poking around and found that the target device (Cisco device running IOSXR) does indeed return a list of allowable auths with password auth in it... here is the auth list I get from the device:

[b'gssapi-keyex', b'gssapi-with-mic', b'hostbased', b'publickey', b'keyboard-interactive', b'password']

^ snagged this just by dropping a print(self._preferred_auth) on line ~2540 of connection.py.

Here is debug level 1 output of connecting and showing the public key auth fail and then the connection giving up:

DEBUG:asyncssh:[conn=0] Received key exchange request
DEBUG:asyncssh:[conn=0] Beginning key exchange
DEBUG:asyncssh:[conn=0] Completed key exchange
INFO:asyncssh:[conn=0] Beginning auth for user USERFORJUMPHOST
DEBUG:asyncssh:[conn=0] Trying public key auth with rsa-sha2-256 key
DEBUG:asyncssh:[conn=0] Signing request with rsa-sha2-256 key
INFO:asyncssh:[conn=0] Auth for user USERFORJUMPHOST succeeded
INFO:asyncssh:[conn=0] Opening SSH connection to TARGETROUTER, port 22 via JUMPHOST
INFO:asyncssh:[conn=0] Opening direct TCP connection to TARGETROUTER, port 22
INFO:asyncssh:[conn=0]   Client address: dynamic port
DEBUG:asyncssh:[conn=0, chan=0] Set write buffer limits: low-water=16384, high-water=65536
DEBUG:asyncssh:[conn=0] Received unknown global request: hostkeys-00@openssh.com
INFO:asyncssh:[conn=1] Connection to TARGETROUTER, port 22 succeeded
INFO:asyncssh:[conn=1]   Local address: 172.31.254.76, port 57180
DEBUG:asyncssh:[conn=1] Requesting key exchange
DEBUG:asyncssh:[conn=1] Received key exchange request
DEBUG:asyncssh:[conn=1] Beginning key exchange
DEBUG:asyncssh:[conn=1] Completed key exchange
INFO:asyncssh:[conn=1] Beginning auth for user USERFORROUTER
DEBUG:asyncssh:[conn=1] Trying public key auth with ssh-rsa key
DEBUG:asyncssh:[conn=1] Signing request with ssh-rsa key
INFO:asyncssh:[conn=0, chan=0] Aborting channel
INFO:asyncssh:[conn=1] Connection lost
INFO:asyncssh:[conn=0] Closing connection
INFO:asyncssh:[conn=0, chan=0] Closing channel

With publickey auth, this works as you'd hope:

DEBUG:asyncssh:[conn=0] Requesting key exchange
DEBUG:asyncssh:[conn=0] Received key exchange request
DEBUG:asyncssh:[conn=0] Beginning key exchange
DEBUG:asyncssh:[conn=0] Completed key exchange
INFO:asyncssh:[conn=0] Beginning auth for user USERFORJUMPHOST
DEBUG:asyncssh:[conn=0] Trying public key auth with rsa-sha2-256 key
DEBUG:asyncssh:[conn=0] Signing request with rsa-sha2-256 key
INFO:asyncssh:[conn=0] Auth for user USERFORJUMPHOST succeeded
INFO:asyncssh:[conn=0] Opening SSH connection to TARGETROUTER, port 22 via JUMPHOST
INFO:asyncssh:[conn=0] Opening direct TCP connection to TARGETROUTER, port 22
INFO:asyncssh:[conn=0]   Client address: dynamic port
DEBUG:asyncssh:[conn=0, chan=0] Set write buffer limits: low-water=16384, high-water=65536
DEBUG:asyncssh:[conn=0] Received unknown global request: hostkeys-00@openssh.com
INFO:asyncssh:[conn=1] Connection to TARGETROUTER, port 22 succeeded
INFO:asyncssh:[conn=1]   Local address: 172.31.254.76, port 57292
DEBUG:asyncssh:[conn=1] Requesting key exchange
DEBUG:asyncssh:[conn=1] Received key exchange request
DEBUG:asyncssh:[conn=1] Beginning key exchange
DEBUG:asyncssh:[conn=1] Completed key exchange
INFO:asyncssh:[conn=1] Beginning auth for user USERFORROUTER
DEBUG:asyncssh:[conn=1] Trying public key auth with ssh-rsa key
DEBUG:asyncssh:[conn=1] Signing request with ssh-rsa key
INFO:asyncssh:[conn=1] Auth for user USERFORROUTER succeeded
DEBUG:scrapli.transport-TARGETROUTER:Authenticated to host TARGETROUTER with public key auth

I've got this working by adding the following in connection.py around line ~2525 to just skip straight to password auth, since that's what I want to use in this scenario:

if self._host == "TARGETROUTER": options.preferred_auth = ("password",)

I'm generally using asyncssh with scrapli, but this issue occurs w/out scrapli as well. Here is a simple script example:

import asyncio
import logging

import asyncssh

logging.basicConfig(filename="asyncssh.log", level=logging.DEBUG)
logger = logging.getLogger("asyncssh")
asyncssh.set_debug_level(1)

# ssh config file is sorting out key for the jumphost
jumphost_args = {
    "host": "JUMPHOST",
    "username": "USERFORJUMPHOST",
    "known_hosts": None,
    "agent_path": None,
}
target_router_args = {
    "host": "TARGETROUTER",
    "username": "USERFORROUTER",
    "password": "PASSWORDFORROUTER",
    "known_hosts": None,
    "agent_path": None,
}

async def main():
    async with asyncssh.connect(**jumphost_args) as jumphost:
        async with jumphost.connect_ssh(**target_router_args) as target_router:
            stdin, stdout, stderr = await target_router.open_session(
                term_type="xterm", encoding=None
            )
            stdin.write(b"show run | i exec-timeout\n")
            result = await stdout.read(65535)
            print(result)

if __name__ == "__main__":
    asyncio.run(main())

And here is a level 2 log output from it failing. Not sure what I need to hide in level 3 so this'll have to do for now.. but yeah basically just seems to decide that public key auth is no good and it should just close the connection.

INFO:asyncssh:[conn=1] Beginning auth for user USERFORROUTER
DEBUG:asyncssh:[conn=0, chan=0] Sending 36 data bytes
DEBUG:asyncssh:[conn=0, chan=0] Sending 68 data bytes
DEBUG:asyncssh:[conn=0, chan=0] Received 84 data bytes
DEBUG:asyncssh:[conn=1] Remaining auth methods: password,publickey,keyboard-interactive
DEBUG:asyncssh:[conn=1] Preferred auth methods: gssapi-keyex,gssapi-with-mic,hostbased,publickey,keyboard-interactive,password
DEBUG:asyncssh:[conn=1] Trying public key auth with ssh-rsa key
DEBUG:asyncssh:[conn=0, chan=0] Sending 36 data bytes
DEBUG:asyncssh:[conn=0, chan=0] Sending 628 data bytes
DEBUG:asyncssh:[conn=0, chan=0] Received 580 data bytes
DEBUG:asyncssh:[conn=1] Signing request with ssh-rsa key
DEBUG:asyncssh:[conn=0, chan=0] Sending 36 data bytes
DEBUG:asyncssh:[conn=0, chan=0] Sending 1156 data bytes
DEBUG:asyncssh:[conn=0, chan=0] Received EOF
INFO:asyncssh:[conn=0, chan=0] Aborting channel
INFO:asyncssh:[conn=1] Connection lost
INFO:asyncssh:[conn=1] Aborting connection
INFO:asyncssh:[conn=0] Closing connection
INFO:asyncssh:[conn=0, chan=0] Closing channel
INFO:asyncssh:[conn=0] Sending disconnect: Disconnected by application (11)
INFO:asyncssh:[conn=0] Connection closed
INFO:asyncssh:[conn=0, chan=0] Closing channel due to connection close
INFO:asyncssh:[conn=0, chan=0] Channel closed

Last thing to note -- connecting from the jumphost to the router directly works (no modifications necessary, works w/ password or key).

This is all running on my Mac locally with Python 3.8.5 and asyncssh 2.3.0. The jumphost is Ubuntu 16.04, Python 3.6.11 and asyncssh 2.3.0 if that matters at all, and like I said the target router is Cisco IOSXR 7.0.1.

I can do further testing/troubleshooting/log gathering this weekend if there is anything else you'd like to see!

Phew... hopefully that is all coherent :D!

PS - the ssh config file support added recently is quite nice :D!

PPS - I did a bit of searching through issues but didn't see anything definitive: is there an obvious way to handle 2FA -- or better yet, to re-use existing control persist sessions to alleviate that need? For testing some things I've just set my user to bypass 2FA on my lab host (easy because lab, so no big deal), but it may be a bit of a challenge going forward with some stuff I've been working on.

PPPS - seems like the same result as #299 in that we get the connection close but no real reasoning why...

Carl

ronf commented 4 years ago

Hi Carl,

I'm a bit confused by your first two logs. Both of them appear to be public key auth, and one succeeds and the other fails. What was the difference in your client configuration in these two cases? You mention password auth, but I don't see that being attempted in either log. Were you expecting the first one to fall back to password auth after public key auth failed? If so, you might be running into a problem with the number of auth attempts the target host allows. Even with OpenSSH, I often see auth failures on clients which have a lot of public keys loaded, as all the auth attempts get used up on those and it never gets a chance to fall back to other auth methods like password. Setting preferred auth can help with that.

There's no need to hack the code to set preferred auth, if you're running 2.3.0. That now supports a preferred_auth argument to connect(), or you can set it in the config file using the PreferredAuthentications option.
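As a rough sketch of that workaround, the target connection from the script earlier in the thread could be limited to password auth as shown here. The host names and credentials are the placeholders already used above, `password_only` and `connect_via_jumphost` are hypothetical helper names, and `asyncssh` is imported inside the coroutine only so that the small helper can be tried without the package installed.

```python
def password_only(args):
    """Return a copy of the connection args that only attempts password auth."""
    # preferred_auth accepts a comma-separated string or a list of method names.
    return {**args, "preferred_auth": "password"}


async def connect_via_jumphost(jumphost_args, target_args):
    # Imported here so password_only() can be exercised without asyncssh installed.
    import asyncssh

    async with asyncssh.connect(**jumphost_args) as jumphost:
        async with jumphost.connect_ssh(**password_only(target_args)) as target:
            # Authentication has completed once this block is entered;
            # use `target` here, e.g. open_session() as in the original script.
            pass
```

With the dictionaries from the original script this would be driven by `asyncio.run(connect_via_jumphost(jumphost_args, target_router_args))`. The config-file equivalent should be a `PreferredAuthentications password` line under the relevant `Host` entry.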

It's interesting that you see a different result going through the jump host vs. not. Are you attempting to use the same keys in both cases on the initial public key attempt?

Glad you like the config file support! It was one of the bigger pieces of work, but I do like how it came out, and how easy it is now to add new config options corresponding to any new connection options I add. It's literally just a few lines of code.

Unfortunately, control persist relies on mechanisms to pass file descriptors between processes using UNIX domain sockets, not to mention some undocumented OpenSSH internals. I don't see being able to add support for the OpenSSH version of that. However, AsyncSSH natively allows very good connection sharing within a single event loop, giving you more or less the same result if you open all the connections from a single invocation of Python. You could probably also build something that left around a Python process with open AsyncSSH connections that listened for requests from other processes to open new sessions on those connections. That's essentially what OpenSSH is doing, but it's also passing in things like stdin/stdout from each of the new processes, which may not make sense in the AsyncSSH case since its sessions are typically not being driven by user input.
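To illustrate the connection sharing described above, here is a minimal sketch that runs several commands as separate sessions over one already-authenticated connection. It assumes `conn` is an AsyncSSH client connection and that the server accepts exec requests and multiple sessions per connection, which not every network device does. `run_all` is a hypothetical helper name.

```python
import asyncio


async def run_all(conn, commands):
    """Run several commands concurrently as separate sessions on one connection."""
    # Each run() call opens its own session channel over the same SSH
    # connection, so authentication (including any 2FA step) happens only once.
    results = await asyncio.gather(*(conn.run(cmd) for cmd in commands))
    return {cmd: result.stdout for cmd, result in zip(commands, results)}
```

Because everything shares one event loop and one connection, this gives roughly the effect of a ControlPersist master within a single Python process.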

Regarding 2FA, AsyncSSH should be able to handle "partial success" during auth, where it challenges you multiple times (like both public key and password). Also, some forms of 2FA rely on keyboard-interactive authentication to challenge the user, and that's supported by AsyncSSH as well, though you'd need to provide callbacks to properly respond to the series of challenges.
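A minimal sketch of such callbacks follows, assuming the server sends keyboard-interactive prompts whose text contains recognisable words such as "password" or "verification code". Those prompt strings are assumptions, and `answer_prompts`, `make_client_factory` and `ChallengeClient` are hypothetical names. `kbdint_auth_requested` and `kbdint_challenge_received` are the `asyncssh.SSHClient` callbacks for this auth method.

```python
def answer_prompts(prompts, answers):
    """Map each (prompt, echo) pair to a response by case-insensitive substring match."""
    responses = []
    for prompt, _echo in prompts:
        for needle, value in answers.items():
            if needle.lower() in prompt.lower():
                responses.append(value)
                break
        else:
            # Unknown prompt: send an empty response rather than guess.
            responses.append("")
    return responses


def make_client_factory(answers):
    # Imported here so answer_prompts() can be tried without asyncssh installed.
    import asyncssh

    class ChallengeClient(asyncssh.SSHClient):
        def kbdint_auth_requested(self):
            # Empty string: let the server pick the keyboard-interactive submethod.
            return ""

        def kbdint_challenge_received(self, name, instructions, lang, prompts):
            return answer_prompts(prompts, answers)

    return ChallengeClient
```

The factory would then be passed as `client_factory=make_client_factory({...})` to `asyncssh.connect()`. A real deployment would likely fetch the one-time code at challenge time rather than from a fixed dictionary.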

Regarding #299, it's possible these are similar, but all we can really say they have in common is that the server is choosing to close the connection before auth completes. We don't know if the reason for that is the same or not.

carlmontanari commented 4 years ago

Sorry -- I was in the weeds with troubleshooting when I posted and probably was not clear :) !

The first two debugs look like they don't try password auth -- because they don't (or at least I don't see it happening)! That's the problem, I believe. In other words, in the first debug log above I have supplied user/pass and hope that will work (however, it fails after public key auth fails); in the second example authentication works as desired because I have supplied user/publickey (and it does work).

By "bypassing" the authentication list to only supply "password" as a valid auth mechanism the connection works. Running the same script (sans the proxy of course) from the jumphost requires no modifications.

The "hack" for the preferred auth is/was because I have not supplied any preferred auth list, so it is just whatever the device suggests (which is public key + keyboard interactive + password), but the connection fails after the public key. By removing public key from the list of preferred auth options manually, it (authentication) works because it tries the password I provided and succeeds.

Hopefully that is more clear. If not I'm happy to retry things and try to collect my thoughts again and rephrase! And to answer the question about public keys on the jumphost vs "through" the jumphost -- yep, everything is the same -- same keys on the host as on my local machine, same key being pointed to in config file, etc..

Super interesting re the control persist, I think that all makes sense but some of it is a bit over my head!

2FA stuff is intriguing -- will have to play around this weekend! And totally understand regarding the other issue, just felt like maybe similar situation!

Thanks for all the help!

Carl

ronf commented 4 years ago

> The "hack" for the preferred auth is/was because I have not supplied any preferred auth list, so it is just whatever the device suggests (which is public key + keyboard interactive + password), but the connection fails after the public key. By removing public key from the list of preferred auth options manually, it (authentication) works because it tries the password I provided and succeeds.

Understood. I was just suggesting that you could work around this problem by passing in preferred_auth='password' if you know you want to use only password auth for a given request alongside passing in username/password, with no changes needed in AsyncSSH itself.

It's definitely curious that it would fail after trying public key auth only in the jumphost case and not in the direct case. That doesn't make a whole lot of sense. If you manually log into the jump host and run a non-proxied connection to the device with the same arguments, does it succeed or fail in that case?

carlmontanari commented 4 years ago

Ah yep makes sense about passing in preferred_auth! Was partly from laziness/partly that I haven't "exposed" that via scrapli but yeah that totally makes sense and can fix that up in my script.

Correct -- same script from the jump host (obviously non-proxied) succeeds (no modifications) -- it works with password or publickey auth from the jump host.

ronf commented 4 years ago

I guess the next step might be to grab a level 3 log for both the working case from the jump-host going directly and for the broken case where you are proxying through the jumphost from another system using the exact same options and credentials. As long as you don't include any passwords in the options, the level 3 logs should be safe if you are comfortable with the public keys showing up there. The private keys NEVER show up on the wire, so you don't need to worry about that.

If you can set up additional keys trusted on the device before running the test, you could even create a throw-away keypair just for this that you remove afterward, to be extra safe.

If you are using RSA keys (which it looks like you are), you might want to try ECDSA or Ed25519 to see if that makes any difference. RSA keys have the extra complexity of multiple signature algorithms which can sometimes lead to problems you don't see with other key types.
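For the throw-away key idea, a minimal sketch could generate an Ed25519 pair with AsyncSSH itself. The file name, the comment string, and the helper names `key_paths` and `make_throwaway_key` are hypothetical. The public half would still need to be installed on the device by whatever means it supports, and removed again after the test.

```python
import os
from pathlib import Path


def key_paths(directory, name="throwaway_ed25519"):
    """Return (private, public) file paths for a key pair stored in `directory`."""
    private = Path(directory) / name
    return private, private.with_name(name + ".pub")


def make_throwaway_key(directory):
    # Imported here so key_paths() can be tried without asyncssh installed.
    import asyncssh

    private, public = key_paths(directory)
    key = asyncssh.generate_private_key("ssh-ed25519", comment="debug-only")
    key.write_private_key(str(private))
    key.write_public_key(str(public))
    os.chmod(private, 0o600)  # keep the private half readable by the owner only
    return private, public
```

Passing `client_keys=[str(private)]` in the connection options should then make AsyncSSH offer only that key, which also keeps the level 3 logs free of any long-lived public keys.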

It would also be good to see what the server advertises as supported signature algorithms, if it supports that extension.

Of course, none of this should be any different when going through the jumphost vs. not, but clearly we're hitting some kind of edge case.

carlmontanari commented 4 years ago

Copy that. Logs are no problem and I should be able to try a non RSA key easy enough I imagine. I'll get this tested and report back with results this weekend! Thanks a ton Ron!

Carl

carlmontanari commented 4 years ago

Ugh, sorry for the confusion! It seems I was mistaken before -- attempting to authenticate from the jump host with password DOES indeed fail.

So net/net is that password auth to the device is failing -- and it appears that this is a Cisco IOSXR specific thing because connecting to a Juniper box in the same fashion works. Setting the preferred_auth does "fix" this, but I was still trying to find out why...

Digging a bit more, I found logs on the IOSXR device and they basically mirrored what we see in the asyncssh logs -- tries public key auth then stops. It's still not making a lot of sense to me, because the Juniper connection clearly works... but it did get me looking at my ssh config file where I have a key/user/identities only configured...

I figured the IdentitiesOnly argument was probably a bad idea given that I was trying to use user/password (still doesn't make sense why Junos works...), so I commented that out for the relevant entry, but things still failed in the same fashion. Interestingly, commenting out the IdentityFile argument did work (even if IdentitiesOnly was left in the config!). With that argument commented out/removed, the connection works directly from the jump host and from my Mac via proxy as desired.

Why it works for Junos but not with the Cisco box I have absolutely no idea -- the section of the config file that I was editing covers both the hosts (and in the jump host case there is only one section covering * hosts in my config file), so I would have thought that things would behave the same for both device types, but I guess not!

In any case, I think the mystery is solved and the combo of making a config file that behaves and/or passing in preferred_auth should sort me out so I'm happy to close this out.

If you are still curious, here are logs from my Mac (the proxy_jump one) and from the jump host itself (the jump host one of course!) -- this is "pre ssh config file changes":

proxy_jump_asyncssh_test.log jumphost_asyncssh_test.log

If you're interested, I can email you the logs from the password attempts if that's OK -- I know I got the username redacted but I'm guessing the password is floating around in there somewhere.

Thanks a bunch, have a great weekend!!

Carl

ronf commented 4 years ago

Hi Carl,

Glad you figured things out!

Right now, IdentitiesOnly is not implemented in the AsyncSSH config file support, so having that there would be ignored by AsyncSSH. It will honor IdentityFile, though, if you don't set client_keys explicitly in your connection options. If either of these is set, it will prevent the default keys in your ~/.ssh directory from being loaded. Could that have anything to do with what you're seeing? If you had a large number of keys, you might be exhausting all the auth attempts before you got a chance to fall back to password authentication.

Do you know if the Cisco device has a way to configure the maximum number of auth attempts (equivalent of OpenSSH's MaxAuthTries in sshd_config)? If you want to fall back from public key to password without setting preferred_auth to try password first, you'll need to make sure that's set large enough to allow that.

From what I can see in the two logs you posted, the auth succeeds on the very first key it tries in both logs. With the proxy log, there are two separate auth attempts as expected, and both succeed on the first public key tried.

carlmontanari commented 4 years ago

I only have the one key on the jump host for testing, so I don't think I have any issues with too many keys/attempts (and logs would seem to agree with that). So I'm going to chalk it up to the Cisco device being a bit "different" for now and deciding to quit at the first failed attempt. At the very least that's "good enough" for now!

ronf commented 4 years ago

Keep in mind that AsyncSSH is only running on the client machine, so private keys on the jump host don't come into play here at all. Both the auth to the jump host and the auth to the final target machine are being done by AsyncSSH code running on the client, so that's where the private keys will need to be for both authentications.

I could believe that the Cisco device might only allow one auth attempt, based on some of what we've seen here. So, if the public key being provided by the client is not allowed, the Cisco device could just immediately fail before allowing password auth to be attempted. I don't know if that's configurable or not, though.