Open bitprophet opened 10 years ago
Another good example of this hilarity is a recent test I added that attempts to prove a negative case of multiple key handling; when the test server is configured to only accept ECDSA, and only an RSA key is given to connect()
, it (on my Mac localhost) raises a PasswordRequiredException
(odd, but a known thing). So far so good, self.testRaises(PasswordRequiredException, ...)
, done. Right?
Except on Travis-CI's VMs, it raises SSHException
straight up, which fails the test. And on a random Ubuntu 13.10 VM I had lying around, the same test raises AuthenticationException
! This sort of system-sensitive bullshit is probably one reason why we get so many related looking but different-traceback reports :(
I started prep work for this while dealing with #243, insofar as I added a bunch more log output (DEBUG) to _auth
so at least it's easier to trace how a given auth process worked, what errors got raised where, etc.
I also started factoring out some common patterns (nearly every step has the same try
/except
, auth, check for two-factor, return-or-keep-going, process) though sadly the heavy reliance on variables "falling through" requires some real dumb argument passing & return value stuff. Should hopefully still be worthwhile when done.
However this is turning into a morass so I am punting for now. Left the work in a branch named for #243.
One approach to starting to debug it:
1) change saved_exceptions to an array, i.e.
saved_exceptions = []
...
saved_exceptions.append(e)
2) Do something like this so you get tracebacks for all exceptions.
3) replace "except Exception" with lists of explicit exception types (maybe by concatenating lists of common ones)
Any thoughts before I give this a go?
So something like this
import traceback
import sys
saved_exceptions = []
for c in "hello":
try:
int(c) # obviously fails
except ValueError:
saved_exceptions.append(sys.exc_info())
if saved_exceptions is not None:
raise Exception("\n\n".join(
''.join(traceback.format_exception(t, e, tb))
for t, e, tb in saved_exceptions) )
I'd use SSHException instead of Exception.
I face this issue yesterday and find a workaround solution with another script expect -c "set timeout -1; spawn -noecho ssh -o StrictHostKeyChecking=no $2 ${@:3} exit; expect assword:; send -- $1\r; interact;"; the parameter StrictHostKeyChecking=no is the key arg
hope author can combine this solution to paramiko new release
thx
This is causing a multitude of questions about "Protocol error: expected packet type 50, got 5" when trying to connect to many IOS-based Cisco devices. The "service-request" dance gets restarted for a second try and the remote end doesn't expect this. People suggest fixing this by setting "allow_agent=False, look_for_keys=False" but that's only a workaround for people who don't use ProxyCommand and/or intend to try authenticating with both key and password auth.
Seealso #519
Got payload (44 bytes, 8 padding) Read packet
, length 35 Authentication type (publickey) not permitted. Allowed methods: [u'keyboard-interactive', u'password'] Write packet , length 17 OUT: 00 00 00 1C 0A 05 00 00 00 0C 73 73 68 2D 75 73 ..........ssh-us OUT: 65 72 61 75 74 68 40 D1 E5 F9 35 E4 9E 2B E2 0E erauth@...5..+.. IN: 00 00 00 4C 10 01 00 00 00 02 00 00 00 2E 50 72 ...L..........Pr IN: 6F 74 6F 63 6F 6C 20 65 72 72 6F 72 3A 20 65 78 otocol error: ex IN: 70 65 63 74 65 64 20 70 61 63 6B 65 74 20 74 79 pected packet ty IN: 70 65 20 35 30 2C 20 67 6F 74 20 35 00 00 00 00 pe 50, got 5.... IN: 97 D3 87 1C 13 02 C6 E2 4D E1 4F 5F 67 B0 12 3C ........M.O_g..< Got payload (76 bytes, 16 padding) Read packet , length 59
Briefly poking this now since it's getting in the way of debugging other (new feature related, #827) failures. Not going to commit anything to master but just feeling out the space as I slap something together to help me troubleshoot.
Overall exploration:
SSHException
which is complete horseshit and something that will be fixed at the same time, via #258):
AuthenticationException
) meaning a key was successfully instantiated, but was rejected by the server. This is much higher-signal than the others but is still not always necessarily the "right" error to surface.AuthenticationException
...)EOFError
exceptions) which occur when the channel gets unexpectedly closed before all sources have been tried.SSHException
subclass and then we can opt to release it in a 2.x backwards compat fashion after all, if the new behavior seems stable enough.aggregate_exception['agent'][fingerprint] = <exception>
or a_e['key_filename'][filename] = <exception>
)BadKeyMaterial(klass=RSAKey, filename='/path/to/rsa.key')
, then store all such exceptions in a flat list (ordered by when they were tried, of course).BadKeyMaterial(type=RSAKey)
vs BadRSAKey
) is probably not super important?I put together some work leading toward making a cleaner pass through the authentication morass, a bit more in line with the intent of the spec. Initially was going to put in a bunch of extra effort to preserve backward compatibility, but if this can be a 3.0 thing, it may help a ton to scrap a lot of the bizarre-O logic that is in place to make things work currently.
The primary hurdle that seems to be in place, is that the (client) authentication is driven by a user thread sending messages, but has to coordinate with the transport thread fielding responses and indirectly processing work via the AuthHandler callbacks. I patched a bunch of these callbacks to replace the inline work with a Queue.put()
of the message and type, making the authenticate thread solely responsible for dealing with the authentication flow. If you can drop backward compatibility, then all message types in [50..69] from the transport thread should probably be just put on the AuthHandler queue, and drop the callback dict altogether.
See: https://github.com/radssh/paramiko/commit/3757bac2ee368d91e559787dd9b49701c8be717f I did not add too much polish, but should serve as an example, without a full retrofit that may not be needed (or desired).
The secondary redesign is to make a single authenticate_client()
call that is fed various iterators that supply the usable passwords/keys, and preferred authentication methods. gssapi_keyex and gssapi-with-mic methods seem to not have any user-supplied parameters (nor would host-based, if that's on a future enhancement list), so it would seem that these would not need iterators, just inclusion in the preferred methods list/tuple. Password(s) could be included and attempted as passwords or keyboard-interactive responses, and PKey(s) would seem to be able to be offered, whether originating from local files, ssh-agent, or PKCS11 sources, or a combination .
The implementation also retains lists of the attempted authentications to provide tracking of what steps were attempted, and which failed and which failed with partial-success (or the final element in the list completed the authentication successfully, in the case of multi-factor). This approach seems to eliminate the awkwardness of the saved_exception construct from the old authentication, allowing an arbitrary length trail of the events that were triggered along the way.
It would seem that there would not likely be a need to deal with various AuthenticationException occurrences, other than server response Timeout - the transport either reaches a successful authenticated state, or failure will allow the list(s) to be used as post-mortem. It would also be helpful to permit followup calls to authenticate_client()
and avoid re-send of the request_auth message, since some servers (Cisco) barf and drop connection when the client attempts to start over from the very top. Ideally the authentication dialog can keep going until the server gives up (too many auth attempts, or empty supported methods list), or the client runs out of things to try.
I have some thoughts here, but I don't use any of the more complex methods.
I think inside the "aggregate auth exception" should be a list of exceptions, not a hierarchy. I'd guess that typically most of these sub-exceptions will be "could not decrypt keyfile" or "auth key was rejected". I agree the exception should have a property of which file path it is related to, or possibly type and fingerprint if it is from the agent.
It might be useful to think of the explicit and implicit auth methods separately. There are already options allow_agent
and look_for_keys
that you can set to False to disable the implicit methods. I think that in many cases where one of the explicit methods has been enabled/passed, the implicit methods were not intended to be used. The excess failures can be confusing, and can also cause ssh servers to close the connection. I guess tools using paramiko could already make it easier/default to disable these though.
The password is the most tricky case. If password is specified, it's used to decrypt specified keys, decrypt discovered keys, for password auth, for second factor I think ... and then, if everything fails, interactive auth asks for another password. Maybe auth_key_password
, or interactive
, should be separate parameters. I really don't have any good/strong ideas though.
@radssh Nice, thanks! (I notice that's a month or so old, was that posted for some other ticket?) Good possible jumping off spot; I specifically like the use of a queue & the explicit authentication-types ordering (& then relying on that for 'continue-with' second factors).
Re: whether this level of rework (or less, or more) is appropriate - depends a lot on whether I want to try getting some variant out as e.g. 2.4. I think I can get out something backwards compatible, in which case we'd be open to do a larger architectural rework in 3.0...we'll see.
Re: @ploxiln:
should be a list of exceptions, not a hierarchy
Yea, afterwards I came to the same conclusion, the order-of-attempt needs to be preserved regardless. I might still like a "both" option where one can very quickly see in a repr() that e.g. 4 agent keys were tried, or that 1 key_filenames key was tried and it seemed to load OK as an RSA key; but where the actual order is still available somewhere.
The 'rich exceptions' approach still seems best, esp since you can get the above value by viewing the whole list and/or doing things like [x for x in exceptions if isinstance(x, BadKeyMaterial)]
.
I think that in many cases where one of the explicit methods has been enabled/passed, the implicit methods were not intended to be used.
Absolutely true. I'm guessing one reason for the current hodgepodge is that it's, sadly, how OpenSSH behaves by default - you have to manually set ssh_config values somehow, or use other tricks like temporarily unsetting SSH_AUTH_SOCK
, if you want them disabled - even if you're e.g. doing ssh -i <explicit key>
.
So going by our normal "OpenSSH is king" approach we'd perhaps never want to fully disable the implicitness; but we should do everything we can do make it easy to "do the right thing", which offhand does mean an easy method of turning off all implicit stuff, a way of configuring that permanently, and clear/visible documentation that it's even a thing to be aware of.
(The configuration angle is harder for pure Paramiko since it only wants ssh_config
, but as you noted, libraries one layer up frequently have their own config options.)
Maybe auth_key_password, or interactive, should be separate parameters.
This has felt like a nasty kludge forever so yes, I'd love to explicitly separate them. This is one spot where a true change requires a 3.0 but we can at least add a 2nd, new kwarg in 2.x - I did the same thing for Fabric recently too (Fabric 1.x gained a sudo_password
option so one can disambiguate login from sudo password values.)
@bitprophet - This was not linked to any ticket/PR, I had a period of ambitiousness in trying to address the general flow of authentication. It stalled out prior to making the backward compatibility patchwork (and the GSSAPI authentication that I don't have a core understanding of how it is supposed to function, and no environment to test that aspect, making me really hesitant). Looking at the extra baggage to maintain the backward compatibility, I just shelved the work as-is. With the opportunity (and desire) to overhaul the API for a possible 3.0, that re-sparked the interest in at least sharing what I had as a possible path to follow. Retrofitting the backward compatible pieces would not be a major addition, except for possibly the 2FA shenanigans - it also would defer the jettisoning of lots of old code that is just yearning for the scrap heap.
If you want to wrench this into a 2.4 release, I will see if I can fold the GSSAPI steps in, and patch the backward compatibility loose ends, and put a bow on a PR, then sharpen the machete to chop away a bunch of stuff for 3.0.
In my head the 2.4 stuff is actually even simpler than what you've got there, it's really just replacing all the saved_exception = e
with an aggregate exception object (new SSHException subclass), then raising that at the end. This would allow users to gain a lot more insight about failure modes, with a minimum of instability.
(There are a couple minor wrinkles as outlined above, re: actual format of that object; and I also think it may need to come in two flavors so any AuthenticationException members cause the aggregate object to also be an AuthenticationException in case someone is explicitly checking for that subclass; but that all still feels pretty tightly knit to me.)
I'll probably execute on this soon since I need it for Fabric 2 (lest Fabric 2 repeat too many of Fabric 1's sins regarding how it handles passwords & key decryption.)
When (paramiko) 3.0 rolls around we'd look at your patch & figure out whether that's the right approach or if we want something even bigger.
Backwards-compatibly fixing one aspect of this right now: adding a passphrase=
kwarg to SSHClient.connect
so that Fabric 2 can safely offer correctly split-up CLI options, config params, etc, for password vs passphrase stuff.
Users will still be able to cram either/or into just the password
field, but at least now the option's there to give both (and then the auth fixes I'm about to execute on, will be able to provide that much clearer a signal if the user provided passphrase
explicitly.)
Then in Paramiko 3.0 we'll be able to stop reusing password
altogether. (And that's if we don't come up with a somewhat better interface for both types of values...)
High level code-org thoughts as I look at adding stub tests re: new changes:
test_auth.py
and test_client.py
, with the former tickling some (but not all...) Transport.auth_*
methods and the latter is of course only doing things via SSHClient.connect
.
test_auth.py
:test_client.py
:test_auth.py
into test_transport.py
tho?Have some DDD-level thoughts about this in my local repo and about to take a stab at implementing it:
SSHClient._auth
approach for now, with the assumption that the low level guts in Transport
and AuthHandler
don't themselves require fixing in order to arrive at a cleaner user-facing flow.SSHClient
(and which could be used standalone if desired) for its auth step.Authenticator
, would have two APIs:
AuthSource
(or w/e) subclass instances, explicitly to be tried in that order. E.g. Authenticator(transport).authenticate(PrivateKey('foo.key'), AuthAgent(...), PasswordAuth('secret'))
to try a private key first, then to try keys from an agent, then finally password authSSHClient.connect
, which is actually just a bridge to the other API, implemented via a mostly-internal method - which gets its own unit tests proving that e.g. Authenticator(transport).auth_with_kwargs(password='secret', key_filename=['foo.key'], allow_agent=True)
just results in the posargs shown in the previous snippet.SSHClient._auth
, allowing the newer logic (& semi-new exception behavior) to be invoked via a familiar interface.I finally looked over @radssh's branch about this in more detail (and to see how much of it I was possibly reinventing).
AuthHandler
itself, which is an interesting choice. I assume in part for backwards compatibility?Transport
's individual auth methods are more or less correct as-is -- am I overlooking something important?authenticate_client
is something I'll end up wishing I had access to at the high level.The good thing is, I think the two approaches aren't mutually exclusive, and so I'm going to take a stab at the high level first, with option to modify AuthHandler
later with some of these other ideas if they seem necessary.
Started with the code in AuthHandler
, primarily because I don't make use of SSHClient
, which apparently puts me in the minority. Whatever the long term resolution winds up being, it seems natural to have it reside in AuthHandler
proper, and SSHClient
(or Transport
) adapt accordingly. It was interesting to find in the code history, the original evolution of making AuthenticatedTransport
a subclass of Transport
, before switching over to the dedicated AuthHandler
class.
Haven't had the opportunity to code more down that path, even with the liberty to scrap the backward compatibiilty, but I have been solidifying some more ideas that might be worthwhile...
AuthHandler
constructor. Server (OpenSSH) implementation does not let you change username during the course of authentication, so it doesn't seem to make much sense allowing a change, or for having to repeat it for each authentication flavor submethod.AuthHandler
keeps internal lists (initally empty) of password(s) and key(s) for authentication attempts. Possibly a (empty) dict for keyboard-interactive {prompt: reply, ...}, and some flag/settings for GSS and host-based authentication. An initial sequence of (auth-none) of preferred authentication types.AuthHandler.add_password()
method to supply a password, that appends to the internal password list, and if absent, append 'password' to the preferred authentication.AuthHandler.add_PKey()
to supply a PKey to try, with similar bookkeeping as add_password()
. Require PKey parameter, and not filename to avoid adding complexity.auth_gssapi_with_mic
with its option set, auth_gssapi_keyex
(no options?) and possibly hostbased
(currently not available in Paramiko), that capture the specifics of what parameters may be needed at the time the authentication is offered. Call to these methods would include the side effect of appending to the preferred (available) authentication types.authenticate_client()
, making it a pull-the-trigger option with any/all of the bullets loaded from the AuthHandler.add_*
calls made, since all of the parameters currently listed would be preloaded into the AuthHandler
object. The internal flow should be similar, and properly support the matching of the server side "authentications that can continue" versus the client "preferred/available authentication methods", and correctly handle multifactor authentication, as long as the set of factors were loaded from previous add_* calls, and not dependent on ordering.SSHClient
class, or change the SSHClient.connect()
call to take a single AuthHandler
object instead of the dozen or so parameters that drive the authentication choices today.Wish I had the opportunity to code more of this to completion, as I suspect this description might not be grokked as well as tangible code examples. Trying to find that sweet spot in between OK and TL;DR :)
@radssh Thanks for the additional thoughts!
Being hip deep in the tests (& related support tasks like code cleanup) for all this is making it even more clear there's a lot of responsibility bleed between SSHClient, Transport and AuthHandler. Including but not limited to:
Transport
is not terrifically useful without AuthHandler
, and vice versa, which is always a bad sign...SSHClient.auth
...except for a bunch of random bits & bobs in the other two classes, like how Transport.auth_password
quietly falls-back to keyboard-interactive
when password-type auth isn't supported by the server.connect
method on both SSHClient
and Transport
, the latter of which I wasn't even aware of until recently, and which only implements about 20% of the stuff covered by the former.I think there's a reckoning coming where I upend a lot of these guts and try to figure out if there's a cleaner way to arrange the lines of responsibility. Not happening today (talk about scope creep) but...
@bitprophet - I am revisiting this (now that I reacquired both some spare time AND motivation), and have some progress made on a significantly cleaner fresh pass. Still seems like keeping backward compatibility is way more trouble than its worth. Saw on IRC that you were starting the recoding in earnest, and wanted to again offer cooperative time & effort.
Dropping a note here, as I accidentally discovered what looks like an OpenSSH server bug doing some test case scenarios. Burned a ton of time debugging my own code, then found I could reproduce using OpenSSH command line client...
Following a failed password attempt (with a non-blank incorrect password) with keyboard-interactive (with correct password). On Ubuntu 16.04 (OpenSSH 7.2), the authentication succeeds, then the server immediately drops connection; On OSX 10.13.4 (OpenSSH 7.6), the keyboard-interactive request is sent, but the server never responds with a challenge, and then connection drops after socket timeout.
ssh -v 192.168.1.191 -o NumberOfPasswordPrompts=1 -o PubkeyAuthentication=no -o PreferredAuthentications=password,keyboard-interactive
OSX:
...
debug1: SSH2_MSG_SERVICE_ACCEPT received
debug1: Authentications that can continue: publickey,password,keyboard-interactive
debug1: Next authentication method: password
user@192.168.1.191's password:
debug1: Authentications that can continue: publickey,password,keyboard-interactive
debug1: Next authentication method: keyboard-interactive
<<< 60 second delay >>>
Connection closed by 192.168.1.191 port 22
Ubuntu 16.04:
...
debug1: SSH2_MSG_SERVICE_ACCEPT received
Ubuntu 16.04.4 LTS
debug1: Authentications that can continue: publickey,password,keyboard-interactive,hostbased
debug1: Next authentication method: password
user@ubuntu's password:
debug1: Authentications that can continue: publickey,password,keyboard-interactive,hostbased
debug1: Next authentication method: keyboard-interactive
Password:
debug1: Authentication succeeded (keyboard-interactive).
Authenticated to ubuntu ([2600:1700:f230:3f60:d5ef:6fa0:5470:456f]:22).
debug1: channel 0: new [client-session]
debug1: Requesting no-more-sessions@openssh.com
debug1: Entering interactive session.
debug1: pledge: network
debug1: channel 0: free: client-session, nchannels 1
Connection to ubuntu closed by remote host.
Connection to ubuntu closed.
In both cases, entering an empty password on the first (password) attempt allows the subsequent keyboard-interactive attempt to succeed. Something apparently bleeds over and taints the keyboard-interactive handling if the initial password attempt has failed content...
Will be having a look at the OpenSSH code and bug reports.
Haven't time to glance at the above just yet (busy releasing a 7 year overdue rewrite of a related lib...!) but want to make it real clear that I do have a partly done branch for this. Let me know if you can't find it and I'll make sure it gets pushed soon. Had a bunch of nice tests and DDD written or stubbed out and "just" needed them fleshed out and implemented.
The core setup I have envisioned feels like the Right Way regardless of whether we can shim it into the current high level interface in a backwards compatible fashion, or not.
Note to self in case I don't get back to this for reals in the next few days (need to resist urge, but also, god it's so bad): one aspect of this change, if possible, should be that when all auth fails and we bubble up a newer hybrid/aggregate exception - if it contains a PasswordRequiredException
, that one should be what the aggregate exception "acts like" or otherwise exposes "first".
For example, the scenario I'm running into personally with Fabric 2 as as follows:
~/.ssh/id_rsa.old
in connect(key_filenames=[...])
.passphrase
or password
value (in part because my mental/default workflow for this assumes SSH Agent support)~/.ssh
(e.g. there is no ~/.ssh/id_rsa
etc) - otherwise those might get tried tooIn this scenario, Paramiko's current Transport._auth
ends up doing the following:
RSAKey
, DSSKey
, ECDSAKey
etc, in turn, on id_rsa.old
.RSAKey
yields PasswordRequiredException
(correctly)SSHException('Not a valid XXX key file')
_auth
, eventually trying the two SSH agent keys, each of which yields AuthenticationException('Authentication failed')
(because they're not valid for the target host)AuthenticationException
) to raise to the calling method.What should ideally happen in this scenario is that the logic identifies PasswordRequiredException
as the only actionable exception:
SSHException
usually means something lower level (or rather, we've slowly added more specific subclasses like PasswordRequiredException
and AuthenticationException
...) so not a good candidateAuthenticationException
is a stronger signal (it means we presumably got useful [encrypted-but-unlocked, or unencrypted] key data, which was then rejected by the server), but it's still not as good as PasswordRequiredException
, because we can't know for sure that the entire process would fail if that key had been unlocked.Thus we should raise some form of the password-specific exception to the user; then higher level client code, such as Fabric 2, would have the option of prompting the user (and, further, adding to the agent, if desired) and resubmitting connect
with a passphrase
option.
(In Fabric 1, this "works" because the code assumes any SSHException when using an empty password must be due to a missing passphrase, and prompts the user. This of course has its own pile of false positives in other, equally-likely scenarios.)
It's tempting to slap a band-aid on this and have the various spots that handle SSHException
in _auth
, immediately raise this exception type, however that would be incorrect for a few reasons:
PasswordRequiredException
early in the process (as above) but then to have, say, an SSH agent borne copy of it succeed. In other words, we can't actually know early on if even this strong a signal is truly fatal to the auth process.
Interestingly, I just ran across #1317 posted by @twirrim, where he seems to have a similar set of exceptions but implicitly asserts my above assumption is incorrect.
Specifically, he would have encountered some middle-of-flow AuthenticationExceptions (presumably, since he said he had a key that was rejected by server auth) but then the last exception, and thus the one raised to him, was for an encrypted (and not unlocked) key (thus PasswordRequiredException).
In that scenario, assuming PasswordRequiredException is the "highest signal" exception would have been wrong.
Really, this just proves that the higher level parts of this ticket are more important in all cases: make it much easier to specify exactly what auth methods to try or not try; and make it much clearer in failures what was tried and what the result of each attempt was.
FWIW, the emphasis on "how can we boil everything down to a single exception?" is primarily for backwards compatibility - though even if we went a feature flag or 3.0 route, there's simply no awesome way to square the problem of "tried more than one thing before returning control to the user". We'll always be guessing about which auth source the user really wanted.
Which brings us back to: make it easier for the user to say "I want auth source X and only X for this particular connection". We actually don't truly want the convenience of trying default keys, explicitly given keys and agent keys all the time every time.
Any news on this? Just wanted to try out fabric
and wasted a lot of time figuring out how to use a private key for authentication. It's really hard to grasp it right now :cry:
Any news on this? The most recent release of Paramilko (2.4.2) fails with SSHException's raised with an "invalid key" - boy oh boy would it help to have that message changed to WHERE and which Key impl was throwing it, giving some more specific clues. After a debug session, I've tracked our issue to a too-stringent check in Ed25519Key that seems to fail on older openssh keys after the initial parsing has succeeded. Actual os ssh clients do not fail and the keys work fine with ssh -i
but paramiko barfs and it's been a challenge figuring out a work-around.
This remains one of my highest-priority "really want/need to do it myself, can't just merge a patch" items. I've been quiet because of Life™ but should be back doing OSS things again ~1day/wk from here on. So hopefully I'll get to it once I clear out a bunch of waiting PRs.
I can maybe put some time towards this, if you need extra help. I haven't done a deep dive in to the code any time recently to have a specific idea on a route forwards here, but may be able to investigate.
The best thing I can use is just a sanity check of the API I end up constructing.
I need to reread all this & I have a branch open somewhere (doubtless ancient now), but I had a pretty firm API in mind a few years back that largely entailed an iterable "auth mechanisms to try in this order please" on the input side, and a matching aggregate exception that's raised if auth never succeeds, letting you see the individual results for the parts of that iterable.
So you might say connect(authentications=[PrivateKey(some_rsa, passphrase="butts"), Password("secret"), PrivateKey(some_new_thing), PKCS11(...)])
and then it either succeeds or raises eg AuthenticationException(results=[PrivateKeyFailure("required a passphrase"), PasswordFailure("rejected"), PrivateKeyFailure("no idea what this key type is sorry"), PKCS11Failure(...)])
. (Pseudocode, obvs.)
Question is how far did I get there and what does it look like when I get it "working" again in modern times (and do other invested users see obvious holes in it).
I got a good amount of work in progress on this topic, left off in the middle of a massive overhaul of the unit test cases. Existing auth methods are all implemented, including GSSAPI (finally got a test VM with FreeIPA to kick the tires on that mechanism), and was going to leverage an existing host-based auth PR to round out the suite of mechanisms.
I'll try to put a bit more polish on the current state of things to allow a cursory taste-test on this.
The best thing I can use is just a sanity check of the API I end up constructing.
Can't really say I had the scrutiny of a sanity check, but I was looking for the reason why, when connecting, OpenSSH client would ask me permission to use the private key material of only 1 key (the key that was actually accepted by the server) whilst Fabric would try all keys in the agent.
I ended up wanting to write a patch for whatever comes out of this: https://github.com/paramiko/paramiko/blob/auth-shakeup/tests/authentication.py#L166
OpenSSH first only offers public keys to the server, and only when it's accepted, tries to get the private key information. This means, in your config you can set IdentityFile to a public key, to hint it will should offer that one, and then get the corresponding private part from the agent.
I found the stubs readable and can follow idea of the lifecycle. And I can image where I would need to code the behaviour I miss. So for what it's worth. Good job on the API. I hope Life™ ends some time soon so this can get nudged forward. ;-)
i made #1647 to at least have some diagnostics when paramiko falls over on a key. before, we wouldn't even know which key was a problem:
paramiko.ssh_exception.PasswordRequiredException: Private key file is encrypted
now at least we'd see this:
paramiko.ssh_exception.PasswordRequiredException: Private key file '/home/anarcat/.ssh/id_rsa' is encrypted
Is there some kind of troubleshooting page for these kind of issues?
I have an SSH agent running, SSH_AUTH_SOCK points to it, my key is added to it (and unlocked), and paramiko still throws paramiko.ssh_exception.PasswordRequiredException: Private key file is encrypted
. I have no idea what more to try.
Is there some kind of troubleshooting page for these kind of issues?
I have an SSH agent running, SSH_AUTH_SOCK points to it, my key is added to it (and unlocked), and paramiko still throws
paramiko.ssh_exception.PasswordRequiredException: Private key file is encrypted
. I have no idea what more to try.
I ended up here and realized it was because paramiko isn't using the SSH config's specified username so while ssh some-host-name
works, paramiko isn't using the username from the SSH config, and instead is trying ssh defaultuser@some-host-name
instead of the (implicit) ssh correctuser@some-host-name
.
Digging into this at very long last. After reviewing the related code in both Paramiko and in Fabric (where it obviously comes up a whole lot), I'm left with the following:
I'll be spiking this out over the coming weeks with the intent of:
None
).
Hi Jeff! Really nice to see you working on this!
For me, #1647 was pretty darn useful to diagnose problem, and I still miss it in my day-to-day use of Fabric. I've gotten used to Fabric throwing me a "key is encrypted error" and interpreting this as "oh, I need to plug in my Yubikey" but that's kind of a terrible UX for newcomers...
But yeah, there's a whole other stack of things to improve on top of that... I figured you might like to look at this MR as a good, somewhat small change...
@anarcat Yea, that sort of added visibility is absolutely going to be part of the new exception type/s I'm envisioning.
Still figuring out the details, but at the absolute worst, a caller would be trivially able to get the list of key paths attempted (and how each failed), from introspecting the top level exception in a try/except block.
And offhand I assume the cleanest way to do it is to implement a variant on your PR, that raises more-detailed subclass exceptions which the top-level/aggregate one literally just sticks in a list. This way, lower-level users of Paramiko still benefit, and the higher-level code there or in Fabric doesn't have to do as much work either. 🤔
That does sound much better than the hack I did then. :) Feel free to close the PR...
@bitprophet -
I have been tinkering with the concept of wrapping the entire authentication process within a separate class, primarily driven by a SSHConfig
object, which should a) avoid a bunch of the **kwargs and b) keep things friendly for embedding configs in ~/.ssh/config-like style. Keeping a default IdentityFile
value like OpenSSH and support for IdentitiesOnly
gives the level of control that OpenSSH provides, with least surprise in behavior.
There are also some conditional handling things, like shifting to rsa-sha2-256/512 instead of ssh-rsa when client and server support/prefer/require. Again, the leveraging of SSHConfig helps a great deal with the client-side configuration for this. I did have to retain some session info in the Transport to pull this off, specifically capturing the server-sig-algs message that may be passed during the connection setup.
That being said, I've only focused on the client side of things for authentication (and key handling), the server side I have had the luxury of ignoring for my needs.
There are also some conditional handling things, like shifting to rsa-sha2-256/512 instead of ssh-rsa when client and server support/prefer/require
Weird, how recent is the branch you're working off for this? As of a few revisions ago we should be dealing with server-sig-algs mostly-correctly & in fact most users that run into SHA2 issues are the ones who have to disable it when their targets aren't correctly indicating the lack.
the server side I have had the luxury of ignoring for my needs.
Honestly, same for now. I do want the server to get more compatible sometime too, but it's always a vanishingly smaller userbase vs client so difficult to justify the time until it becomes a self-itch-scratch (eg if/when I need the test harness server to act more OpenSSH-like...).
I'm still kicking this around but I am continuing to think that doing this at the Fabric level makes the most sense. Fabric already autoloads ssh_config, for example, so it just needs additional tweaks to be more OpenSSH-compatible. Doing it there also makes ignoring the server side slightly more understandable as Fabric has no server component 😇
There are a number of issues stemming from what feels like a simply too-naïve implementation of key handling in
SSHClient
andPKey
.I am making a single ticket for this because most of the existing PRs poking at it are too limited in scope; this sort of change has a high chance for bugs and breaking backwards compatibility (intentionally or no) and I feel it needs a broadly considered update.
[EDIT: See this comment for an update on the intent/scope of this issue!]
The high level details:
SSHClient._auth
uses a multi-exit strategy combined with storing a single exception to raise at the end of the process. This frequently means the raised exception at auth time is flat out incorrect as to the true cause of the inability to authenticate.key_filename
values...and since it can't reliably know offhand which key types they are, it tries all known key types. In order. This will always generate/store errors unless one happens to have only one key and it's of the arbitrarily-first-in-the-list key type.PKey._read_private_key
- used frequently within the above, indirectly - is similarly messy, and many, many types of key loading/reading problems end up raising an error about howPrivate key file is encrypted
- on unencrypted files.password
kwarg is blank, that a password was required and not given.A number of tickets have sprung up on this, I'm sure I haven't gotten all of them, but they include #358 and #224.
Strongly related are a bevvy of tickets concerned with exception handling & exception types/subclasses, such as Fabric #85 and our own #351 and #293.