ronf / asyncssh

AsyncSSH is a Python package which provides an asynchronous client and server implementation of the SSHv2 protocol on top of the Python asyncio framework.
Eclipse Public License 2.0

GSSAPI limited to single user principal #682

Open zarganum opened 2 weeks ago

zarganum commented 2 weeks ago

GSSAPI can operate with a credentials cache. Each credential cache (ccache) contains credentials for a single principal only; it cannot hold credentials for different principals. With Kerberos, a password is exchanged for a Ticket Granting Ticket (TGT), the TGT is put into the ccache, and a Service Ticket (ST) is then added for each service - all of them are collected in the ccache.

With the popular Kerberos libraries (MIT and Heimdal), only one default ccache at a time can be selected in a process, using the environment variable KRB5CCNAME. This means the default ccache is global to the process.

Changing a global environment variable in an asyncio-driven process makes no sense because of potential interference between coroutines. This limits asyncssh to a single principal per process at a time while a client connection is being opened.
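
To make the interference concrete (paths are made up):

import os

# KRB5CCNAME is process-global: the second assignment silently overrides the
# first, so two coroutines authenticating as different principals would
# interfere with each other.
os.environ['KRB5CCNAME'] = 'FILE:/tmp/krb5cc_alice'   # set for "coroutine A"
os.environ['KRB5CCNAME'] = 'FILE:/tmp/krb5cc_bob'     # clobbered by "coroutine B"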

The suggested solution (on UNIX) is to pass the store parameter when creating the Credentials object. It accepts several options; the one of particular interest is a dict of the form {'ccache': 'TYPE:value'}, where TYPE:value is a Kerberos ccache identifier.

Example:

creds = gssapi.Credentials(usage="initiate", store={"ccache": "MEMORY:username"})

or

creds = gssapi.Credentials(usage="initiate", store={"ccache": "FILE:/tmp/username"})

Possible approach:

  1. Add an optional store parameter to the GSSClient constructor, save it on the instance, and use it each time a new gssapi.Credentials object is created (see the sketch after this list).
  2. Extend the options object with a gss_store parameter in addition to the existing gss_auth, gss_delegate_creds, and gss_kex. This has no corresponding OpenSSH option but can be passed programmatically to the connection constructor.
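
A rough sketch of approach 1 (the GSSClient here is simplified, not the actual asyncssh class):

import gssapi

class GSSClient:
    """Simplified illustration only - not the real asyncssh GSSClient."""

    def __init__(self, host, store=None):
        self._host = host
        self._store = store              # e.g. {'ccache': 'MEMORY:username'}

    def _acquire_creds(self):
        # Use the per-instance store instead of the process-global default
        # ccache selected via KRB5CCNAME
        return gssapi.Credentials(usage='initiate', store=self._store)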
ronf commented 2 weeks ago

Thanks for the feature suggestion, @zarganum, and for the detailed description of how it could be done!

I experimented with this this afternoon, and I have something working. At first, I had some trouble getting it to run on macOS due to an issue with python-gssapi not building the "store_cred" extension there. However, it worked on Linux, and with a bit of hacking on python-gssapi I was eventually able to run it on macOS as well.

As you suggested, I added a gss_store option in SSHClientConnectionOptions and SSHServerConnectionOptions. I designed it so that you can either pass in a string (str or bytes) and have it automatically create a dict with a 'ccache' entry to pass to Credentials, or pass an already constructed dictionary if you need to specify any other store options.

Similarly, you can specify the TYPE as part of the string you pass in, but since the TYPE defaults to 'FILE:', you can leave that out if you like and just pass in a plain pathname as the gss_store value. So, most callers don't really need to know any of the details of what this store argument looks like internally.
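
Based on that, usage might look roughly like this (host and paths are placeholders, and the final option handling may differ slightly):

import asyncio
import asyncssh

async def main():
    # Plain pathname: treated as a 'FILE:' ccache
    async with asyncssh.connect('ssh.example.com', gss_auth=True,
                                gss_store='/tmp/krb5cc_alice') as conn:
        result = await conn.run('hostname')
        print(result.stdout)

    # Fully specified store dict, e.g. a MEMORY ccache
    async with asyncssh.connect('ssh.example.com', gss_auth=True,
                                gss_store={'ccache': 'MEMORY:alice'}) as conn:
        await conn.run('true')

asyncio.run(main())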

I still need to do a bit more work to get code coverage on this new feature in the unit tests, but it's working when I test it manually and all the previous unit tests are working after a few small edits. So, I should have something soon for you to try!

ronf commented 2 weeks ago

A first cut at this is now available as commit 5cbde23 in the "develop" branch. Thanks again for the suggestion!

zarganum commented 2 weeks ago

@ronf, appreciate your swift response very much!

Unfortunately, GSSAPI negotiation involves network I/O that is not async-aware, so ideally the Credentials and/or SecurityContext objects should be prepared in a separate thread and then passed to the asyncssh connection with all tickets validated and ready. For example, the ccache may contain only the initial TGT but be missing the particular ST; GSSAPI obtains it transparently, but without asyncio awareness. Please check whether this applies.

With gss_store, it might still be necessary to construct the connection object in an executor to avoid blocking the event loop. What matters from now on is that the executor can be a thread rather than a separate process, which makes a huge difference in Python.
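
Something along these lines, for illustration only (the helper name is made up):

import asyncio
import gssapi

async def acquire_creds_off_loop(ccache):
    loop = asyncio.get_running_loop()
    # asyncio's default executor is a ThreadPoolExecutor, so any KDC traffic
    # triggered while acquiring the credentials happens in a worker thread
    # rather than blocking the event loop.
    return await loop.run_in_executor(
        None, lambda: gssapi.Credentials(usage='initiate',
                                         store={'ccache': ccache}))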

According to the MIT GSSAPI documentation, the store also supports keytab and client_keytab dict keys in addition to ccache. IIUC, specifying client_keytab as /etc/krb5.keytab could be used to authenticate a domain computer principal as a client, while keeping credential storage isolated via ccache. Kerberos is mutual, so these options potentially make sense for both server and client. However, this is all theory :)

many thanks

ronf commented 2 weeks ago

The GSSClient and GSSServer objects are initialized in the constructor of SSHClientConnection and SSHServerConnection, but at that point in time there should be no calls to GSSAPI to create a Credentials or SecurityContext. That happens only when the auth token exchange begins, as part of beginning authentication after the SSH key exchange.

That said, if GSSAPI is making any blocking networking calls of its own to create these objects or to compute the next token in the GSS exchange, I agree that this could block the event loop when auth starts. There are other places in the code (such as calling into the PKCS11 or FIDO libraries) where the default executor is used to avoid this. This should be thread-based, unless the application explicitly changes it. We could do something like that here, but I'd need to understand better which GSSAPI calls might do their own networking or other blocking operations.

I guess the main worry here would be if something like the KDC were down or unreachable and requests were timing out on it. When things are healthy, this rarely seems to block long enough to be a problem. In contrast, the FIDO code blocks waiting for the user to touch their security key, holding up the event loop for a significant amount of time, so it was critical to wrap that in an executor.

There's actually already an executor wrapped around the initialization of the options object that we could maybe leverage here. However, since options objects can be shared by multiple connections, I think that would only be possible for the credentials, not for the security context, and I'm not sure I'd want to construct the credentials early, before knowing whether the client/server is even going to attempt GSS authentication.

We also need to keep in mind that there's a GSS-based key exchange as well, where this initialization could happen earlier than auth, but I think the same basic rules apply.

zarganum commented 2 weeks ago

@ronf Ron, I highly appreciate your support. Quick testing confirmed that I get the functionality I need for my project. Many thanks.

Indeed, you are right about network latency vs. physical token operations - these are two different time horizons.

ronf commented 2 weeks ago

> Ron, I highly appreciate your support. Quick testing confirmed that I get the functionality I need for my project. Many thanks.

My pleasure - glad to hear it is working for you.

> Indeed, you are right about network latency vs. physical token operations - these are two different time horizons.

I looked at this more closely today, and it appears that no external traffic to the KDC happens when creating either the credentials object or the security context. It is only when step() is called on the security context that I sometimes see traffic going to the KDC to get the ticket needed to complete the auth. Given that, I think I can just change the GSSBase class to use an executor when calling self._ctx.step(token) and avoid potentially blocking the event loop if the KDC is slow or unreachable. I'll give this a try and see how it goes.
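
In spirit, that change amounts to something like this standalone helper (just a sketch, not the actual GSSBase code):

import asyncio

async def gss_step(ctx, token=None):
    # Run the potentially blocking security-context step (which may contact
    # the KDC) in the default thread executor instead of on the event loop.
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(None, ctx.step, token)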

zarganum commented 2 weeks ago

> It is only when step() is called

Indeed, with only a TGT and no ST, it is not possible to authenticate. The client requests an ST from the KDC using the TGT and caches it; this is exactly what you see when step() is invoked.

Further authentication attempts will silently use the cached ST without calling the KDC. If either the TGT or the ST requires renewal, GSSAPI silently performs additional KDC calls. By default, tickets are issued for a period on the order of hours, so a single step() call should cover most use cases with asyncssh.

The Python GSSAPI Credentials object supports serialization, which should allow tickets to be persisted. But I honestly don't know how this works with the store parameter...
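
Something like this, assuming the optional credentials import/export extension is available (I'm not sure how it interacts with store):

import gssapi

creds = gssapi.Credentials(usage='initiate',
                           store={'ccache': 'MEMORY:username'})

# Requires the credentials import/export extension; raises if unavailable
token = creds.export()
restored = gssapi.Credentials(token=token)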

It is also possible to manipulate the ccache at a lower level, e.g. request an ST directly and store it, but that is outside the scope of GSSAPI.

Ideally, may I suggest using k5test for GSSAPI-related unit tests? I will be happy to assist if needed...
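
For illustration, a throwaway realm with k5test looks roughly like this (the exact K5Realm attributes may vary between versions):

from k5test import K5Realm

# Spin up a throwaway KDC with test principals
realm = K5Realm()
try:
    print(realm.user_princ, realm.host_princ)
    # realm.env points KRB5_CONFIG, KRB5CCNAME etc. at the temporary realm,
    # so it can be passed to the code under test
    krb_env = dict(realm.env)
finally:
    realm.stop()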

Many thanks

ronf commented 2 weeks ago

I looked into this a bit more, and it's not quite as simple as I was hoping. All of the kex & auth code which calls the step() function is synchronous right now (mostly packet handlers) and relies on run-to-completion. So, it's not possible to block on an executor from that code, and while I could work around this by moving some of the code into a coroutine which runs in a separate task, that would cause the packet handler code to return and begin processing the next packet before the coroutine finished processing the previous one.

Unfortunately, I also confirmed that the gssapi code can block the event loop if it tries to contact the KDC and the KDC doesn't respond right away. While this probably won't be a common case, it's something I would like to try to find a way to fix.

One thought I had was to put all calls to packet handlers into a per-connection queue, processing the packets in order and waiting for one handler to return before calling the next. The handlers themselves could then be either synchronous or asynchronous, but even in the async case the handler processing would be guaranteed to be serialized. This would actually let me simplify the handlers in some places where I previously had to schedule new tasks to allow async calls. It may even fix some race conditions I didn't know I had, where packet processing continued before those scheduled tasks finished.
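
As a toy sketch of that idea (not AsyncSSH code), packets could go into a per-connection asyncio.Queue, with a single worker task awaiting each handler in turn so that even async handlers stay strictly serialized:

import asyncio
import inspect

class PacketDispatcher:
    """Toy per-connection dispatcher; must be created inside a running loop."""

    def __init__(self):
        self._queue = asyncio.Queue()
        self._worker = asyncio.ensure_future(self._run())

    def dispatch(self, handler, packet):
        # Called from the data-received path: just enqueue and return
        self._queue.put_nowait((handler, packet))

    async def _run(self):
        while True:
            handler, packet = await self._queue.get()
            result = handler(packet)
            if inspect.isawaitable(result):
                # Async handler: finish it before touching the next packet
                await result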

Regarding k5test, it looks useful, but I generally focus my unit tests on testing only the AsyncSSH code, so I stub out any third-party package calls during testing. I already have such stubs for both the gssapi and the Windows sspi packages, and they allow me to test all the edge cases in my AsyncSSH code without actually needing a KDC or doing any of the networking that the third-party package would have done.
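
For what it's worth, those stubs amount to something like this bare-bones stand-in (the real ones are more elaborate, and this class name is invented):

class StubSecurityContext:
    """Minimal stand-in for gssapi.SecurityContext used only in tests."""

    def __init__(self, *args, **kwargs):
        self.complete = False

    def step(self, token=None):
        # Pretend the exchange finishes after one round trip, with no KDC
        # or network traffic involved
        self.complete = True
        return b'fake-gss-token'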

ronf commented 1 week ago

This feature is now available in AsyncSSH 2.17.0.

I've also begun work to deal with the event loop blocking issue, and will report back when I have something working there. I've decided to be a bit more ambitious and actually migrate from callbacks to stream-based (reader/writer) I/O for the main AsyncSSH connection, allowing for both synchronous and asynchronous packet handlers, which will then open up the option for the GSSAPI auth code to run its KDC communication in an executor without blocking any of the other event loop activity.

zarganum commented 1 week ago

Awesome news! For my app I'm currently using a context manager with a MEMORY-type ccache and a low-level cc_destroy() to avoid leaks. Well, actually two nested context managers: a higher-level async one wrapping a low-level sync one. The outer __aenter__/__aexit__ both invoke the inner __enter__/__exit__ with run_in_executor(). Crazy...
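
Roughly like this, with made-up class names:

import asyncio

class MemoryCCache:
    """Sync context manager owning a MEMORY: ccache (illustrative only)."""

    def __enter__(self):
        # ... acquire credentials into a MEMORY: ccache here ...
        return self

    def __exit__(self, *exc):
        # ... low-level cc_destroy() here to avoid leaking the cache ...
        return False

class AsyncMemoryCCache:
    """Async wrapper driving the sync manager in the default thread executor."""

    def __init__(self):
        self._inner = MemoryCCache()

    async def __aenter__(self):
        loop = asyncio.get_running_loop()
        return await loop.run_in_executor(None, self._inner.__enter__)

    async def __aexit__(self, *exc):
        loop = asyncio.get_running_loop()
        return await loop.run_in_executor(None, self._inner.__exit__, *exc)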

So yeah, dual sync/async interface is ambitious indeed!

many thanks

ronf commented 1 week ago

Ok - commit 683b332 is now available in the "develop" branch. It adds support for optionally asynchronous packet handlers for key exchange and auth, and changes the handlers related to GSS to use an executor when calling the step() function. Most of the handlers remain synchronous, but any kex or auth handler that needs it can be defined as "async" and the right thing will happen.

In the process of doing this, I had to change the main SSH connection to use the streams API (reader/writer) instead of the callback API (transport/protocol). As a result, the "tunnel" functionality has changed from using create_connection() to open_connection() in SSHClientConnection. This should be invisible to applications using this with real SSHClientConnection objects (which is the only usage that is publicly documented), but I know some folks have explored doing other forms of tunneling by creating a custom class which implements create_connection(). It's not a difficult change, and in fact it makes custom tunneling much simpler, as it can now accept a tuple of StreamReader/StreamWriter directly and not have to rely on AsyncSSH internals, but code relying on those internals may need to change slightly. I'll follow up separately on this in some of the GitHub issues which discussed it.

I'm also still seeing intermittent unit test failures on Python 3.8 which I've been unable to fully track down, but I think the existing code is working well enough to push, and I can continue looking into that prior to cutting a release with this set of changes.

I've also needed to back out commit 3698c93 (and the "fix" for that in 220b9d4), as I found a few more cases where the change was causing some instability in the tests, and causing some exit status/signal information to sometimes get lost. A fix is still needed to handle KeyboardInterrupt better, but this change seemed to be doing more harm than good.

On the positive side, I can now run a background task without interruption while waiting for the GSSAPI code to communicate with the KDC, when it needs to do that. The only thing blocked is the connection doing the kex/auth, and the rest of the connections, sessions, or other tasks in the asyncio event loop continue normally.

ronf commented 5 days ago

Unfortunately, after multiple days of debugging, I'm unable to get the unit tests to be stable with the changes above. I've decided for now to roll back these changes, and I'll try to find another less complicated way to allow GSSAPI calls to not block the event loop. In the meantime, I wanted to make sure anyone picking up the head of "develop" has something stable to test against.