dabeaz / curio


I think duplex I/O on SSL sockets can break the kernel's invariants? #198

Closed · njsmith closed this issue 7 years ago

njsmith commented 7 years ago

Suppose I have two tasks, one of which is looping on reading from an SSL socket and one of which is looping on writing to that same socket. (This would happen, e.g., in an HTTP/2 implementation.) In general, this kind of socket usage is totally OK and explicitly supported by curio. But, curio does have a rule that you can't have two tasks that are both trying to read from the same socket at the same time, or both trying to write to the same socket at the same time.

...But with an SSL socket, doing recv can trigger a _write_wait, and send can trigger a _read_wait, which means that if one task is calling recv and another task is calling send, then they might both end up in _read_wait or _write_wait at the same time, and then the kernel will get pissed off.

This is unfortunately difficult to test for too, since it's very rare that an SSL socket actually needs to read before it can write or vice-versa, but it does happen.
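To make the failure mode concrete, here is a rough sketch of the retry loops involved (simplified and illustrative, not curio's actual code; `_read_wait`/`_write_wait` are the kernel traps mentioned above):

    import ssl
    from curio.traps import _read_wait, _write_wait   # kernel traps referenced above

    async def recv(sslsock, maxbytes):
        # Simplified retry loop for a non-blocking SSL recv
        while True:
            try:
                return sslsock.recv(maxbytes)
            except ssl.SSLWantReadError:
                await _read_wait(sslsock)      # the common case
            except ssl.SSLWantWriteError:
                await _write_wait(sslsock)     # renegotiation: a recv may need to WRITE

    async def send(sslsock, data):
        # Simplified retry loop for a non-blocking SSL send
        while True:
            try:
                return sslsock.send(data)
            except ssl.SSLWantWriteError:
                await _write_wait(sslsock)     # the common case
            except ssl.SSLWantReadError:
                await _read_wait(sslsock)      # renegotiation: a send may need to READ

    # If one task is parked in _read_wait() inside recv() and another task then
    # hits SSLWantReadError inside send(), both tasks end up read-waiting on the
    # same file descriptor -- exactly the situation the kernel forbids.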

dabeaz commented 7 years ago

I agree that there is possibly a small chance of this happening, but without reading the source of OpenSSL or a concrete example that causes this kind of crash, I'm not sure it's something I want to spend my energy on. Even then, I'm not sure my solution would be anything more than a spin-retry. I've never seen any evidence that other async frameworks allow multiple readers/writers to wait on a socket. I could have missed it, but has anyone else addressed this?

njsmith commented 7 years ago

I guess I feel like, if curio supports SSL recv raising WantWrite, and full duplex use of SSL sockets, then it should probably support both of these features being used together?

The only way I can make sense of this is if you're convinced that SSL recv never actually does raise WantWrite, but in that case why have the code path for that at all?

dabeaz commented 7 years ago

I don't want to give the impression that I'm not concerned about reliability, but I'm not sure it's the role of the Curio kernel to solve every possible problem that might occur on the network. In the big picture, having multiple tasks trying to read or multiple tasks trying to write on the same socket at the same time is already a major application-level bug in my opinion. It makes no sense for stream-based I/O unless you want everything randomly garbled. At the moment, Curio raises an exception if you try to do this. It's maybe not ideal, but it's a step above asyncio where a second attempt to perform an operation (i.e., two reads on the same socket) seems to result in the first operation being silently cancelled.

With respect to SSL, it's not clear to me that it supports full-duplex I/O in any kind of traditional sense. I found this thread: https://groups.google.com/forum/#!topic/mailing.openssl.users/scU_UV-VuGc

It's pretty dense, but the gist I get from it is that an SSL connection is running an underlying state machine. Within this state machine, it's not possible for multiple threads to be doing different things on an SSL connection at the same time (i.e., one thread reading, one thread writing). It's a little less clear with respect to async/non-blocking, but similar rules may be in effect. For example, if you start an SSL write and it comes back with a WantRead condition, you have to honor that and continue to proceed with the steps involved in the write operation until it completes successfully. Only after it's fully done can you proceed to a new operation such as a read/write. One way to manage this in the presence of threads/tasks might be to protect reads/writes with a mutex lock. Or to use some kind of queuing.

It's entirely possible that my interpretation of this is wrong, but either way, it all sounds rather hairy to me. I think my inclination is simply to have Curio fail loudly with an exception if there's some kind of resource conflict (i.e., two tasks attempting to read on the same socket). Leave it up to the user to figure out how to negotiate the conflict.
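For what it's worth, the mutex idea mentioned above might look something like this (a minimal sketch using curio's Lock; the wrapper class and its method names are hypothetical, not anything in curio):

    import curio

    class LockedSSLStream:
        # Hypothetical wrapper that serializes all SSL operations on one socket,
        # so a recv and a send can never be inside the SSL state machine at once.
        def __init__(self, ssl_sock):
            self._sock = ssl_sock         # a curio SSL-wrapped socket
            self._lock = curio.Lock()     # one lock covering both directions

        async def recv(self, maxbytes):
            async with self._lock:
                return await self._sock.recv(maxbytes)

        async def send(self, data):
            async with self._lock:
                return await self._sock.send(data)

The obvious catch is that a recv() blocked waiting for data would hold the lock and starve senders, so in practice the locking would have to be scoped more carefully; that trade-off is part of what makes this hairy.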

imrn commented 7 years ago

Maybe we should be talking at the selector level. The scenario is as follows:

dabeaz commented 7 years ago

Update: it seems that two tasks reading on the same socket in asyncio has even weirder behavior. The first task just disappears off the face of the earth to never be seen again. The socket operation never returns. No exception is raised. The whole task just disappears. You'll get a warning about it when Python exits.

dabeaz commented 7 years ago

Regarding what happens in Curio right now, two read requests on the same socket result in an exception being raised in the second request.

imrn commented 7 years ago

Maybe the issue is related to the very basics: "Is SSL/TLS/etc. full-duplex?" If it is not (any comments?), we shouldn't be treating it as such, with multiple tasks reading and writing. Maybe for such cases we need a middleman facing those tasks, managing the traffic for full-duplex emulation.

dabeaz commented 7 years ago

At best, the full duplex capabilities of SSL seem rather muddy to me. People seem to think that it's possible with non-blocking I/O in some capacity, but even then, it doesn't seem absolutely straightforward.

imrn commented 7 years ago

There is SSLObject in the Python ssl library, exposing the SSL functionality without any networking. Has anyone tortured it with any combination of full-duplex reads and writes?

imrn commented 7 years ago

I think this is an issue that needs serious thinking even if SSL does not support duplex operation. (To be confirmed: Is it so or not?)

Suppose we have an application running full-duplex on normal sockets. Now, we just want to deploy it in an SSL/TLS setup. You cannot do it without serious restructuring.

Even worse, if it is predominantly built around full-duplex logic, then it will have no chance of running under SSL.

njsmith commented 7 years ago

Of course SSL supports full duplex operation.

dabeaz commented 7 years ago

I'm not sure what the full solution to this might be; I just know that the innards of the Curio kernel are not the place to be putting it. That said, I have made a slight addition to the I/O traps that allows the behavior of duplicate readers/writers to be controlled a bit more finely. Specifically, an I/O trap can elect to fail with an exception if there's another task waiting on the same event (the default), or it can take precedence and make the already-waiting task fail with the exception. I have a sneaking suspicion that this latter behavior will be useful in addressing this SSL issue.

imrn commented 7 years ago

On 3/16/17, Nathaniel J. Smith notifications@github.com wrote:

Of course SSL supports full duplex operation.

Probably it is. But a quick search does not turn up any good docs about it. There must be some read-write sequencing for a thing that needs to read while writing or to write while reading. Otherwise, real reads and reads related to a write can get mixed up.

And if there is a "sequencing" relation, we have to rethink the meaning of "full" duplex. Maybe the libraries provide that illusion. But with regard to the spec and the real traffic at the TCP level, I have doubts.

Anyway, I'm not an expert on SSL/TLS, so take my comments with a grain of salt. Any good pointers on the subject will be appreciated.

dabeaz commented 7 years ago

Here is my intuitive guess on SSL and how it likely works.

Under the covers, an SSL connection is stateful. There are different things that can happen on the connection. This is part of the reason why a request to read might require writing and a request to write might require reading. My understanding is that if you try a recv() operation and it fails with a WantWrite, you poll until you can write and then you re-issue the recv() operation again. Same goes for a send() operation. If it fails with a WantRead, you poll for that and then re-issue the send() call again. The underlying connection knows what it was trying to do. When you re-issue the failed request, it picks up where it left off. Curio currently does this.

I think where it gets a little wild is in the duplex situation... suppose a task issues a recv() request and there is simply no data available at all. In this case, the task sleeps and waits for some I/O before retrying the recv(). If, during this time period, a different task decides to send on the socket and it fails with a WantRead, what happens now? I'm thinking that the task performing the send will probably want to kick off the other blocked receiving task, poll for reading itself, and then re-issue the send() call. It's the only thing that makes any sense--I mean, if the send was in the middle of something and all of a sudden it needs to read, whatever is going on with the connection is directly related to that. You're not going to be able to just jump in and issue an SSL recv() operation out of the blue here---that send() is going to have to finish whatever it was doing first.

With that in mind, the socket code will likely turn into something sort of like this:

async def send(self, data):
    while True:
        try:
            return await self._sock.send(data)     # Send on the real socket
        except WantWrite:
            await _write_wait(self._sock)
        except WantRead:
            await _read_wait(self._sock, priority=HIGH)    # Kick off any other task
        except ResourceBusy:
            # We got kicked off by another task that needed to send while receiving
            # Do something, block, wait on an event, spin, or whatever, until we get an all clear
            ???
            profit

Of course, I could be totally wrong.... ;-)

imrn commented 7 years ago

There are certain points at which the socket must be written and read with a specific payload determined by the SSL machinery (independent of whether you are reading or writing). This makes SSL stateful, and here I think there is a contradiction with full-duplex operation.

For Curio, how about dropping SSLSocket and using SSLObject? Can such a layering (pure socket reads/writes at the bottom and SSLObject logic on top) help, or even provide more flexibility for handling such cases?
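For reference, the layering being suggested looks roughly like this with the stdlib API (a sketch only; the hostname is a placeholder, the pump function is made up for illustration, and handshake/retry handling is omitted):

    import ssl, socket

    # The layering: a plain TCP socket at the bottom, SSLObject + MemoryBIO pairs
    # on top. All TLS processing happens in memory; the caller decides when to
    # read or write the raw socket.
    context = ssl.create_default_context()
    incoming = ssl.MemoryBIO()    # ciphertext received from the raw socket goes here
    outgoing = ssl.MemoryBIO()    # ciphertext TLS wants sent comes out of here
    sslobj = context.wrap_bio(incoming, outgoing, server_hostname='example.com')

    def recv_step(raw_sock, maxbytes=65536):
        # One simplified step: feed ciphertext in if needed, flush ciphertext out,
        # return whatever application data was decrypted.
        try:
            data = sslobj.read(maxbytes)
        except ssl.SSLWantReadError:
            incoming.write(raw_sock.recv(maxbytes))
            data = b''
        pending = outgoing.read()
        if pending:
            raw_sock.sendall(pending)
        return data

Because the raw socket I/O stays entirely with the caller, an async framework could route those reads and writes through its own scheduler however it likes.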

dabeaz commented 7 years ago

All things equal, I'd rather operate at the socket level than some much lower level SSL interface. I will investigate further as time allows.

dabeaz commented 7 years ago

This is more of an open non-Curio question... what would be involved in designing a reproducible test that could trip any of this behavior (e.g., getting a WantRead exception on writing)? I've spent a bit of time reading about some of this and have tried a few unsuccessful experiments. It doesn't seem entirely straightforward or easy.

njsmith commented 7 years ago

I've been wondering that too. My best guess so far is to use PyOpenSSL and then trigger a manual renegotiation. (Unfortunately the stdlib ssl module doesn't expose the API for this.)

It might be possible during the initial handshake too.

njsmith commented 7 years ago

This might be helpful: https://stackoverflow.com/questions/31713302/how-to-do-tls-renegotiation-by-python-ssl-socket
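For anyone attempting such a test, the pyOpenSSL side might look roughly like this (a sketch; `sock` is assumed to be an already-accepted TCP socket, the certificate/key file names are placeholders, and renegotiation only exists prior to TLS 1.3):

    from OpenSSL import SSL

    ctx = SSL.Context(SSL.TLSv1_2_METHOD)        # renegotiation is gone in TLS 1.3
    ctx.use_certificate_file('server.crt')       # placeholder file names
    ctx.use_privatekey_file('server.key')

    conn = SSL.Connection(ctx, sock)             # `sock` is an accepted TCP socket
    conn.set_accept_state()
    conn.do_handshake()                          # complete the initial handshake

    conn.renegotiate()                           # queue a renegotiation request
    conn.do_handshake()                          # drive it; on a non-blocking socket
                                                 # this can raise SSL.WantReadError /
                                                 # SSL.WantWriteError mid-operation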

AndreLouisCaron commented 7 years ago

Asking whether SSL is full duplex seems like an odd way to frame this problem. The "want read" and "want write" conditions mean that the SSL state machine needs data or has negotiation data to send to the peer -- not that the socket itself needs reading or writing.

It seems to me like using a selector to check if an "SSL socket" is readable has underspecified semantics unless the selector is aware of SSL and "readable" means "application data has arrived". This is explained in the Notes on non-blocking sockets section of the SSL module docs. My understanding is that if Curio uses an "SSL socket" wrapper object, it cannot equate "the selector says the (raw) socket is readable" to "the Curio socket is readable".

dabeaz commented 7 years ago

Curio is strictly following the recommended usage of non-blocking SSL sockets, which states that you must always try the I/O operation first (recv/send) before relying on selectors to tell you when data is available. https://docs.python.org/3/library/ssl.html#notes-on-non-blocking-sockets

I'm still not convinced about the sanity of full-duplex operation in combination with renegotiation though. Still fooling around with it.

dabeaz commented 7 years ago

With respect to HTTP/2, I found this in the spec. http://http2.github.io/http2-spec/#TLSUsage

It seems that renegotiation is explicitly NOT allowed except in the initial handshake of setting up the connection. So maybe, just maybe, we're basically in the clear on that. One of the things that's really been bothering me is just how all of this would play out in HTTP/2, where there is all of this multiplexing and other stuff going on. I mean, suppose a client sends a renegotiation message. What exactly happens at that point? Especially if there's a bunch of downstream data still coming on the wire? Does the universe explode? Apparently not, since it's not allowed.

I've done some various experiments with Curio and full-duplex SSL connections over the last few days and it seems to work. I've also done some messing around with fragmenting the raw data stream into small pieces so that each end would only get partial SSL messages on a single recv() on the low-level socket. That all seems to work too.

imrn commented 7 years ago

On 3/19/17, André Caron notifications@github.com wrote:

Asking whether SSL is full duplex seems like an odd way to frame this problem. The "want read" and "want write" conditions mean that the SSL state machine needs data or has negotiation data to send to the peer -- not that the socket itself needs reading or writing.

It seems to me like using a selector to check if an "SSL socket" is readable has underspecified semantics unless the selector is aware of SSL and "readable" means "application data has arrived". This is explained in the Notes on non-blocking sockets section of the SSL module docs. My understanding is that if Curio uses an "SSL socket" wrapper object, it cannot equate "the selector says the (raw) socket is readable" to "the Curio socket is readable".

The very basics of async operation are halting the current execution branch at a "physical" necessity without halting the rest of the program. Having said that, I'm increasingly leaning toward the idea of seeing the "async core" as a mere traffic manager at those "physical" boundaries.

There may be upper layers depending on the preferences of the framework used, but for me the "core" should be shaped by this motto.

So, SSL has no place at the core. It is only another layer, or perhaps a mere "application" receiving some bytes and outputting some bytes. That's all. It is that simple, and I think it has no place at the "core".

If it is brought in, then it will bring such irregularities with it.

dabeaz commented 7 years ago

Okay. I managed to reproduce a scenario in which an SSLWantReadError exception occurs when attempting a send(). You can get it on an SSL server if it attempts to send data right away. For example::

from socket import *
s = socket(AF_INET, SOCK_STREAM)
s.bind(('', 30000))
s.listen(1)
c, a = s.accept()       
# Make a connection with another interpreter (BUT DON'T READ/WRITE). Just open it

c.setblocking(False)

# Wrap with SSL
import ssl
context = ssl.create_default_context(ssl.Purpose.CLIENT_AUTH)
context.load_cert_chain(certfile='ssl_test.crt', keyfile='ssl_test_rsa')
c_ssl = context.wrap_socket(c, server_side=True, do_handshake_on_connect=False)

# Blow it up
c_ssl.send(b'Hello World')   # BOOM: ssl.SSLWantReadError

It blows up because the client hasn't sent the handshake message yet.

Still need to think about the reverse scenario (getting a WantWrite on recv). Probably something similar.

Lukasa commented 7 years ago

WantWrite on recv is probably hard. It would almost certainly require filling up the send buffer such that a nonblocking send operation would raise EAGAIN midway through a renegotiation handshake. This could be done with very careful balancing of socket buffers, but it'd be hard.

dabeaz commented 7 years ago

Getting a WantWrite on recv() seems really tough to me. I don't see how you'd get it on the side of a connection that issued do_handshake() because that side wouldn't proceed until the handshake was done. On the other side, you'd have to fill up the send buffer prior to getting the handshake message. But, if that's the case, it seems like the handshake would surely fail. The other end of the connection would receive a bunch of garbage on the connection before it got the handshake response.

Lukasa commented 7 years ago

@dabeaz "a bunch of garbage" isn't quite right, I don't think. We're in a tricky place with TLS, but AFAIK the renegotiation doesn't prevent the older data from being understood, at least not until it's complete.

dabeaz commented 7 years ago

I guess I'm thinking in terms of a request/response cycle. Renegotiation certainly involves sending a request, but does the response to that request need to come back immediately? Or can data still "in flight" be received ahead of the response? Does it even make sense to do that? It seems like it might imply there was a concurrent recv() happening at the same time as the handshake. Maybe "bunch of garbage" isn't the best description, but it still seems "dicey." ;-)

Lukasa commented 7 years ago

Data "in flight" can definitely be ahead of the response. You sending a renegotiation request must be able to race with data the remote peer already sent.

dabeaz commented 7 years ago

I've been thinking a lot about this issue and have concluded that it is basically an application-level concern that can be solved with proper forethought and synchronization. The only time I could even trigger this condition is in code that simultaneously tried to send/recv right at the start of an SSL connection using multiple tasks. For example:

async def incoming(client):
    while True:
        data = await client.recv(100000)
        ...

async def outgoing(client):
    await client.send(b'hello\n')
    ...

async def handle_connection(client, addr):
    t1 = await spawn(incoming, client)
    t2 = await spawn(outgoing, client)
One of the tasks crashes with an exception due to two tasks simultaneously read-waiting on the file descriptor (the send waiting to read in order to process the client handshake). Yes, it's a problem, but it's easily fixed with an Event. For example::

from curio import Event, spawn

async def incoming(client, hello_evt):
    await hello_evt.wait()
    while True:
        data = await client.recv(100000)
        ...

async def outgoing(client, hello_evt):
    await client.send(b'hello\n')
    await hello_evt.set()
    ...

async def handle_connection(client, addr):
    evt = Event()
    t1 = await spawn(incoming, client, evt)
    t2 = await spawn(outgoing, client, evt)

Although it's a tricky corner case, I'd rather leave it out of the Curio kernel proper. Curio rightfully reports an exception if something weird happens. As far as I'm concerned, anything related to SSL renegotiation probably needs to be carefully coordinated at the application level anyway. For example, HTTP/2 seems to forbid it entirely.

Bottom line: I'd much rather err on the side of keeping the Curio kernel simple. As long as it allows the problem to ultimately be solved at the application level, I'm okay with that.

njsmith commented 7 years ago

Traditionally, renegotiation is something that can be legally triggered at any time by the remote peer; while opinion seems to have shifted against it these days, the original idea was that libraries should automatically and transparently trigger it on some kind of timer. (One of the major motivations for renegotiation is to put an upper bound on how much data gets transferred under a single key, so for long-lived connections you're supposed to do it every so often.) Just not supporting renegotiation is an option, but it should be clear that it's not spec-compliant (except when speaking HTTP/2) and may cause interoperability problems.

dabeaz commented 7 years ago

There is nothing about Curio that prohibits renegotiation of an SSL connection. In fact, it works fine.

This issue is solely about the behavior of the kernel. The Curio kernel is only focused on task scheduling, not the implementation details of any particular protocol or application. There is a requirement that only one task can wait to read/write on a single I/O resource. Otherwise you get a ResourceBusy exception. This is not unlike virtually every other async library I have ever seen, including asyncio. For example, asyncio doesn't allow more than one Future to be attached to reading or writing a given file descriptor. I don't have an issue with this restriction.

In the context of SSL, there is a chance in certain atypical circumstances that one could get a resource conflict in the initial handshaking of a connection (for example, attempting to send/recv before the handshake is complete). However, this conflict is easily solved using the same techniques with which most resource conflicts are usually solved--locks and events. Curio will loudly raise an exception. It's not an error that will pass silently.

On an already established SSL connection, there is a chance that the other end of the connection could perform a renegotiation at some point. That works fine now. Yes, there is an outside chance it could cause a resource conflict error, but the circumstances leading to that seem about as likely as me witnessing a dragon swoop out of the sky to pluck the Loch Ness monster out of Lake Michigan under the pale light of a lunar eclipse during my bike ride. I'm not saying that would never happen--I've seen stranger things. However, it's not something I'm really going to worry about until I see a real-life bug report from an actual application saying that it happened.

Even then, I don't think it's the responsibility of the Curio kernel (as a scheduler) to prevent or fix it. A solution would more likely be found elsewhere--either at the application level (using some kind of more sophisticated exception handling and/or task coordination) or at the socket level (maybe having a different kind of SSL socket).