chriskohlhoff / asio

Asio C++ Library
http://think-async.com/Asio
4.72k stars 1.19k forks source link

asio::ssl::stream::async_read_once null_buffers causes busy-loop #1015

Open adamncasey opened 2 years ago

adamncasey commented 2 years ago

I've found an issue where calling async_read_some with the null_buffers technique (idiom?) causes an application to busy loop. Worth noting there is no async_wait equivalent available here either. Below describes the steps we go through from invoking async_read_some to nearly immediately having the handler invoked when there's no data.

A small application along the lines of this highlights the issue (sorry it doesn't compile - on my todo list):

boost::asio::ssl::stream<boost::asio::ip::tcp::socket> socket;
// get a socket from a listening TLS context.
// Our application listens for an incoming TLS connection - unsure if this also happens with outbound connections.
// Set socket non-blocking

sock.async_read_some(boost::asio::null_buffers(), [](error_code ec, std::size_t) {
    if (ec) {
        // handle error
    }

    size_t readAmount = socket.read_some(/*actual buffer*/, ec);
    if (ec == boost::asio::error::would_block) {
        // re-schedule async_read_some
    }
});

I believe the issue is that stream::async_read_some has no null_buffers specialisation, and therefore we eventually end up down in engine::read with an empty buffer.

https://github.com/chriskohlhoff/asio/blob/b89fcb8951936e466d918b3ad03660121bfa842f/asio/include/asio/ssl/detail/impl/engine.ipp#L202-L209

This returns want_nothing, instead of what I think should be something closer to want_input_and_retry.

want_nothing causes io_op::operator() to get here: https://github.com/chriskohlhoff/asio/blob/b89fcb8951936e466d918b3ad03660121bfa842f/asio/include/asio/ssl/detail/io.hpp#L230-L246

Which triggers an empty read event such that io_op gets re-executed in the next(ish?) executor loop iteration.

When we come back round to ioop `start` is set to 0

https://github.com/chriskohlhoff/asio/blob/b89fcb8951936e466d918b3ad03660121bfa842f/asio/include/asio/ssl/detail/io.hpp#L145-L148

and then we jump into a default: statement nested inside of the do { } while(); loop (which it's fair to say, surprised me):

https://github.com/chriskohlhoff/asio/blob/b89fcb8951936e466d918b3ad03660121bfa842f/asio/include/asio/ssl/detail/io.hpp#L255-L263

want_ hasn't been re-written since the last time we were here, so in my case it's still want_nothing

.. and finally we end up invoking the user-provided handler (via readop) just inside the `switch(want)` with zero bytes read and no error code. Our application (as would others) then invoke async_read_some again again (and there's the busy loop).

https://github.com/chriskohlhoff/asio/blob/b89fcb8951936e466d918b3ad03660121bfa842f/asio/include/asio/ssl/detail/io.hpp#L305-L313

I found at least one other person on stackoverflow has hit this issue before, a long time ago: https://stackoverflow.com/questions/40163626/boost-asio-ssl-not-working-as-expected-when-used-with-null-buffers

I think that there is something wrong here, unfortunately I just don't quite see where the best place to fix this issue is.

Perhaps some kind of async_wait / async_read_some(boost::asio::null_buffers) specialisation could drop us straight into this logic (or something like it) ?

https://github.com/chriskohlhoff/asio/blob/b89fcb8951936e466d918b3ad03660121bfa842f/asio/include/asio/ssl/detail/io.hpp#L157-L189

adamncasey commented 2 years ago

Looking at this a bit deeper - I think this is hard to solve because for these reasons:

  1. We want to call to asio::ssl::stream::async_read_some with no buffer to store data (basically so we can create a perfectly sized buffer when data is available)
  2. OpenSSL's SSL_read method only consumes data into a user buffer. There doesn't seem to be a function which lets you just move socket data into openssl's internal buffer
  3. The async_read_some handler is passed the number of bytes available, and this would logically be the number of application-bytes you could immediately get if you called ssl::stream::read_some.

Without a solution to 2) I don't see how anything can call the handler function in 3) with the correct number of available bytes without a buffer to do at least some of the work.

I think I see two possible solutions here:

  1. Perhaps we can call SSL_read with a zero byte buffer. I see some code in there which is supposed to guarantee forward progress over non-data records even when called with zero bytes. There is some specific code in asio::ssl which avoids calling SSL_read with a zero byte buffer though so I'm not sure if this can work. The handler should always be called with the result of SSL_pending() so users can allocate a buffer for all of the available data.

  2. Null_buffers could be emulated by always passing openssl a very small buffer (even one byte might be enough). A stream::available() method could return the result of SSL_pending() plus this extra byte, when available. All stream read methods would need to be adjusted to read this byte when it is available.

Implementing 2) on the application side might make the most sense, it's just a shame this won't help applications out there using null_buffers with asio::ssl at the moment.

ppadevski-vmw commented 1 year ago

I'm also seeing this issue, though because of a different reason.

Looking at the example, the way it was written was to basically do async_read_some(null_buffers(), ...) ... but on the next layer which was in most cases tcp and ended up in epoll. It was wrong but used to work because up to TLS 1.2 the SSL handshake had to complete before application data was sent and in between requests for HTTP/1.1 epoll seemed to work okayish (next request received X time after response and epoll notices the bytes).

With TLS 1.3, however, this is no longer the case and the application data and the client certificate ended up being buffered in OpenSSL but not yet being processed. Hence epoll now gets stuck as there's no data in the socket.

The reasonable solution was to use sslsocket.async_read_some(null_buffers(), ...) and now I'm seeing this busy loop.

I'm going to use the same workaround that reads 1 byte but that looks really inefficient. There's an optimization involving SSL_pending but that would require me to add a ton of extra unit tests to make sure I have covered all of the corner cases (including e.g. read 0).