ruby / openssl

Provides SSL, TLS and general purpose cryptography.
Other
240 stars 167 forks source link

`SSL_write: bad length` #793

Open ioquatix opened 2 months ago

ioquatix commented 2 months ago

I've noticed errors in Falcon when using recent versions of this gem.

Specifically, SSL_write: bad length is common.

I was checking this issue: https://github.com/openssl/openssl/issues/21002

I found this warning as a result: https://docs.openssl.org/master/man3/SSL_write/#warnings

I wonder if we are running into this issue?

I'll try to make a small reproduction.

ioquatix commented 2 months ago

The code in question:

static VALUE
ossl_ssl_write_internal(VALUE self, VALUE str, VALUE opts)
{
    SSL *ssl;
    rb_io_t *fptr;
    int num, nonblock = opts != Qfalse;
    VALUE tmp;

    GetSSL(self, ssl);
    if (!ssl_started(ssl))
        rb_raise(eSSLError, "SSL session is not started yet");

    tmp = rb_str_new_frozen(StringValue(str));
    VALUE io = rb_attr_get(self, id_i_io);
    GetOpenFile(io, fptr);

    /* SSL_write(3ssl) manpage states num == 0 is undefined */
    num = RSTRING_LENINT(tmp);
    if (num == 0)
        return INT2FIX(0);

    for (;;) {
        int nwritten = SSL_write(ssl, RSTRING_PTR(tmp), num);
        switch (ssl_get_error(ssl, nwritten)) {
          case SSL_ERROR_NONE:
            return INT2NUM(nwritten);
          case SSL_ERROR_WANT_WRITE:
            if (no_exception_p(opts)) { return sym_wait_writable; }
            write_would_block(nonblock);
            io_wait_writable(io);
            continue;
          case SSL_ERROR_WANT_READ:
            if (no_exception_p(opts)) { return sym_wait_readable; }
            read_would_block(nonblock);
            io_wait_readable(io);
            continue;
          case SSL_ERROR_SYSCALL:
#ifdef __APPLE__
            /*
             * It appears that send syscall can return EPROTOTYPE if the
             * socket is being torn down. Retry to get a proper errno to
             * make the error handling in line with the socket library.
             * [Bug #14713] https://bugs.ruby-lang.org/issues/14713
             */
            if (errno == EPROTOTYPE)
                continue;
#endif
            if (errno) rb_sys_fail(0);
            /* fallthrough */
          default:
            ossl_raise(eSSLError, "SSL_write");
        }
    }
}
ioquatix commented 2 months ago

I see we are also setting SSL_MODE_ACCEPT_MOVING_WRITE_BUFFER. However, because of if (no_exception_p(opts)) { return sym_wait_readable; } I imagine it's possible to re-enter this code with a completely different buffer?

rhenium commented 2 months ago

If you can make a reproducer, you might get more information by enabling OpenSSL.debug=true. The error reason ("bad length", SSL_R_BAD_LENGTH) apparently is used for multiple different errors in libssl.

rhenium commented 2 months ago

I see we are also setting SSL_MODE_ACCEPT_MOVING_WRITE_BUFFER. However, because of if (no_exception_p(opts)) { return sym_wait_readable; } I imagine it's possible to re-enter this code with a completely different buffer?

It's possible, although it would be an error in user code. The first write_nonblock call can make a partial write of a TLS record to the underlying socket, so the subsequent retrying write_nonblock calls are required to give the same buffer.

By default OpenSSL checks if the buffer pointer is the same, as a quick sanity check, and SSL_MODE_ACCEPT_MOVING_WRITE_BUFFER is an option to suppresses it. Commit b0996b86f60389184a9c9f10040ceb820f2b9401 in 2016 enabled it in ruby/openssl.

One kind of OpenSSL::SSL::SSLError: SSL_write: bad length appears to be raised when the new buffer is smaller than the original buffer. A reproducer for it:

diff --git a/test/openssl/test_pair.rb b/test/openssl/test_pair.rb
index 10942191dd76..bc638b121334 100644
--- a/test/openssl/test_pair.rb
+++ b/test/openssl/test_pair.rb
@@ -368,7 +368,7 @@ def test_write_nonblock_retry
       assert_equal written, readed

       # this fails if SSL_MODE_ACCEPT_MOVING_WRITE_BUFFER is missing:
-      buf2 = Marshal.load(Marshal.dump(buf))
+      buf2 = Marshal.load(Marshal.dump(buf))[0..100]
       assert_kind_of Integer, s1.write_nonblock(buf2, exception: false)
     }
   end
ioquatix commented 2 months ago

I found a separate bug due to data corruption in my program. Maybe it was responsible for the error reported here… I also saw double free and core dump due to my invalid handling of OpenSSL sockets. I fixed the root cause and the problems appear to have all resolved, but I’ll keep an eye on it to see if bad length shows up again.

rhenium commented 2 months ago

Using OpenSSL::SSL incorrectly isn't supposed to cause a double free or core dump, as we try to avoid C-level undefined behavior. There must be something wrong.

ioquatix commented 2 months ago

I observed double free segfault once and several sigabrt in production environment due to my programming error. I don't have time right now to investigate, but I'll try to circle back to this issue in the future to investigate. I think it's something to do with SSLSocket#accept.

ioquatix commented 1 month ago

For reference, I observed this again:

               | Task may have ended with unhandled exception.
               |   OpenSSL::SSL::SSLError: SSL_write: bad length
               |   → /home/samuel/.gem/ruby/3.3.4/gems/openssl-3.2.0/lib/openssl/buffering.rb:412 in `syswrite_nonblock'
               |     /home/samuel/.gem/ruby/3.3.4/gems/openssl-3.2.0/lib/openssl/buffering.rb:412 in `write_nonblock'
               |     /home/samuel/.gem/ruby/3.3.4/gems/io-stream-0.4.1/lib/io/stream/buffered.rb:93 in `syswrite'
               |     /home/samuel/.gem/ruby/3.3.4/gems/io-stream-0.4.1/lib/io/stream/generic.rb:141 in `block in flush'
               |     /home/samuel/.gem/ruby/3.3.4/gems/io-stream-0.4.1/lib/io/stream/generic.rb:136 in `synchronize'
               |     /home/samuel/.gem/ruby/3.3.4/gems/io-stream-0.4.1/lib/io/stream/generic.rb:136 in `flush'
               |     /home/samuel/Developer/socketry/protocol-http2/lib/protocol/http2/framer.rb:45 in `flush'
               |     /home/samuel/Developer/socketry/protocol-http2/lib/protocol/http2/connection.rb:220 in `block in write_frame'
               |     /home/samuel/.gem/ruby/3.3.4/gems/async-2.17.0/lib/async/semaphore.rb:87 in `acquire'
               |     /home/samuel/Developer/socketry/async-http/lib/async/http/protocol/http2/connection.rb:39 in `synchronize'
               |     /home/samuel/Developer/socketry/protocol-http2/lib/protocol/http2/connection.rb:216 in `write_frame'
               |     /home/samuel/Developer/socketry/protocol-http2/lib/protocol/http2/stream.rb:119 in `write_frame'
               |     /home/samuel/Developer/socketry/protocol-http2/lib/protocol/http2/stream.rb:258 in `send_reset_stream'
               |     /home/samuel/Developer/socketry/async-http/lib/async/http/protocol/http2/stream.rb:133 in `finish_output'
               |     /home/samuel/Developer/socketry/async-http/lib/async/http/protocol/http2/output.rb:56 in `close_write'
               |     /home/samuel/Developer/socketry/async-http/lib/async/http/protocol/http2/output.rb:108 in `ensure in passthrough'
               |     /home/samuel/Developer/socketry/async-http/lib/async/http/protocol/http2/output.rb:108 in `passthrough'
               |     /home/samuel/.gem/ruby/3.3.4/gems/async-2.17.0/lib/async/task.rb:197 in `block in run'
               |     /home/samuel/.gem/ruby/3.3.4/gems/async-2.17.0/lib/async/task.rb:422 in `block in schedule'
               |   Caused by OpenSSL::SSL::SSLError: SSL_write
               |   → /home/samuel/.gem/ruby/3.3.4/gems/openssl-3.2.0/lib/openssl/buffering.rb:412 in `syswrite_nonblock'
               |     /home/samuel/.gem/ruby/3.3.4/gems/openssl-3.2.0/lib/openssl/buffering.rb:412 in `write_nonblock'
               |     /home/samuel/.gem/ruby/3.3.4/gems/io-stream-0.4.1/lib/io/stream/buffered.rb:93 in `syswrite'
               |     /home/samuel/.gem/ruby/3.3.4/gems/io-stream-0.4.1/lib/io/stream/generic.rb:141 in `block in flush'
               |     /home/samuel/.gem/ruby/3.3.4/gems/io-stream-0.4.1/lib/io/stream/generic.rb:136 in `synchronize'
               |     /home/samuel/.gem/ruby/3.3.4/gems/io-stream-0.4.1/lib/io/stream/generic.rb:136 in `flush'
               |     /home/samuel/Developer/socketry/protocol-http2/lib/protocol/http2/framer.rb:45 in `flush'
               |     /home/samuel/Developer/socketry/protocol-http2/lib/protocol/http2/connection.rb:220 in `block in write_frame'
               |     /home/samuel/.gem/ruby/3.3.4/gems/async-2.17.0/lib/async/semaphore.rb:87 in `acquire'
               |     /home/samuel/Developer/socketry/async-http/lib/async/http/protocol/http2/connection.rb:39 in `synchronize'
               |     /home/samuel/Developer/socketry/protocol-http2/lib/protocol/http2/connection.rb:216 in `write_frame'
               |     /home/samuel/Developer/socketry/protocol-http2/lib/protocol/http2/stream.rb:119 in `write_frame'
               |     /home/samuel/Developer/socketry/protocol-http2/lib/protocol/http2/stream.rb:201 in `write_data'
               |     /home/samuel/Developer/socketry/protocol-http2/lib/protocol/http2/stream.rb:214 in `send_data'
               |     /home/samuel/Developer/socketry/async-http/lib/async/http/protocol/http2/output.rb:117 in `send_data'
               |     /home/samuel/Developer/socketry/async-http/lib/async/http/protocol/http2/output.rb:49 in `write'
               |     /home/samuel/Developer/socketry/async-http/lib/async/http/protocol/http2/output.rb:93 in `passthrough'
               |     /home/samuel/.gem/ruby/3.3.4/gems/async-2.17.0/lib/async/task.rb:197 in `block in run'
               |     /home/samuel/.gem/ruby/3.3.4/gems/async-2.17.0/lib/async/task.rb:422 in `block in schedule'

The context is a bit different this time, maybe we were trying to write to a closed connection.