ruby / openssl

Provides SSL, TLS and general purpose cryptography.

TLS 1.3 connection throwing `Errno::ECONNRESET` #765

Open Abhishek-Bhatta opened 5 months ago

Abhishek-Bhatta commented 5 months ago

Issue

Hi, I am trying to run the following script in irb on Ruby 3.2.2. I am using OpenSSL 3.2.0. This web request works in Python and also in JRuby but is failing in CRuby. I believe it's a bug in the underlying C implementation of the openssl gem but I've hit my limit in terms of debugging it. Can anyone please investigate if it is indeed a bug or if it's a server-side implementation quirk?

require "net/http"
require "openssl"

begin
  url = "https://payments.cat.uk.pt-x.com/payments-service/api/security/handshake"
  uri = URI.parse(url)
  http = Net::HTTP.new(uri.host, uri.port)
  http.use_ssl = true
  http.min_version = OpenSSL::SSL::TLS1_3_VERSION
  http.max_version = OpenSSL::SSL::TLS1_3_VERSION
  http.verify_mode = OpenSSL::SSL::VERIFY_NONE # OpenSSL::SSL::VERIFY_PEER, OpenSSL::SSL::VERIFY_NONE
  resp = http.get(uri.request_uri)
rescue => exception
  puts exception.backtrace
  raise exception
end

This gives me

/Users/abhishek/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/openssl-3.2.0/lib/openssl/buffering.rb:211:in `sysread_nonblock': Connection reset by peer (Errno::ECONNRESET)
    from /Users/abhishek/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/openssl-3.2.0/lib/openssl/buffering.rb:211:in `read_nonblock'
    from /Users/abhishek/.asdf/installs/ruby/3.2.2/lib/ruby/3.2.0/net/protocol.rb:218:in `rbuf_fill'
    from /Users/abhishek/.asdf/installs/ruby/3.2.2/lib/ruby/3.2.0/net/protocol.rb:199:in `readuntil'
    from /Users/abhishek/.asdf/installs/ruby/3.2.2/lib/ruby/3.2.0/net/protocol.rb:209:in `readline'
    from /Users/abhishek/.asdf/installs/ruby/3.2.2/lib/ruby/3.2.0/net/http/response.rb:158:in `read_status_line'
    from /Users/abhishek/.asdf/installs/ruby/3.2.2/lib/ruby/3.2.0/net/http/response.rb:147:in `read_new'
    from /Users/abhishek/.asdf/installs/ruby/3.2.2/lib/ruby/3.2.0/net/http.rb:1862:in `block in transport_request'
    from /Users/abhishek/.asdf/installs/ruby/3.2.2/lib/ruby/3.2.0/net/http.rb:1853:in `catch'
    from /Users/abhishek/.asdf/installs/ruby/3.2.2/lib/ruby/3.2.0/net/http.rb:1853:in `transport_request'
    from /Users/abhishek/.asdf/installs/ruby/3.2.2/lib/ruby/3.2.0/net/http.rb:1826:in `request'
    from /Users/abhishek/.asdf/installs/ruby/3.2.2/lib/ruby/3.2.0/net/http.rb:1819:in `block in request'
    from /Users/abhishek/.asdf/installs/ruby/3.2.2/lib/ruby/3.2.0/net/http.rb:1238:in `start'
    from /Users/abhishek/.asdf/installs/ruby/3.2.2/lib/ruby/3.2.0/net/http.rb:1817:in `request'
    from /Users/abhishek/.asdf/installs/ruby/3.2.2/lib/ruby/3.2.0/net/http.rb:1575:in `get'
    from (irb):14:in `<main>'
    from /Users/abhishek/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/irb-1.6.2/exe/irb:11:in `<top (required)>'
    ... 2 levels...

on all my machines. Setting min and max version to TLS 1.2, however, works on all installed Ruby versions and all machines.

I checked the Ruby-level code of JRuby's openssl gem, and it seems to be a nearly identical copy. I tried using their buffering.rb in lieu of this one, but I ran into the same connection reset issue. If I were to hazard a guess, the behavior of sysread_nonblock in this gem differs from the JRuby implementation, and that accounts for the different results. I've also attached a pcap file of the call made through Ruby, showing the reset. I don't see any obvious issues, but I am no network expert.

(TIL GitHub doesn't like pcap so it's attached as a tar 🤷‍♂️) tls13github.tar.gz

rhenium commented 5 months ago

I reproduced the error myself. The server is actually sending an RST, aborting the TLS connection.

The server (payments.cat.uk.pt-x.com:443) doesn't fully support TLS 1.3 and seems to require the workaround mentioned in https://datatracker.ietf.org/doc/html/rfc8446#appendix-D.4

# This works
$ echo -en 'GET / HTTP/1.0\r\n\r\n' | openssl s_client -connect payments.cat.uk.pt-x.com:443 -servername payments.cat.uk.pt-x.com -ign_eof
[...]

# This doesn't
$ echo -en 'GET / HTTP/1.0\r\n\r\n' | openssl s_client -connect payments.cat.uk.pt-x.com:443 -servername payments.cat.uk.pt-x.com -ign_eof -no_middlebox
[...]
read:errno=104

Currently, you can enable the middlebox compatibility mode with OpenSSL::SSL::SSLContext#options=:

ssl_context.options |= OpenSSL::SSL::OP_ENABLE_MIDDLEBOX_COMPAT

rhenium commented 5 months ago

According to SSL_CTX_set_options(3), OpenSSL::SSL::OP_ENABLE_MIDDLEBOX_COMPAT is enabled by default.

However, it's not enabled in net/http because SSLContext#set_params overwrites SSL options with OpenSSL::SSL::OP_ALL & ~OpenSSL::SSL::OP_DONT_INSERT_EMPTY_FRAGMENTS | OpenSSL::SSL::OP_NO_COMPRESSION: https://github.com/ruby/openssl/blob/72d1be92edfbf5ad8b99bae61230e72694cc61bb/lib/openssl/ssl.rb#L148

IMO it should preserve options that are set by default/by the OpenSSL configuration file. Somewhat related: https://github.com/ruby/openssl/issues/709
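
To check which case a given installation falls into, a minimal sketch along these lines can report the flag's state on a fresh context and again after set_params has run. The helper name middlebox_compat_report is hypothetical, and the sketch assumes the gem is built against an OpenSSL that defines OP_ENABLE_MIDDLEBOX_COMPAT (OpenSSL 1.1.1 or later). The result depends on the gem and OpenSSL versions in use, so no particular output is claimed here.

```ruby
require "openssl"

# Hypothetical helper: reports whether the middlebox-compat bit is set on a
# freshly created context, and again after set_params has applied the gem's
# DEFAULT_PARAMS (which is what net/http triggers when use_ssl is true).
def middlebox_compat_report
  flag = OpenSSL::SSL::OP_ENABLE_MIDDLEBOX_COMPAT
  ctx = OpenSSL::SSL::SSLContext.new
  fresh = (ctx.options & flag) != 0
  ctx.set_params
  after = (ctx.options & flag) != 0
  { fresh_context: fresh, after_set_params: after }
end

p middlebox_compat_report
```

If after_set_params comes back false, contexts created through set_params on that installation are connecting without middlebox compatibility mode.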

Abhishek-Bhatta commented 5 months ago

Wow, many thanks for looking into this, I was really struggling. ❤️

Abhishek-Bhatta commented 5 months ago

Hey @rhenium, I had one more question. I ran this to disable middlebox compatibility, and it still works

openssl s_client -connect payments.cat.uk.pt-x.com:443 -tls1_3 -no_middlebox -strict

Does this mean the server properly supports TLS 1.3, since it works both with and without the middlebox flag? I feel the server's implementation of TLS 1.3 is correct. Is there some server-side change that would allow the existing openssl gem version to work with TLS 1.3? I compared 1.3 and 1.2, and it seems that in TLS 1.3 we don't get an SSL session ticket (I hope this is what it's called). Is my conclusion correct?

I did some more testing to see if there are any TLS 1.3 implementations that behave similarly and I found Google's API also behaves similarly. Here are some tests:

irb(main):001:0> RestClient.get "https://payments.cat.uk.pt-x.com/payments-service/api/security/handshake"
Traceback (most recent call last):
Errno::ECONNRESET (Connection reset by peer)
irb(main):002:0> RestClient.get "https://googleapis.com"
Traceback (most recent call last):
RestClient::NotFound (404 Not Found)

Here are the corresponding openssl tests

$ openssl s_client -connect googleapis.com:443 -tls1_3
...
---
No client certificate CA names sent
Peer signing digest: SHA256
Peer signature type: ECDSA
Server Temp Key: X25519, 253 bits
---
SSL handshake has read 9636 bytes and written 318 bytes
Verification: OK
---
New, TLSv1.3, Cipher is TLS_AES_256_GCM_SHA384
Server public key is 256 bit
Secure Renegotiation IS NOT supported
Compression: NONE
Expansion: NONE
No ALPN negotiated
Early data was not sent
Verify return code: 0 (ok)
---
DONE

and for the original host

$ openssl s_client -connect payments.cat.uk.pt-x.com:443 -tls1_3
...
---
No client certificate CA names sent
Peer signing digest: SHA256
Peer signature type: RSA-PSS
Server Temp Key: X25519, 253 bits
---
SSL handshake has read 4056 bytes and written 312 bytes
Verification: OK
---
New, TLSv1.3, Cipher is TLS_AES_128_GCM_SHA256
Server public key is 4096 bit
Secure Renegotiation IS NOT supported
Compression: NONE
Expansion: NONE
No ALPN negotiated
Early data was not sent
Verify return code: 0 (ok)
---
DONE

The only real differences here are the cipher and the size of the public key. I also ran openssl s_client -msg against both hosts to compare protocol messages, and I see a difference here:

Google

<<< TLS 1.3, Handshake [length 2481], Certificate
<<< TLS 1.3, Handshake [length 004e], CertificateVerify
<<< TLS 1.3, Handshake [length 0034], Finished

payments.cat.uk.pt-x.com

<<< ??? [length 0005]
<<< TLS 1.3 [length 0001]
<<< TLS 1.3, Handshake [length 0cc9], Certificate
<<< ??? [length 0005]
<<< TLS 1.3 [length 0001]
<<< TLS 1.3, Handshake [length 0208], CertificateVerify
<<< ??? [length 0005]
<<< TLS 1.3 [length 0001]
<<< TLS 1.3, Handshake [length 0024], Finished

Is this relevant somehow? This difference leads me to believe that the server may actually be misconfigured, but openssl managed to connect over a strict TLS 1.3 connection, which doesn't support that conclusion. Could there be some deeper bug in the openssl gem? We could probably test further with JRuby, but I do not know how to override OpenSSL options when creating a request via RestClient or net/http.

rhenium commented 4 months ago

> Hey @rhenium, I had one more question. I ran this to disable middlebox compatibility, and it still works
>
> openssl s_client -connect payments.cat.uk.pt-x.com:443 -tls1_3 -no_middlebox -strict

This doesn't work for me. As soon as I send some payload (such as "GET / HTTP/1.0\r\n\r\n"), the TCP connection is reset. This doesn't happen with the middlebox compatibility mode enabled, so it's likely the server (or some sort of load balancer/firewall in front of it) has an issue.

Currently you can manually set OpenSSL::SSL::OP_ENABLE_MIDDLEBOX_COMPAT on the OpenSSL::SSL::SSLContext instance that your HTTP client uses.
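
As a rough illustration of that workaround, the sketch below builds a TLS 1.3-only context, re-adds the flag after set_params, and talks to the server over a plain OpenSSL::SSL::SSLSocket rather than through an HTTP client. The helper names tls13_context_with_middlebox_compat and fetch_status_line are hypothetical, and how to hand such a context to RestClient or net/http is not covered here.

```ruby
require "openssl"
require "socket"

# Hypothetical helper: a TLS 1.3-only context with middlebox compatibility
# mode switched back on after set_params has applied the gem defaults.
def tls13_context_with_middlebox_compat
  ctx = OpenSSL::SSL::SSLContext.new
  ctx.set_params # gem defaults: VERIFY_PEER, default certificate store
  ctx.min_version = OpenSSL::SSL::TLS1_3_VERSION
  ctx.max_version = OpenSSL::SSL::TLS1_3_VERSION
  # Set the flag last so nothing above can clear it again.
  ctx.options |= OpenSSL::SSL::OP_ENABLE_MIDDLEBOX_COMPAT
  ctx
end

# Hypothetical helper: sends a bare HTTP/1.0 GET and returns the status line.
def fetch_status_line(host, port = 443, path = "/")
  tcp = TCPSocket.new(host, port)
  ssl = OpenSSL::SSL::SSLSocket.new(tcp, tls13_context_with_middlebox_compat)
  ssl.sync_close = true # closing the SSL socket also closes the TCP socket
  ssl.hostname = host   # SNI, also used for hostname verification
  ssl.connect
  ssl.write("GET #{path} HTTP/1.0\r\nHost: #{host}\r\n\r\n")
  ssl.gets
ensure
  ssl ? ssl.close : tcp&.close
end
```

Calling fetch_status_line("payments.cat.uk.pt-x.com", 443, "/payments-service/api/security/handshake") would exercise the same endpoint as the original script. Sending a payload matters, since the reset only shows up once application data is written.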

> I compared 1.3 and 1.2 and it seems that in TLS 1.3, we don't get a SSL session ticket (I hope this is what it's called).

This is completely normal with TLS 1.3, because the server may send a new session ticket at any time after the handshake rather than as part of it.

Abhishek-Bhatta commented 4 months ago

Amazing. I will ask the vendor to look into this. Thank you, I have learned a lot from this.