sfackler / rust-native-tls

Apache License 2.0

TLS negotiation fails when `native_tls` on Windows tries to connect to an OpenSSL 3 server #242

Closed argv-minus-one closed 1 year ago

argv-minus-one commented 1 year ago

I want to use native-tls in a client-side program to talk to an OpenSSL server. For the most part, this works fine, except when the client runs on Windows and the server uses OpenSSL 3.

Under these circumstances, the native-tls-based client slowly allocates up to 4GB of memory, then hangs.

Here's a GitHub repository with a simple test program that demonstrates the problem. To use it:

  1. Make sure the `openssl` command is available and is OpenSSL 3 (check `openssl version`). On Windows, I've tried the FireDaemon binaries, which demonstrate the problem. (The test program works correctly with FireDaemon's OpenSSL 1.1.1 build; it only fails with OpenSSL 3.)
  2. Check out the linked GitHub repository and `cd` into it.
  3. Run `cargo run --bin openssl_server -- 127.0.0.1:4433`. This generates some certificates and then runs `openssl s_server`.
  4. Wait for the server to start. It will say `ACCEPT` when it's ready.
  5. In another terminal, run `cargo run --bin native_tls_client -- 127.0.0.1:4433`. This starts a native-tls client that tries to connect to the server and send it an HTTP request. If it succeeds, it writes the response to stdout.
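The version check in step 1 can also be done programmatically. The following is a minimal sketch and is not part of the linked test repository: `openssl_major` is a hypothetical helper, and it assumes that `openssl version` prints a line starting with `OpenSSL <major>.<minor>...`.

```rust
use std::process::Command;

/// Hypothetical helper: extract the major version from `openssl version`
/// output such as "OpenSSL 3.0.7 1 Nov 2022 (Library: OpenSSL 3.0.7 1 Nov 2022)".
/// Returns None if the output does not start with the word "OpenSSL".
fn openssl_major(output: &str) -> Option<u32> {
    let mut words = output.split_whitespace();
    if words.next()? != "OpenSSL" {
        return None;
    }
    // The second word is the version string; keep only the part before the first dot.
    words.next()?.split('.').next()?.parse().ok()
}

fn main() {
    match Command::new("openssl").arg("version").output() {
        Ok(out) => {
            let text = String::from_utf8_lossy(&out.stdout);
            match openssl_major(&text) {
                Some(3) => println!("OpenSSL 3 found: matches the failing setup"),
                Some(major) => println!("OpenSSL major version {major}: not the failing setup"),
                None => println!("unrecognized output: {}", text.trim()),
            }
        }
        Err(err) => eprintln!("could not run `openssl version`: {err}"),
    }
}
```

Keeping the parsing in its own function means it can be checked without an `openssl` binary on the PATH.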

If you do this on Linux, it works fine. If you do this on Windows and the server uses OpenSSL 1.1.1, it works fine. If you do this on Windows and the server uses OpenSSL 3, the client hangs.

I noticed that the curl build that comes with Windows, which I think also uses the SChannel API, works fine even if the server is running OpenSSL 3.

I ran this in a debugger and found that the 4GB memory allocation is being done by the schannel crate. Somehow, when schannel::tls_stream::TlsStream::step_initialize calls InitializeSecurityContextW, the latter function sets inbufs[1].cbBuffer to just under u32::MAX (in my debugger right now, it's 4294966277). TlsStream::read_in then allocates and initializes that much memory, which on my machine takes a minute or two, and then hangs trying to read that many bytes from the socket.
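One detail about the observed value: 4294966277 is 2^32 - 1019, so read as a signed 32-bit integer it is -1019. That looks more like an underflowed length than a real size. The sketch below shows that reinterpretation and a defensive bound check that could reject such a value before allocating. It is an illustration only, not the schannel crate's code: `checked_needed_bytes` and `MAX_HANDSHAKE_READ` are hypothetical names, and the 64 KiB limit is an assumption based on a TLS record carrying at most 2^14 bytes of plaintext plus bounded overhead.

```rust
/// Hypothetical upper bound on how many more bytes a handshake step may request.
/// 64 KiB is a generous assumption, not a value taken from schannel.
const MAX_HANDSHAKE_READ: usize = 64 * 1024;

/// Hypothetical check: validate the byte count reported by the security
/// provider before allocating a buffer of that size.
fn checked_needed_bytes(cb_buffer: u32) -> Result<usize, String> {
    let needed = cb_buffer as usize;
    if needed > MAX_HANDSHAKE_READ {
        return Err(format!(
            "implausible read size {} (as i32: {})",
            cb_buffer, cb_buffer as i32
        ));
    }
    Ok(needed)
}

fn main() {
    // The value seen in the debugger for inbufs[1].cbBuffer.
    let observed: u32 = 4_294_966_277;

    // Reinterpreting the same 32 bits as a signed integer.
    println!("{}", observed as i32); // prints -1019

    match checked_needed_bytes(observed) {
        Ok(n) => println!("would read {n} more bytes"),
        Err(msg) => println!("rejected: {msg}"),
    }
}
```

With a check like this, the client would fail fast with an error instead of allocating about 4GB and blocking on the socket.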

So, maybe the issue is with the schannel crate rather than native-tls, but I'm not sure, so I thought I'd post an issue here first and see what you think.

sfackler commented 1 year ago

Yeah, that problem sounds like it's coming from the schannel crate in particular.

argv-minus-one commented 1 year ago

As of the Windows 2022-11 cumulative update, this bug seems to have disappeared. The test program now works correctly on both of the Windows 10 machines I've tried running it on. Looks like Microsoft fixed it, so I'll go ahead and close this.