tokio-rs / tls

A collection of Tokio based TLS libraries.
https://tokio.rs
MIT License

TLS handshake stuck in weak network environment #120

Closed mokeyish closed 1 year ago

mokeyish commented 1 year ago

The original issue: https://github.com/bluejekyll/trust-dns/issues/1819

In that example, the handshake with cloudflare-dns.com gets stuck, while the handshake with dns.google succeeds. I can now reproduce this with plain rustls as well.

  1. Change the file https://github.com/rustls/rustls/blob/main/examples/src/bin/simpleclient.rs as follows:

use std::str::FromStr;
/// This is the simplest possible client using rustls that does something useful:
/// it accepts the default configuration, loads some root certs, and then connects
/// to google.com and issues a basic HTTP request. The response is printed to stdout.
///
/// It makes use of rustls::Stream to treat the underlying TLS connection as a basic
/// bi-directional stream -- the underlying IO is performed transparently.
///
/// Note that unwrap() is used to deal with networking errors; this is not something
/// that is sensible outside of example code.
use std::sync::Arc;
use std::fmt;
use std::convert::TryInto;
use std::net::{TcpStream, SocketAddr};

use rustls::{OwnedTrustAnchor, RootCertStore};

use tracing::{Event, Subscriber};
use tracing_subscriber::{
    fmt::{format, FmtContext, FormatEvent, FormatFields, FormattedFields},
    prelude::__tracing_subscriber_SubscriberExt,
    registry::LookupSpan,
    util::SubscriberInitExt,
};

fn main() {

logger();

// cloudflare handshake failed
let addr = SocketAddr::from_str("1.1.1.1:853").unwrap();
let domain = "cloudflare-dns.com".to_string();

// google handshake success
// let addr = SocketAddr::from_str("8.8.8.8:853").unwrap();
// let domain = "dns.google".to_string();

// -----------------------------------------------

let mut root_store = RootCertStore::empty();
root_store.add_server_trust_anchors(
    webpki_roots::TLS_SERVER_ROOTS
        .0
        .iter()
        .map(|ta| {
            OwnedTrustAnchor::from_subject_spki_name_constraints(
                ta.subject,
                ta.spki,
                ta.name_constraints,
            )
        }),
);
let config = rustls::ClientConfig::builder()
    .with_safe_defaults()
    .with_root_certificates(root_store)
    .with_no_client_auth();

let server_name = domain.as_str().try_into().unwrap();

let mut sock = TcpStream::connect(addr).unwrap();

// uncomment to make it handshake successful.
// std::thread::sleep(std::time::Duration::from_millis(1500));

let mut conn = rustls::ClientConnection::new(Arc::new(config), server_name).unwrap();

conn.complete_io(&mut sock).unwrap();

let is_handshaking = conn.is_handshaking();
let wants_write = conn.wants_write();
let wants_read = conn.wants_read();

println!("###################### is_handshaking:{}, wants_write:{}, wants_read:{}", is_handshaking, wants_write, wants_read);

}

fn logger() { // Setup tracing for logging based on input

let formatter = tracing_subscriber::fmt::layer().event_format(TdnsFormatter);

tracing_subscriber::registry().with(formatter).init();

}

struct TdnsFormatter;

impl<S, N> FormatEvent<S, N> for TdnsFormatter
where
    S: Subscriber + for<'a> LookupSpan<'a>,
    N: for<'a> FormatFields<'a> + 'static,
{
fn format_event(
    &self,
    ctx: &FmtContext<'_, S, N>,
    mut writer: format::Writer<'_>,
    event: &Event<'_>,
) -> fmt::Result {
    // Format values from the event's metadata:
    let metadata = event.metadata();
    write!(&mut writer, "{}:{}", metadata.level(), metadata.target())?;

    if let Some(line) = metadata.line() {
        write!(&mut writer, ":{}", line)?;
    }

    // Format all the spans in the event's span context.
    if let Some(scope) = ctx.event_scope() {
        for span in scope.from_root() {
            write!(writer, ":{}", span.name())?;

            let ext = span.extensions();
            let fields = &ext
                .get::<FormattedFields<N>>()
                .expect("will never be `None`");

            // Skip formatting the fields if the span had no fields.
            if !fields.is_empty() {
                write!(writer, "{{{}}}", fields)?;
            }
        }
    }

    // Write fields on the event
    write!(writer, ":")?;
    ctx.field_format().format_fields(writer.by_ref(), event)?;

    writeln!(writer)
}

}


2. Add these dependencies so we can see the handshake logs.

```toml
tracing = "0.1.30"
tracing-subscriber = { version = "0.3", features = ["std", "fmt", "env-filter"] }
```

3. It gets stuck here and never receives the ServerHello.

[image]

Actually, the openssl command shows that the handshake succeeds:

[image]

mokeyish commented 1 year ago

@djc Hi, I can also reproduce this with https://github.com/tokio-rs/tls, but I don't yet know where to add a delay there to make the handshake succeed. See the original issue: https://github.com/bluejekyll/trust-dns/issues/1819

My docker container can reproduce it.

djc commented 1 year ago

@mokeyish did you figure out your issue?

mokeyish commented 1 year ago

> @mokeyish did you figure out your issue?

@djc No, but I can also reproduce it with rustls.

Maybe it is a problem with the Rust standard library.

When I sleep for 1500 milliseconds after creating the std::net::TcpStream, the handshake also succeeds.

[image]

djc commented 1 year ago

That makes no sense. The rustls library doesn't do any I/O, and if the connect() is slow that likely still points towards a problem with tokio-rustls.

mokeyish commented 1 year ago

https://drive.google.com/drive/folders/1Cc-X_GsbhPZgq7F88vhYaDweop3oHZf4 Here are the packets I captured. Please use this filter: (ip.dst eq 1.1.1.1 and tcp.dstport eq 853) or (ip.src eq 1.1.1.1 and tcp.srcport eq 853)

The capture in success.pcapng succeeds because I added the 1500 millisecond sleep. The capture in stuck.pcapng comes from the same code with only the sleep commented out.
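
For anyone who wants to check which packets that display filter keeps, here is a minimal sketch of the same predicate in Rust. The function and constant names are made up for illustration; only the address 1.1.1.1 and port 853 come from the filter above.

```rust
use std::net::{IpAddr, Ipv4Addr, SocketAddr};

// Values taken from the display filter; the names are illustrative only.
const RESOLVER: IpAddr = IpAddr::V4(Ipv4Addr::new(1, 1, 1, 1));
const DOT_PORT: u16 = 853;

/// Returns true when a packet with these endpoints would pass the filter
/// (ip.dst eq 1.1.1.1 and tcp.dstport eq 853) or (ip.src eq 1.1.1.1 and tcp.srcport eq 853).
fn matches_capture_filter(src: SocketAddr, dst: SocketAddr) -> bool {
    let to_resolver = dst.ip() == RESOLVER && dst.port() == DOT_PORT;
    let from_resolver = src.ip() == RESOLVER && src.port() == DOT_PORT;
    to_resolver || from_resolver
}

fn main() {
    // Hypothetical client endpoint, chosen only for the demonstration.
    let client: SocketAddr = "192.168.1.10:50000".parse().unwrap();
    let resolver = SocketAddr::new(RESOLVER, DOT_PORT);

    println!("client -> resolver: {}", matches_capture_filter(client, resolver));
    println!("resolver -> client: {}", matches_capture_filter(resolver, client));
}
```

Both directions of the connection pass, so the ClientHello and any ServerHello end up in the same view.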

djc commented 1 year ago

I don't have time to study the pcaps, sorry. I recommend you dig into <MidHandshake<IS> as Future>::poll() and Stream<'a, IO, C>::handshake() to figure out the flow in the failing case.

mokeyish commented 1 year ago

> I don't have time to study the pcaps, sorry. I recommend you dig into <MidHandshake<IS> as Future>::poll() and Stream<'a, IO, C>::handshake() to figure out the flow in the failing case.

In Wireshark I can see the client sending the same data to the server in both cases, but without the sleep the server never sends a ServerHello, so the client gets stuck in read().
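
Since the hang is a blocking read() that never returns, one way to turn it into a visible failure is a read timeout on the socket. The following is only a sketch using std: `read_with_timeout` is a hypothetical helper, and a local listener that accepts but never replies stands in for the silent server. In the reproduction above, the equivalent would presumably be calling `set_read_timeout` on `sock` before `complete_io`.

```rust
use std::io::{ErrorKind, Read};
use std::net::{TcpListener, TcpStream};
use std::time::Duration;

/// Hypothetical helper: read once, but give up after `timeout`.
/// Returns Ok(Some(n)) when the read completed, Ok(None) when it timed out.
fn read_with_timeout(
    sock: &mut TcpStream,
    buf: &mut [u8],
    timeout: Duration,
) -> std::io::Result<Option<usize>> {
    sock.set_read_timeout(Some(timeout))?;
    match sock.read(buf) {
        Ok(n) => Ok(Some(n)),
        // A timeout is reported as WouldBlock on Unix and TimedOut on Windows.
        Err(e) if matches!(e.kind(), ErrorKind::WouldBlock | ErrorKind::TimedOut) => Ok(None),
        Err(e) => Err(e),
    }
}

fn main() -> std::io::Result<()> {
    // Local stand-in for a peer that accepts the connection but never answers.
    let listener = TcpListener::bind("127.0.0.1:0")?;
    let mut sock = TcpStream::connect(listener.local_addr()?)?;
    let (_peer, _) = listener.accept()?; // kept alive so the connection stays open

    let mut buf = [0u8; 16];
    let got = read_with_timeout(&mut sock, &mut buf, Duration::from_millis(200))?;
    println!("read result: {:?}", got); // prints "read result: None"
    Ok(())
}
```

This does not fix the missing ServerHello; it only makes the stuck case return an error instead of blocking forever.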

mokeyish commented 1 year ago

[image]

djc commented 1 year ago

These screenshots have different protocol versions for the ClientHello, so they're not the same code, right?

mokeyish commented 1 year ago

> These screenshots have different protocol versions for the ClientHello, so they're not the same code, right?

[image]

The packet data is exactly the same (except for some random values). The ClientHello is shown as TLSv1.3 because that handshake succeeded and TLSv1.3 was the version finally negotiated.

The successful run just adds a sleep after creating the TcpStream.

mokeyish commented 1 year ago

This is not a rustls bug; native-tls reproduces it as well.

native-tls reproduction code:


extern crate native_tls;

use native_tls::TlsConnector;
use std::io::{Read, Write};
use std::net::{TcpStream, SocketAddr};
use std::str::FromStr;

fn main() {
    println!("start ");
    let addr = SocketAddr::from_str("1.1.1.1:853").unwrap();
    let domain = "cloudflare-dns.com".to_string();

    let connector = TlsConnector::new().unwrap();

    let stream = TcpStream::connect(addr).unwrap();

    std::thread::sleep(std::time::Duration::from_millis(1500));

    let mut stream = connector.connect(domain.as_str(), stream).unwrap();

    stream.write_all(b"GET / HTTP/1.0\r\n\r\n").unwrap();
    let mut res = vec![];
    stream.read_to_end(&mut res).unwrap();
    println!("{}", String::from_utf8_lossy(&res));
}
mokeyish commented 1 year ago

Final solution: do not send the SNI in clear text.