quininer / tokio-rustls

Asynchronous TLS/SSL streams for Tokio using Rustls.

EarlyData and multiple writes hangs connection #55

Closed daareiza closed 4 years ago

daareiza commented 4 years ago

Hello guys,

I was trying to use trust-dns-client with the trust-dns-rustls library to query some DNS servers that support TLS, especially one that has 0-RTT support.

When I tried to resolve some DNS queries, everything worked fine, but only for the first query, where the early data flag is not yet available for the connection. Once it becomes available (from the second query on), the client just hangs and never receives a response from the server.

I then tracked down the issue and found that trust-dns-rustls uses this library to wrap all the rustls stuff, and when early data is available/enabled it just stops resolving queries after the first one.

I am very new to Rust, so I had a really hard time figuring out what was happening (especially because of the tokio polling stuff), but in the end I added this line and everything started working as expected. I am not 100% sure about that fix, but for now it works for me.

So, the issue happens when the DNS client tries to send multiple messages in a single packet (early data), so the poll_write function is called multiple times (or that is what I think). The message is in fact composed of: (after Client Hello) -> Change Cipher Spec + Application Data + Application Data.
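
A small sketch of why one DNS query can turn into two poll_write calls. It assumes the client uses the usual DNS-over-TCP/TLS framing, a 2-byte big-endian length prefix followed by the message, which is consistent with the `[0, 47]` bytes and the 47-byte write in the traces. The function name `frame_dns_message` and the all-zero payload are hypothetical, for illustration only; this is not code from trust-dns or tokio-rustls.

```rust
/// Split a DNS message into the two pieces that would be written separately:
/// a 2-byte big-endian length prefix, then the message itself.
/// Returns None if the message does not fit in a 16-bit length.
fn frame_dns_message(msg: &[u8]) -> Option<([u8; 2], &[u8])> {
    if msg.len() > u16::MAX as usize {
        return None;
    }
    let len = msg.len() as u16;
    Some((len.to_be_bytes(), msg))
}

fn main() {
    // Placeholder payload with the same length as the query in the traces.
    let query = [0u8; 47];
    let (prefix, body) = frame_dns_message(&query).expect("message too long");

    // First write: the length prefix; second write: the query bytes.
    println!("write 1: {:?} ({} bytes)", prefix, prefix.len());
    println!("write 2: {} bytes", body.len());
}
```

If the framing assumption holds, the first write carries 2 bytes and the second carries 47, which would line up with the `buf len: 2` and `buf len: 47` entries logged in the EarlyData state.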

In case it helps, this is the flow from the tokio-rustls client for two DNS requests; it could help with further debugging:

Normal TLS flow (early data disabled)

POLL
POLL READ, State: Stream
POLL WRITE, State: Stream
POLL WRITE, State: Stream
POLL FLUSH, state: Stream
POLL FLUSH, new state: Stream
POLL READ, State: Stream
POLL READ, State: Stream
POLL READ, State: Stream
POLL READ, State: Stream
POLL READ, State: Stream

POLL
POLL READ, State: Stream
POLL WRITE, State: Stream
POLL WRITE, State: Stream
POLL FLUSH, state: Stream
POLL FLUSH, new state: Stream
POLL READ, State: Stream
POLL READ, State: Stream
POLL READ, State: Stream
POLL READ, State: Stream
POLL READ, State: Stream

Failed TLS flow (early data enabled)

POLL
POLL READ, State: Stream
POLL WRITE, State: Stream
POLL WRITE, State: Stream
POLL FLUSH, state: Stream
POLL FLUSH, new state: Stream
POLL READ, State: Stream
POLL READ, State: Stream
POLL READ, State: Stream
POLL READ, State: Stream
POLL READ, State: Stream

POLL
POLL READ, State: EarlyData
POLL WRITE, State: EarlyData
    ED 1
    ED 1a, pos: 0, data: []
    ED 2, buf len: 2
    ED 2b, len: 2, buf len: 2
POLL WRITE, State: EarlyData
    ED 1
    ED 1a, pos: 0, data: [0, 47]
    ED 2, buf len: 47
    ED 2b, len: 47, buf len: 47
POLL FLUSH, state: EarlyData
POLL FLUSH, state: EarlyData
POLL FLUSH, state: EarlyData
POLL FLUSH, new state: EarlyData
POLL READ, State: EarlyData
POLL READ, State: EarlyData

-- HANGS, just continues until the server drops the connection --

POLL READ, State: EarlyData

Fixed TLS flow (early data enabled)

POLL
POLL READ, State: Stream
POLL WRITE, State: Stream
POLL WRITE, State: Stream
POLL FLUSH, state: Stream
POLL FLUSH, new state: Stream
POLL READ, State: Stream
POLL READ, State: Stream
POLL READ, State: Stream
POLL READ, State: Stream
POLL READ, State: Stream
POLL READ, State: Stream

POLL
POLL READ, State: EarlyData
POLL WRITE, State: EarlyData
    ED 1
    ED 1a, pos: 0, data: []
    ED 2, buf len: 2
    ED 2b, len: 2, buf len: 2
POLL WRITE, State: EarlyData
    ED 1
    ED 1a, pos: 0, data: [0, 47]
    ED 2, buf len: 47
    ED 2b, len: 47, buf len: 47
POLL FLUSH, state: EarlyData
POLL FLUSH, state: EarlyData
POLL FLUSH, state: EarlyData
POLL FLUSH, new state: Stream
POLL READ, State: Stream
POLL READ, State: Stream
POLL READ, State: Stream
POLL READ, State: Stream

-- Success!!!
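
The difference between the failed and fixed traces can be summed up with a toy state machine. This is only a model of what the logs show (flush either stays in EarlyData or moves to Stream, and reads never complete while in EarlyData). It is not the tokio-rustls implementation and not the actual fix; `ToyStream`, `leave_early_data_on_flush`, `read_ready`, and `run` are hypothetical names.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum State {
    EarlyData,
    Stream,
}

/// Toy stand-in for a TLS client stream; not the tokio-rustls type.
struct ToyStream {
    state: State,
    /// Hypothetical switch: does flush move the stream out of EarlyData?
    leave_early_data_on_flush: bool,
    /// Bytes accepted while still in the EarlyData state.
    early_buf: Vec<u8>,
}

impl ToyStream {
    fn new(leave_early_data_on_flush: bool) -> Self {
        ToyStream {
            state: State::EarlyData,
            leave_early_data_on_flush,
            early_buf: Vec::new(),
        }
    }

    /// Accept a write; in EarlyData the bytes are only buffered.
    fn write(&mut self, buf: &[u8]) -> usize {
        if self.state == State::EarlyData {
            self.early_buf.extend_from_slice(buf);
        }
        buf.len()
    }

    /// Flush; the state only changes when the switch is on.
    fn flush(&mut self) {
        if self.leave_early_data_on_flush {
            self.state = State::Stream;
        }
    }

    /// In this model a read can only make progress in the Stream state.
    fn read_ready(&self) -> bool {
        self.state == State::Stream
    }
}

/// Replay the second request from the traces: two writes, then a flush.
fn run(leave_early_data_on_flush: bool) -> ToyStream {
    let mut s = ToyStream::new(leave_early_data_on_flush);
    s.write(&[0, 47]);
    s.write(&[0u8; 47]);
    s.flush();
    s
}

fn main() {
    for fixed in [false, true] {
        let s = run(fixed);
        println!(
            "switch on: {}, state after flush: {:?}, read can progress: {}, early bytes: {}",
            fixed,
            s.state,
            s.read_ready(),
            s.early_buf.len()
        );
    }
}
```

With the switch off, the model stays in EarlyData after the flush and every later read is pending, like the failed trace. With the switch on, it reaches Stream and reads can proceed, like the fixed trace.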

quininer commented 4 years ago

Thank you for your report! I found this problem some time ago. If the 0-RTT message is too short, this may cause the read to stay pending forever.

Since openssl s_server does not support using -early_data and -www at the same time, the tests failed to cover this use case.

I believe it has been fixed by https://github.com/quininer/tokio-rustls/commit/872510bd65949afff0c76b9218c0c8db8263e7d5 and https://github.com/quininer/tokio-rustls/commit/ba909ed95ea7352cba17ce0493d3aff0253e609f.

daareiza commented 4 years ago

@quininer thanks man!!! It seems to work on my side, no more hangs now.