Closed samoconnor closed 6 years ago
Thankyou @quinnj for reviewing #175 and #167 so quickly! 🙏
Yeah, I started to look over #177 and #178 , but I'll do a more in-depth review tonight (6 hours or so)
I've been running MbedTLS + HTTP master in a dev environment for a couple hours now and it's been looking good. I did see this error/backtrace on (I believe) the first https request that was made:
2018-10-16T18:21:34.672023325Z ┌ Warning: Error in @async writebody task
2018-10-16T18:21:34.672084004Z │ exception =
2018-10-16T18:21:34.672112144Z │ ArgumentError: `unsafe_write` requires `iswritable(::SSLContext)`
2018-10-16T18:21:34.672123265Z │ Stacktrace:
2018-10-16T18:21:34.672132706Z │ [1] ssl_unsafe_write(::MbedTLS.SSLContext, ::Ptr{UInt8}, ::UInt64) at /home/domorun/.julia/packages/MbedTLS/cXN47/src/ssl.jl:194
2018-10-16T18:21:34.672142684Z │ [2] unsafe_write at /home/domorun/.julia/packages/MbedTLS/cXN47/src/ssl.jl:371 [inlined]
2018-10-16T18:21:34.672161296Z │ [3] unsafe_write(::HTTP.Streams.Stream{HTTP.Messages.Response,HTTP.ConnectionPool.Transaction{MbedTLS.SSLContext}}, ::Ptr{UInt8}, ::UInt64) at /home/domorun/.julia/packages/HTTP/nV2Zw/src/Streams.jl:96
2018-10-16T18:21:34.672171419Z │ [4] unsafe_write at ./io.jl:509 [inlined]
2018-10-16T18:21:34.672180571Z │ [5] macro expansion at ./gcutils.jl:87 [inlined]
2018-10-16T18:21:34.672196736Z │ [6] write at ./io.jl:532 [inlined]
2018-10-16T18:21:34.672206384Z │ [7] writebody(::HTTP.Streams.Stream{HTTP.Messages.Response,HTTP.ConnectionPool.Transaction{MbedTLS.SSLContext}}, ::HTTP.Messages.Request, ::Array{UInt8,1}) at /home/domorun/.julia/packages/HTTP/nV2Zw/src/StreamRequest.jl:100
2018-10-16T18:21:34.672220741Z │ [8] (::getfield(HTTP.StreamRequest, Symbol("##2#3")){HTTP.ConnectionPool.Transaction{MbedTLS.SSLContext},HTTP.Messages.Request,Array{UInt8,1}})() at ./task.jl:259
2018-10-16T18:21:34.672230712Z └ @ HTTP.StreamRequest ~/.julia/packages/HTTP/nV2Zw/src/StreamRequest.jl:60
Not sure if we're maybe running into some kind of first-run-compilation race condition or something? Or somehow compiling the "handshake" stack of functions somehow affects the actual handshake differently than it used to?
If we got up to writebody
then we must have been through startwrite
(headers) without issue.
Can you try adding this to HTTP.jl/src/IOExtras.jl ...
isioerror(e::ArgumentError) = e.msg == "stream is closed or unusable"
isioerror(e::ArgumentError) = e.msg == "`unsafe_write` requires `iswritable(::SSLContext)`"
... or perhaps this would be better:
--- a/src/ssl.jl
+++ b/src/ssl.jl
@@ -191,6 +191,8 @@ See `f_send` and `set_bio!` below.
"""
function ssl_unsafe_write(ctx::SSLContext, buf::Ptr{UInt8}, nbytes::UInt)
+ Base.check_open(ctx.bio)
+
iswritable(ctx) ||
throw(ArgumentError("`unsafe_write` requires `iswritable(::SSLContext)`"))
That way we should get the old stream is closed or unusable
message (and also be in sync with https://github.com/JuliaLang/julia/pull/29274).
The the "requires iswritable(::SSLContext)
" is still useful in the case where check_open
finds that the connection is open. In this case iswritable = false
means that someone explicitly called close
on the SSLContext
.
I've been running MbedTLS + HTTP master in a dev environment for a couple hours now and it's been looking good.
That is good news ! 🤖😬✅ 🎉
Base.check_open(ctx.bio)
added ...
master: https://github.com/JuliaWeb/MbedTLS.jl/commit/12e0100084d0bf0e59d78562dd8409ee2efbaa68
0.6: https://github.com/JuliaWeb/MbedTLS.jl/commit/76036da91ac03b7429c55f4bb1d8e5ba468b35cc
Fixed 1.0 compat on master: https://github.com/JuliaWeb/MbedTLS.jl/commit/0904bde166b5675a93a148d32b52ce2331f65b4e
@janfrancu reports that this resolves an issue with AWSS3.jl: https://github.com/JuliaWeb/HTTP.jl/issues/307#issuecomment-430987933
Ready to tag @quinnj ?
Yeah; with the testing I've been doing, I seem to see more IOErrors than before, but I believe these were cases that were probably getting lost/swallowed before.
We have had an issue of intermittent hang-ups when doing large volumes of AWS API calls through AWSCore and HTTP.jl.
One manifestation of the problem occurred after several HTTP Transactions with non-trivial size Response Body had already occurred on a connection. The server would send a final 5-byte
0\r\n\r\n
end-of-chunked-body marker. A tiny packet compared to the previous steady stream of body packets. Somehow or other this last packet was never seen by the TLS layer. They HTTP layer was waiting ineof(::TCPSocket)
but never waking up. It is possible that a Delayed ACK / Naggle interaction caused this last backed to be delayed a bit (Linux client).Somehow this timing difference seems to have resulted in a race-condition whereby eof() goes to sleep even though there is data waiting. Setting up an async task to trigger when the hang-up had gone on for 2 seconds allowed us to see that the data was, at that point available in the TLS buffer, event though right before going into eof() it was not there.
A while, back we added this
Base.start_reading
band-aid to try to keep libuv watching the socket so that we don't miss TLS close_notify messages when no one is reading from the connection: https://github.com/JuliaWeb/MbedTLS.jl/blob/7ee7901442c5d9d3b80088244977033a5f7a4ff9/src/ssl.jl#L186-L202It seems likely that some interaction between the
isopen(ctx)
call above and the going-into-eof code is causing the problem.There are a few bad smells around. First, the original design of the library didn't really account for OOB messages like TLS close_notify (and other internal TLS messages that client and server may send to each other). This is why we ended up with the
@async while isopen(ctx)
loop above, and is also the reason for some convoluted code to do zero-length reads (which pump the TLS library's internal state) before doing bytes available queries. Another aspect of this that seems not quite is the fact thateof(::SSLContext)
callseof(::TCPSocket)
directly. It all feels at this point like a bunch of work-arounds (many of them implemented by me in haste) for an initial assumption that there would be more of a 1-1 correspondence between encrypted data arriving and decrypted data being ready to read.PRs to follow...