JuliaLang / julia

The Julia Programming Language
https://julialang.org/
MIT License

TCP [RST] intermittently ignored #25314

Closed samoconnor closed 7 months ago

samoconnor commented 6 years ago

Most of the time, when a TCPSocket receives a [RST] packet, libuv calls uv_readcb() and a UVError, ECONNRESET is thrown.
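For readers who want to see the expected behaviour in isolation, here is a minimal sketch in Python (not Julia, and not using libuv) of a reader observing ECONNRESET when the peer sends [RST]. The function names are hypothetical, and the sketch assumes a POSIX-style `struct linger` layout of two ints (Windows uses a different layout).

```python
import socket
import struct
import threading


def reset_on_accept(listener):
    """Accept one connection and close it abortively so the peer gets [RST]."""
    conn, _ = listener.accept()
    # SO_LINGER with l_onoff=1, l_linger=0 makes close() send RST instead of FIN.
    # Assumption: POSIX layout of struct linger (two ints).
    conn.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER, struct.pack("ii", 1, 0))
    conn.close()


def read_until_reset():
    """Connect to a local server that resets the connection and report what recv() sees."""
    listener = socket.create_server(("127.0.0.1", 0))
    port = listener.getsockname()[1]
    server_thread = threading.Thread(target=reset_on_accept, args=(listener,))
    server_thread.start()
    with socket.create_connection(("127.0.0.1", port)) as client:
        client.settimeout(5)  # never block forever in this demo
        try:
            data = client.recv(4096)
            return "data" if data else "eof"
        except ConnectionResetError:
            # The kernel reported the [RST] to the blocked reader.
            return "ECONNRESET"
        finally:
            server_thread.join()
            listener.close()


if __name__ == "__main__":
    print(read_until_reset())
```

This only shows the normal path, where the reset is delivered to the reader; it does not reproduce the missed [RST] described below.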

I have a test case where hundreds of pipelined HTTP PUT Requests are sent to AWS S3. Typically the Requests get ahead of the Responses (e.g. when Request No. 70 is being sent, we may only be up to reading Response No. 10). At some point the S3 server hits an internal limit on the number of Requests per connection (about 100) and stops sending Response data (e.g. we might send Request No. 120 and then, while we're reading Response No. 30, data stops arriving). Sometimes the server sends [RST] right away and a UVError, ECONNRESET is thrown as expected. Note: the S3 doc suggests not sending more than 90 requests per connection. I'm sending more than that as a way to test corner-case behaviour in HTTP.jl.

However, monitoring with Wireshark shows that sometimes the [RST] is not sent for a few minutes. It seems that in this case libuv does not notice the [RST], and uv_readcb is not called. The result is that the eof() call that the reader is waiting on blocks forever. I have a separate task that periodically prints connection debug info. This shows that the LibuvStream.state remains StatusActive.

I have tried putting lots of printfs in libuv. What I see is that the uv__stream_io function is not called at all in the case where the [RST] is missed. Maybe there is a race condition inside libuv where the [RST] is missed if kevent is not active when it arrives? Maybe for some reason libuv forgets to submit the socket to kevent, or does not indicate interest in the correct event type? (I'm not familiar with kqueue.)

I have tried modifying wait_readnb so that it wakes up and does uv_read_start again every so often while waiting. This makes no difference.

As a practical solution for HTTP.jl I've implemented a Retry Layer that uses a separate task to close stuck connections. Calling close results in the blocked eof() task waking up, discovering the connection is gone, and retrying the Request.
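The watchdog idea can be sketched roughly as follows. This is an illustration in Python's asyncio, not the HTTP.jl Retry Layer; all names (`silent_server`, `watchdog`, `read_response`, the `state` dict, the 0.2 s timeout) are hypothetical, and the retry step itself is left out.

```python
import asyncio


async def silent_server(reader, writer):
    # Hypothetical stand-in for a stalled peer: accept, then never send anything.
    try:
        await reader.read()  # returns once the client goes away
    finally:
        writer.close()


async def watchdog(writer, idle_timeout, state):
    # Separate task: close the connection if no bytes arrived for idle_timeout seconds.
    loop = asyncio.get_running_loop()
    while not writer.is_closing():
        await asyncio.sleep(idle_timeout / 4)
        if loop.time() - state["last_read"] > idle_timeout:
            state["closed_by_watchdog"] = True
            writer.close()  # wakes the blocked read() below with EOF


async def read_response(host, port, idle_timeout=0.2):
    reader, writer = await asyncio.open_connection(host, port)
    loop = asyncio.get_running_loop()
    state = {"last_read": loop.time(), "closed_by_watchdog": False}
    dog = asyncio.create_task(watchdog(writer, idle_timeout, state))
    try:
        while True:
            chunk = await reader.read(4096)  # blocks while the peer is silent
            if not chunk:
                break  # EOF: either a real close or the watchdog's close
            state["last_read"] = loop.time()
    finally:
        dog.cancel()
        await asyncio.gather(dog, return_exceptions=True)
        writer.close()
    # A caller could retry the request on a new connection when this is "stuck".
    return "stuck" if state["closed_by_watchdog"] else "eof"


async def demo():
    server = await asyncio.start_server(silent_server, "127.0.0.1", 0)
    port = server.sockets[0].getsockname()[1]
    async with server:
        return await read_response("127.0.0.1", port)


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

The point of the design is that the reader never needs its own timeout logic: it just sees EOF, and a flag set by the watchdog tells it whether that EOF means "retry".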

Version 0.7.0-DEV.3090 (2017-12-18 19:26 UTC)
Commit 5abe9b1382* (10 days old master)
x86_64-apple-darwin14.5.0

samoconnor commented 6 years ago

This issue (and this one: https://github.com/JuliaLang/julia/issues/14747) makes me wonder if libuv is the best way to implement socket IO in Julia.

Microsoft now has their own implementation of epoll, AF_INET, AF_UNIX and AF_NETLINK. I believe that this is not directly accessible from a Windows .exe or .dll, but perhaps there is some way around this.

Perhaps it would be better for Julia's network IO layer to be built on BSD sockets + epoll/kevent and use something like WSL to provide compatibility with Windows.

It is frustrating to spend time figuring out what libuv is doing when debugging Julia IO stuff. The libuv documentation is thin and often says "See the linux man page for more". It often feels like it would be easier to work directly with the well-defined BSD/Linux APIs that I know.

Moving the main event loop from libuv to Julia might also help with other event related stuff: #22631 #13763.

vtjnash commented 6 years ago

> Microsoft now has their own implementation of epoll, AF_INET, AF_UNIX and AF_NETLINK. I believe that this is not directly accessible from a windows .exe or .dll but, perhaps there is some way around this.

They've had it for years; it's used by libuv. WSL is entirely tangential to this; what matters is whether the underlying driver in use is WSK. The old API (which did not support epoll) was deprecated in Windows 7, although I know of at least one corporate firewall that tries to prevent user programs from accessing the new subsystem (as of a couple of years ago, when I last checked).

samoconnor commented 6 years ago

WSK implements epoll? Or do you mean that IoCompletionPort is similar to epoll?

vtjnash commented 6 years ago

Usually it just uses IOCP, since presumably that's faster. But it looks like the original PoC repo has even been getting new updates recently: https://github.com/piscisaureus/wepoll

vtjnash commented 3 years ago

Can you provide any updated information here?