swindon-rs / tk-http

Full featured HTTP and Websockets library for rust/tokio
Apache License 2.0
136 stars 10 forks source link

Windows 10: Connection error: I/O error: An established connection was aborted by the software in your host machine. #25

Open radix opened 7 years ago

radix commented 7 years ago

I am testing out using tk-http instead of Hyper, and I noticed I can quite easily get my code to crash with some low-level looking network error by holding down the F5 key in Firefox. I haven't been able to trigger an error like this with Hyper.

I'm running this code on Windows 10 with the MSVC toolchain.

To reproduce, checkout this revision of my pandt project. Then do the following in the checkout:

$ cd ptrpi
$ cargo run --bin tkrpi 1337 dd.yaml

Then load up http://localhost:1337/ in your browser and spam it until it crashes, which only takes a few seconds for me.

Here's a link to quickly see my mainloop: https://github.com/radix/pandt/commit/254dbed564318f1c83036b14b36239fee834361a#diff-23e1c8400fe6268f1a67705b9c60cde2R84

Connection error: I/O error: An established connection was aborted by the software in your host machine. (os error 10053)
thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: ()', src\libcore\result.rs:860
stack backtrace:
   0:     0x7ff75d7cd847 - std::panicking::default_hook::{{closure}}
                        at C:\projects\rust\src\libstd\panicking.rs:356
   1:     0x7ff75d7ccd74 - std::panicking::default_hook
                        at C:\projects\rust\src\libstd\panicking.rs:367
   2:     0x7ff75d7d0691 - std::panicking::rust_panic_with_hook
                        at C:\projects\rust\src\libstd\panicking.rs:545
   3:     0x7ff75d7d0548 - std::panicking::begin_panic<collections::string::String>
                        at C:\projects\rust\src\libstd\panicking.rs:507
   4:     0x7ff75d7d0464 - std::panicking::begin_panic_fmt
                        at C:\projects\rust\src\libstd\panicking.rs:491
   5:     0x7ff75d7d03f9 - std::panicking::rust_begin_panic
                        at C:\projects\rust\src\libstd\panicking.rs:467
   6:     0x7ff75d7d93a7 - core::panicking::panic_fmt
                        at C:\projects\rust\src\libcore\panicking.rs:69
   7:     0x7ff75d5f0879 - core::result::unwrap_failed<()>
                        at C:\projects\rust\src\libcore\macros.rs:29
   8:     0x7ff75d5c9a99 - core::result::Result<(), ()>::unwrap<(),()>
                        at C:\projects\rust\src\libcore\result.rs:737
   9:     0x7ff75d6a9e93 - tkrpi::main
                        at C:\Users\radix\Projects\pandt\ptrpi\src\bin\tkrpi.rs:93
  10:     0x7ff75d7d1481 - panic_unwind::__rust_maybe_catch_panic
                        at C:\projects\rust\src\libpanic_unwind\lib.rs:98
  11:     0x7ff75d7d096a - std::rt::lang_start
                        at C:\projects\rust\src\libstd\rt.rs:51
  12:     0x7ff75d6abc6b - main
  13:     0x7ff75d7dfea8 - __scrt_common_main_seh
                        at f:\dd\vctools\crt\vcstartup\src\startup\exe_common.inl:253
  14:     0x7ff90d36451c - BaseThreadInitThunk
error: process didn't exit successfully: `C:\Users\radix\Projects\pandt\target\debug\tkrpi.exe 1337 .\dd.yaml` (exit code: 101)
tailhook commented 7 years ago

Well, it's not the problem with tk-http per se. But with the example code that does accepting connections. buffer_unordered exists on first error of the future. Which effectively means your main loop exits with error on first error of the accept system call.

My real-life code for handling accept is along the lines of:

    core.run(listener.incoming()
        // we need stream that doesn't fail on error
        .then(move |item| match item {
            Ok(x) => Either::A(ok(Some(x))),
            Err(e) => {
                warn!("Error accepting: {}", e);
                /// pause stream
                Either::B(Timeout::new(listen_error_pause, &handle).unwrap()
                    .and_then(|()| ok(None)))
            }
        })
        .filter_map(|x| x)  /// filter out None produced by code dooing pause on error
        .map(move |(socket, saddr)| {
             Proto::new(socket, &hcfg, Handler::new(saddr, w2.clone()), &h1)
             // always succeed
             .then(|_| Ok(()))
        })
        .buffer_unordered(max_connections)
        .for_each(move |()| Ok(()))
        // `shutter` is a oneshot that we use to force-close the listener
        .select(shutter.then(move |_| Ok(())))
        // these basically execute only when shutter receives a message
        .map(move |(_, _)| info!("Listener {} exited", addr))
        .map_err(move |(_, _)| info!("Listener {} exited", addr))
    );

Probably you can also do whatever hyper does for listening, i.e. for_each instead of buffer_unordered and spawn for each connection. But the downside of the latter approach is that you have number of connections limited by OS limits rather than by your own limit. I.e. your server can exhaust number of incoming file descriptiors and your app will be unable to establish connections to backend, db, or even open some file.

(I might have to update the example...)

tailhook commented 7 years ago

Okay, so this piece of code seems to be quite large and useful not only for HTTP. I'm thinking to either put it into tk-easyloop or just create another crate like tk-listen (for client we have tk-pool likewise we can factor it out for server too). What do you think?

radix commented 7 years ago

@tailhook shouldn't that kind of logic be in tokio proto?

tailhook commented 7 years ago

Well, we don't use tokio-proto for protocol parsing for multiple reasons. So depending on it just for listen function doesn't look like a good idea. Also, the problem isn't solved in current tokio-proto too. They use for_each and spawn (so no limit). So tokio-proto might also benefit from separate library doing listening correctly.

radix commented 7 years ago

I see, sorry! I'm not yet familiar with the tokio ecosystem :)

tailhook commented 7 years ago

Wow! While refactoring this example to a separate library I've found out that this code is a little bit wrong. When listening there are few errors that require sleep and repeat (namely ENFILE and EMFILE), but there are also few errors that should not incur delay (namely ECONNREFUSED), because this means some bad-behaving user might use it to block other connections.

I'll do my best to publish correct code ASAP.

tailhook commented 7 years ago

So, I've published tk-listen crate. Probably, I'll update examples here to use the crate.

radix commented 7 years ago

Good stuff! I do hope that the larger Tokio community picks up on your crate, or maybe even integrates it into some other crate.