compio-rs / compio

A thread-per-core Rust runtime with IOCP/io_uring/polling.
MIT License

Fix eagain on linux #43

Closed DXist closed 1 year ago

DXist commented 1 year ago

I've found the following patch - https://lore.kernel.org/all/a545c1ae-02a7-e7f1-5199-5cd67a52bb1e@kernel.dk/T/#mc720f37164873a8853bb1953515329c242a89a3b

Using a blocking socket allows the tests to pass on Linux 5.15.

Berrysoft commented 1 year ago

Which bug does this PR solve? Could you provide an MWE?

I cannot reproduce any errors when running the tests on kernel 5.15.

$ cat /proc/version
Linux version 5.15.90.1-microsoft-standard-WSL2 (oe-user@oe-host) (x86_64-msft-linux-gcc (GCC) 9.3.0, GNU ld (GNU Binutils) 2.34.0.20200220) #1 SMP Fri Jan 27 02:56:13 UTC 2023

DXist commented 1 year ago

I get the following with the latest stable Docker Desktop for Mac:

#15 [test 1/1] RUN --mount=type=cache,target=/usr/local/cargo/registry,uid=1000,gid=1000     --mount=type=cache,target=/usr/local/cargo/git,uid=1000,gid=1000     --mount=type=cache,target=/build/target,uid=1000,gid=1000     cargo test --features all --test tcp_connect -- --nocapture --skip _v6
#15 0.839    Compiling compio v0.5.0 (/build)
#15 4.007     Finished test [unoptimized + debuginfo] target(s) in 3.61s
#15 4.011      Running tests/tcp_connect.rs (target/debug/deps/tcp_connect-fcec66faf5576498)
#15 4.014
#15 4.014 running 7 tests
#15 4.016 test connect_invalid_dst ... ok
#15 4.016 thread 'ip_port_tuple' panicked at 'called `Result::unwrap()` on an `Err` value: Os { code: 11, kind: WouldBlock, message: "Resource temporarily unavailable" }', tests/tcp_connect.rs:49:33
#15 4.017 note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
#15 4.017 thread 'thread 'ip_string' panicked at 'called `Result::unwrap()` on an `Err` value: Os { code: 11, kind: WouldBlock, message: "Resource temporarily unavailable" }', tests/tcp_connect.rs:49:33
#15 4.018 ip_str' panicked at 'called `Result::unwrap()` on an `Err` value: Os { code: 11, kind: WouldBlock, message: "Resource temporarily unavailable" }', tests/tcp_connect.rs:49:33
#15 4.019 thread 'ip_port_tuple_ref' panicked at 'called `Result::unwrap()` on an `Err` value: Os { code: 11, kind: WouldBlock, message: "Resource temporarily unavailable" }', tests/tcp_connect.rs:49:33test ip_str ... FAILED
#15 4.019 test ip_port_tuple ...
#15 4.019 FAILED
#15 4.019 thread 'ip_str_port_tuple' panicked at 'called `Result::unwrap()` on an `Err` value: Os { code: 11, kind: WouldBlock, message: "Resource temporarily unavailable" }test ip_string ... FAILED

Docker Engine uses a linuxkit kernel:

docker run -it --rm --privileged --pid=host justincormack/nsenter1
~ # cat /proc/version
Linux version 5.15.49-linuxkit-pr (root@buildkitsandbox) (gcc (Alpine 10.2.1_pre1) 10.2.1 20201203, GNU ld (GNU Binutils) 2.35.2) #1 SMP Thu May 25 07:17:40 UTC 2023

Berrysoft commented 1 year ago

Sounds interesting. Let me see...

Berrysoft commented 1 year ago

It's a kernel bug that has been fixed in newer kernel versions, right? I don't think it's our job to stay compatible with an old, buggy kernel.

DXist commented 1 year ago

There is no newer stable alternative for macOS yet.

Berrysoft commented 1 year ago

So it's a Docker issue then.

DXist commented 1 year ago

With io_uring, the driver is expected to block if no completions are available, so not setting O_NONBLOCK is consistent with that intention. It's up to the kernel implementation to decide how to proceed: block a kernel worker thread, or rearm the poller and handle EAGAIN internally.

An older kernel commit - https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v5.8&id=e697deed834de15d2322d0619d51893022c90ea2
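
To illustrate where the `WouldBlock` (code 11, EAGAIN) in the log above comes from, here is a minimal sketch using only the Rust standard library. It does not go through io_uring or compio's driver; the helper name `accept_error_kind_when_idle` is made up for this example. It only shows that a socket with O_NONBLOCK set reports EAGAIN when nothing is ready, which is the error that the affected kernel apparently passed through to the completion instead of retrying internally.

```rust
use std::io::{self, ErrorKind};
use std::net::TcpListener;

/// Hypothetical helper: calls `accept` on a non-blocking listener that has
/// no pending connection and returns the kind of the resulting error.
fn accept_error_kind_when_idle() -> io::Result<ErrorKind> {
    // Port 0 lets the OS pick a free port.
    let listener = TcpListener::bind("127.0.0.1:0")?;
    // This sets O_NONBLOCK on the underlying socket.
    listener.set_nonblocking(true)?;
    match listener.accept() {
        // Nobody connects in this sketch, so success would be a surprise.
        Ok(_) => Err(io::Error::new(ErrorKind::Other, "unexpected connection")),
        Err(e) => Ok(e.kind()),
    }
}

fn main() -> io::Result<()> {
    let kind = accept_error_kind_when_idle()?;
    // EAGAIN is mapped to ErrorKind::WouldBlock by the standard library.
    assert_eq!(kind, ErrorKind::WouldBlock);
    println!("idle non-blocking accept failed with {kind:?}");
    Ok(())
}
```

Without the `set_nonblocking(true)` call the same `accept` would simply block, which matches the behaviour the driver wants from the kernel when it waits for a completion.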

Berrysoft commented 1 year ago

OK, agreed.