rust-lang / socket2

Advanced configuration options for sockets.
https://docs.rs/socket2
Apache License 2.0
680 stars 221 forks source link

Haiku test failure. keepalive / connect_timeout/unbound #85

Open kallisti5 opened 4 years ago

kallisti5 commented 4 years ago

Just reporting a test failure on Haiku. Overall it's pretty close!

master as of 32969e53c75849e4047233d8b55f551a8645b21d

/Data/socket2-rs> cargo test
  Downloaded remove_dir_all v0.5.3
  Downloaded rand v0.4.6
  Downloaded tempdir v0.3.7
  Downloaded 3 crates (97.1 KB) in 0.93s
   Compiling remove_dir_all v0.5.3
   Compiling rand v0.4.6
   Compiling tempdir v0.3.7
   Compiling socket2 v0.3.12 (/Data/socket2-rs)
    Finished test [unoptimized + debuginfo] target(s) in 5.27s
     Running target/debug/deps/socket2-d90163a26b657973

running 13 tests
test socket::test::connect_timeout_unbound ... FAILED
test socket::test::connect_timeout_unrouteable ... ok
test socket::test::connect_timeout_valid ... ok
test socket::test::keepalive ... FAILED
test socket::test::nodelay ... ok
test socket::test::tcp ... ok
test sys::test_ip ... ok
test tests::domain_fmt_debug ... ok
test tests::domain_for_address ... ok
test tests::protocol_fmt_debug ... ok
test tests::socket_address_ipv4 ... ok
test tests::socket_address_ipv6 ... ok
test tests::type_fmt_debug ... ok

failures:

---- socket::test::connect_timeout_unbound stdout ----
thread 'main' panicked at 'unexpected success', src/socket.rs:925:22
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

---- socket::test::keepalive stdout ----
thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: Os { code: -2147483643, kind: InvalidInput, message: "Invalid Argument" }', src/socket.rs:985:9

failures:
    socket::test::connect_timeout_unbound
    socket::test::keepalive

test result: FAILED. 11 passed; 2 failed; 0 ignored; 0 measured; 0 filtered out

error: test failed, to rerun pass '--lib'
Thomasdezeeuw commented 4 years ago

@alexcrichton is there anyone we can ping for Haiku support?

alexcrichton commented 4 years ago

I am unaware of someone myself.

Thomasdezeeuw commented 4 years ago

This is a bit out of the blue, but @jessicah you authored commit 7d3be9b417b6dead1ff31466bec5e0fe901641f4 which involves Haiku, could you help with this issue?

jessicah commented 4 years ago

@kallisti5 is actually one of the core Haiku maintainers...

Regarding the keepalive test, is c_int dependent on the OS arch? I.e. 32-bit on 32-bit arch, 64-bit on 64-bit arch? If so, then this is likely the problem.

Keepalive: https://github.com/alexcrichton/socket2-rs/blob/32969e53c75849e4047233d8b55f551a8645b21d/src/sys/unix.rs#L839 Setsockopt: https://github.com/alexcrichton/socket2-rs/blob/32969e53c75849e4047233d8b55f551a8645b21d/src/sys/unix.rs#L885 Haiku expecting a 32-bit integer: https://github.com/haiku/haiku/blob/5c63c64bfd2cebc9010219266f2cab3d510623b3/src/add-ons/kernel/network/stack/net_socket.cpp#L1605

if (length != sizeof(int32))
  return B_BAD_VALUE;

As for the connect timeout unbound, I'm not sure, but sounds like a bug on Haiku's side, with the connection lingering after the socket has been dropped.

Thomasdezeeuw commented 4 years ago

@kallisti5 is actually one of the core Haiku maintainers...

Ah, I didn't know that.

Regarding the keepalive test, is c_int dependent on the OS arch? I.e. 32-bit on 32-bit arch, 64-bit on 64-bit arch? If so, then this is likely the problem.

I didn't check the spec, but I think it only needs to be at least 16 bits. Even on most 64 bit architectures its actually 32 bit (this is at least true for MacOS and Linux).

Keepalive:

https://github.com/alexcrichton/socket2-rs/blob/32969e53c75849e4047233d8b55f551a8645b21d/src/sys/unix.rs#L839

Setsockopt:

https://github.com/alexcrichton/socket2-rs/blob/32969e53c75849e4047233d8b55f551a8645b21d/src/sys/unix.rs#L885

Haiku expecting a 32-bit integer: https://github.com/haiku/haiku/blob/5c63c64bfd2cebc9010219266f2cab3d510623b3/src/add-ons/kernel/network/stack/net_socket.cpp#L1605

if (length != sizeof(int32))
  return B_BAD_VALUE;

Is this true for all setsockopt options on Haiku? If so its seems a relative easy fix: replacing as libc::socklen_t with the correct type in: https://github.com/alexcrichton/socket2-rs/blob/32969e53c75849e4047233d8b55f551a8645b21d/src/sys/unix.rs#L885.

As for the connect timeout unbound, I'm not sure, but sounds like a bug on Haiku's side, with the connection lingering after the socket has been dropped.

@kallisti5 could you look further into that yourself?

kallisti5 commented 4 years ago

@kallisti5 is actually one of the core Haiku maintainers...

Doh!!!! I wanted to see if anyone else would step up to do the work.

I'll take a look. Thanks for the analysis Jessica :-)

jessicah commented 4 years ago

Is this true for all setsockopt options on Haiku?

No, a couple options can receive a struct as argument, struct linger & struct timeval.

Thomasdezeeuw commented 3 years ago

@kallisti5 @jessicah is there any way we can setup CI for Haiku (see #78)?

kallisti5 commented 3 years ago

Hi! We do actually have package Haiku builder nodes, but they're leveraging our native ports (haikuporter) system. We do support running tests however in it: https://github.com/haikuports/haikuports/blob/master/dev-lang/openjdk/openjdk14-14.0.2.12.recipe#L238

What CI system are you using? Haiku has full virtio support, and runs fine in cloud environments. I could go to Haiku, Inc. to see if we'll fund a dedicated full-time x86_64 Haiku vm for CI/CD of rust stuff.

Thomasdezeeuw commented 3 years ago

@kallisti5 currently we're using GitHub Actions. I don't think we need anything dedicated, currently GitHub Actions completes in ~1 min. libc seems to use Docker https://github.com/rust-lang/libc/tree/master/ci, does Haiku run in that? Issue #78 tracks CI support, if there is anything you can do that would be great and we can keep socket2 for Haiku working.

kallisti5 commented 3 years ago

Hi!

Haiku's not Linux (we have our own kernel, syscalls, and standard libraries), so we really can't run in containers.

Haiku easily boots in qemu and other full emulation and is fully POSIX compliant with an SSH server running for remote command execution.

There are working vagrant boxes for Haiku, and we have a pretty rich software ecosystem... python, ruby, (early) Rust, (early) Golang support, c, clang, llvm, etc etc

Here is our full searchable software port list: https://depot.haiku-os.org

Thomasdezeeuw commented 3 years ago

@kallisti5 libc also seems to run NetBSD and Redox in docker, that's why I asked. Do you know of an example Rust project that uses QEMU (or something else) to test on Haiku from which we can setup our own CI for Haiku?

kallisti5 commented 3 years ago

ooh.. that's new. Let me see what "redoxer" is doing. It would be super handy if we could do something similar

kallisti5 commented 3 years ago

Latest results under master as of f7023b4c810eb7b0fedf23a1752461c41765c797

/Data/Code/socket2-rs> cargo test
    Finished test [unoptimized + debuginfo] target(s) in 0.01s
     Running unittests (target/debug/deps/socket2-7f77f3cf6f1ffd13)

running 4 tests
test sockaddr::ipv4 ... ok
test sockaddr::ipv6 ... ok
test sys::in6_addr_convertion ... ok
test sys::in_addr_convertion ... ok

test result: ok. 4 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s

     Running tests/socket.rs (target/debug/deps/socket-9c38dbe7041dcda2)

running 31 tests
test broadcast ... ok
test common_flags ... ok
test connect_timeout_unbound ... ok
test connect_timeout_unrouteable ... ok
test connect_timeout_valid ... ok
test domain_fmt_debug ... ok
test domain_for_address ... ok
test from_invalid_raw_fd_should_panic ... ok
test keepalive ... ok
test linger ... FAILED
test niche ... ok
test no_common_flags ... ok
test nodelay ... ok
test only_v6 ... FAILED
test out_of_band ... FAILED
test out_of_band_inline ... ok
test protocol_fmt_debug ... ok
test r#type ... ok
test read_timeout ... ok
test recv_buffer_size ... ok
test recv_from_vectored_truncated ... ok
test recv_vectored_truncated ... ok
test reuse_address ... ok
test send_buffer_size ... ok
test send_from_recv_to_vectored ... ok
test send_recv_vectored ... ok
test set_nonblocking ... ok
test tcp_keepalive ... FAILED
test ttl ... ok
test type_fmt_debug ... ok
test unicast_hops_v6 ... ok

failures:

---- linger stdout ----
thread 'main' panicked at 'failed to get initial value: Os { code: -2147454942, kind: Other, message: "Protocol option not available" }', tests/socket.rs:1123:1

---- only_v6 stdout ----
thread 'main' panicked at 'failed to get initial value: Os { code: -2147454933, kind: Other, message: "Operation not supported" }', tests/socket.rs:1138:1

---- out_of_band stdout ----
thread 'main' panicked at 'assertion failed: `(left == right)`
  left: `12`,
 right: `1`', tests/socket.rs:495:5

---- tcp_keepalive stdout ----
thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: Os { code: -2147483643, kind: InvalidInput, message: "Invalid Argument" }', tests/socket.rs:717:39

failures:
    linger
    only_v6
    out_of_band
    tcp_keepalive

test result: FAILED. 27 passed; 4 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.27s

error: test failed, to rerun pass '--test socket'

still looking at redoxer, it's not - not complex :-)

link2xt commented 3 years ago

In tcp_keepalive the problem is this line: https://github.com/rust-lang/socket2/blob/1c67209c2eda5708b3093900a59befbc4a981b13/src/sys/unix.rs#L881

This if corresponds to the case when keepalive.time is set. keepalive.time is the time the connection has to be idle before keepalives are sent. There is no such option on Haiku at all. The only option on Haiku related to keepalive is the call setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, 0) to disable keepalives and setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, 1) to enable it. Same for OpenBSD by the way, except that on OpenBSD you can control global setting net.inet.tcp.keepidle via sysctl. But per socket on OpenBSD and Haiku you can only enable/disable keepalives without controlling any constants.

In the line quoted above on both Haiku and OpenBSD the following call is issued:

setsockopt(fd, IPPROTO_TCP, SO_KEEPALIVE, seconds)

which is totally wrong as IPPROTO_TCP level has no SO_KEEPALIVE option, and SO_KEEPALIVE option accepts boolean, not the number of seconds.

By the way on Linux this call only sets the number of seconds, but does not enable keepalives. You still need to do setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, 1) on Linux in addition to setting the time. Currently the code is broken: it tweaks keepalives, but does not enable them.

So overall the whole function set_tcp_keepalive should be rewritten to:

  1. Call setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, keepalive.time.is_some()) on all systems supporting it: Haiku, OpenBSD, Linux, etc. I think enabling/disabling keepalives is supported on all operating systems. This is already done in https://github.com/rust-lang/socket2/blob/1c67209c2eda5708b3093900a59befbc4a981b13/src/socket.rs#L1539
  2. Call setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE, keepalive.time.unwrap()) (of course use if let, not unwrap) on Linux and other operating systems supporting it. On MacOS for some reason this option is called TCP_KEEPALIVE, see https://www.unix.com/man-page/mojave/4/tcp/
  3. Keep the code setting TCP_KEEPINTV and TCP_KEEPCNT as is, this seems to be correct.

I will probably make a PR with a fix myself so you don't have to read all these notes and redo figuring this out. Haiku has no documentation it seems, I figured out it does not have TCP options and how to use setsockopt there by grepping their source.

Update: made a PR #251