quinn-rs / quinn

Async-friendly QUIC implementation in Rust
Apache License 2.0

sendmmsg fails with EINVAL on Android #1511

Closed link2xt closed 1 year ago

link2xt commented 1 year ago

While trying to get quinn-based file transfer (https://github.com/deltachat/deltachat-core-rust/pull/4007) working on Android, we discovered that no UDP packets are sent.

I compiled the CLI application deltachat-repl with the NDK and ran it in Termux under strace to find out what was happening. Every sendmmsg call failed with EINVAL.

Example of the first sendmmsg call:

2228  sendmmsg(18,  <unfinished ...>                                  
2229  futex(0x78c180a568, FUTEX_WAIT_PRIVATE, 1, NULL <unfinished ...>
2225  epoll_pwait(3,  <unfinished ...>           
2224  futex(0x78bfdf6068, FUTEX_WAIT_PRIVATE, 1, NULL <unfinished ...>
2228  <... sendmmsg resumed>[{msg_hdr={msg_name={sa_family=AF_INET, sin_port=htons(38795), sin_addr=inet_addr("192.168.178.181")}, msg_namelen=16, msg_iov=[{iov_base="\311\0\0\0\1\24\326\340\2350\360\376\2206O\230Q\317\326\274P\256\321yG\333\10\6\350\237\266\36"..., iov_len=1200}], msg_iovlen=1, msg_control=[{cmsg_len=20, cmsg_level=SOL_IP, cmsg_type=IP_TOS, cmsg_data=[0x2, 0, 0, 0]}], msg_controllen=24, msg_flags=0}}], 1, 0) = -1 EINVAL (Invalid argument)

Normally, on Linux, a sendmmsg call looks like this:

413989 sendmmsg(17,  <unfinished ...>
413985 <... epoll_wait resumed>[{events=EPOLLOUT, data={u32=0, u64=0}}], 1024, 987) = 1  
413985 epoll_wait(3,  <unfinished ...>                                                                                                                        
413989 <... sendmmsg resumed>[{msg_hdr={msg_name={sa_family=AF_INET, sin_port=htons(33499), sin_addr=inet_addr("192.168.178.181")}, msg_namelen=16, msg_iov=[{iov_base="\306\0\0\0\1\24\371~\254V<\27\27\203\247\201sY\362\325\233\357\323\377\304\2\10\27\\\327\375\n"..., iov_len=1200}], msg_iovlen=1, msg_control=[{cmsg_len=20, cmsg_level=SOL_IP, cmsg_type=IP_TOS, cmsg_data=[0x2, 0, 0, 0]}], msg_controllen=24, msg_flags=0}, msg_len=1200}], 1, 0) = 1

The only difference I see is that msg_len=1200 appears on Linux but not on Android. However, msg_len is an output field that the syscall writes on success, so its absence is a consequence of the failure rather than its cause.

Sending UDP packets with ncat from Termux works, so it is not a problem with a general network permission.

link2xt commented 1 year ago

One problem I see is that some parts of quinn-udp/src/unix.rs are behind #[cfg(target_os = "linux")] without a corresponding target_os = "android" branch.
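For illustration, a gate that covers both targets could look like the sketch below. The function name `has_linux_udp_features` is hypothetical and not taken from quinn-udp; the only point is the `cfg(any(...))` form.

```rust
/// Hypothetical helper (not part of quinn-udp): reports whether the
/// Linux-style code path is compiled in for the current target.
#[cfg(any(target_os = "linux", target_os = "android"))]
fn has_linux_udp_features() -> bool {
    // Compiled for both Linux and Android targets.
    true
}

/// Fallback for every other target.
#[cfg(not(any(target_os = "linux", target_os = "android")))]
fn has_linux_udp_features() -> bool {
    false
}
```

With a gate written as `target_os = "linux"` alone, an Android build would silently take the fallback branch.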

Another difference is that in my case Linux is x86_64, while Android is aarch64.

Could there be an alignment problem with cmsg?
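As a rough check of the alignment idea, here is a sketch that redoes the CMSG_LEN / CMSG_SPACE arithmetic in plain Rust. It assumes the usual 64-bit layout of `struct cmsghdr` (a `size_t` length followed by two `int`s); the struct and helper names are made up for this example and are not quinn-udp code.

```rust
use std::mem::size_of;

/// Assumed layout of `struct cmsghdr` on 64-bit Linux and Android.
#[allow(dead_code)]
#[repr(C)]
struct CmsgHdr {
    cmsg_len: usize,
    cmsg_level: i32,
    cmsg_type: i32,
}

/// Mirrors the C macro CMSG_ALIGN: round up to a multiple of the word size.
const fn cmsg_align(len: usize) -> usize {
    (len + size_of::<usize>() - 1) & !(size_of::<usize>() - 1)
}

/// Mirrors CMSG_LEN: header plus payload, without trailing padding.
const fn cmsg_len(payload: usize) -> usize {
    cmsg_align(size_of::<CmsgHdr>()) + payload
}

/// Mirrors CMSG_SPACE: header plus payload, padded to the next boundary.
const fn cmsg_space(payload: usize) -> usize {
    cmsg_align(size_of::<CmsgHdr>()) + cmsg_align(payload)
}
```

For the 4-byte IP_TOS payload on a 64-bit target, `cmsg_len(4)` is 20 and `cmsg_space(4)` is 24, which matches the cmsg_len=20 and msg_controllen=24 seen in both strace outputs above. Under these assumptions the control buffer has the same layout on x86_64 and aarch64, so alignment alone may not explain the failure.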

link2xt commented 1 year ago

Commenting out the whole body of Encoder::push in quinn-udp/src/cmsg.rs worked: https://github.com/quinn-rs/quinn/blob/02037e72fa6b7f994a2db411977bcffacbb8b02c/quinn-udp/src/cmsg.rs#L35

I guess it is a problem with cmsg alignment or something.

link2xt commented 1 year ago

The patch which makes it work for me:

commit 67419bac18a4900354964733b786236086824983 (HEAD -> no-cmsg)
Author: link2xt <link2xt@testrun.org>
Date:   Tue Mar 14 23:29:24 2023 +0000

    Comment out cmsg encoding

diff --git a/quinn-udp/src/cmsg.rs b/quinn-udp/src/cmsg.rs
index 2cd111e9..c6c9f2e8 100644
--- a/quinn-udp/src/cmsg.rs
+++ b/quinn-udp/src/cmsg.rs
@@ -33,6 +33,7 @@ impl<'a> Encoder<'a> {
     /// - If insufficient buffer space remains.
     /// - If `T` has stricter alignment requirements than `cmsghdr`
     pub fn push<T: Copy + ?Sized>(&mut self, level: libc::c_int, ty: libc::c_int, value: T) {
+        /*
         assert!(mem::align_of::<T>() <= mem::align_of::<libc::cmsghdr>());
         let space = unsafe { libc::CMSG_SPACE(mem::size_of_val(&value) as _) as usize };
         #[allow(clippy::unnecessary_cast)] // hdr.msg_controllen defined as size_t
@@ -53,6 +54,7 @@ impl<'a> Encoder<'a> {
         }
         self.len += space;
         self.cmsg = unsafe { libc::CMSG_NXTHDR(self.hdr, cmsg).as_mut() };
+        */
     }

     /// Finishes appending control messages to the buffer

link2xt commented 1 year ago

I am running kernel 3.10.108 on Android. It looks like this kernel does not support any cmsg_type except IP_RETOPTS and IP_PKTINFO: https://elixir.bootlin.com/linux/v3.10.108/source/net/ipv4/ip_sockglue.c#L219

IP_TOS is supported on later kernels: https://elixir.bootlin.com/linux/v6.2.6/source/net/ipv4/ip_sockglue.c#L308

It was introduced in this commit: https://github.com/torvalds/linux/commit/f02db315b8d888570cb0d4496cfbb7e4acb047cb

For unknown cmsg types Linux returns EINVAL and does not send anything at all.
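To make the behaviour concrete, here is a toy model of that check. It is a simplification written for this comment, not kernel code: `check_ip_cmsg` and its `has_tos_cmsg` flag (standing for "the kernel contains the commit above") are hypothetical. The numeric constants are the Linux values from `<linux/in.h>` and `<errno.h>`.

```rust
// Linux socket option numbers and errno value.
const IP_TOS: i32 = 1;
const IP_RETOPTS: i32 = 7;
const IP_PKTINFO: i32 = 8;
const EINVAL: i32 = 22;

/// Toy model of how the kernel vets one SOL_IP control message on send.
/// `has_tos_cmsg` is true for kernels that accept IP_TOS as a cmsg.
fn check_ip_cmsg(cmsg_type: i32, has_tos_cmsg: bool) -> Result<(), i32> {
    match cmsg_type {
        // Accepted by both old and new kernels.
        IP_RETOPTS | IP_PKTINFO => Ok(()),
        // Accepted only once the kernel knows about IP_TOS as a cmsg.
        IP_TOS if has_tos_cmsg => Ok(()),
        // Anything else fails the whole send with EINVAL.
        _ => Err(EINVAL),
    }
}
```

In this model `check_ip_cmsg(IP_TOS, false)` returns `Err(EINVAL)`, which is what the Android trace shows, while `check_ip_cmsg(IP_TOS, true)` succeeds as in the Linux trace.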

I suggest not adding the IP_TOS cmsg on Linux < 3.13, or on Android altogether.
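A minimal sketch of such a gate, assuming the kernel release string (for example the `release` field reported by uname) is already available. Both function names are hypothetical, and this is not necessarily how an actual fix would be implemented.

```rust
/// Parses the leading "major.minor" of a kernel release string such as
/// "3.10.108" or "5.15.0-91-generic". Returns None if it cannot be parsed.
fn parse_major_minor(release: &str) -> Option<(u32, u32)> {
    let mut parts = release.split(|c: char| !c.is_ascii_digit());
    let major = parts.next()?.parse::<u32>().ok()?;
    let minor = parts.next()?.parse::<u32>().ok()?;
    Some((major, minor))
}

/// Hypothetical policy: attach the IP_TOS cmsg only on non-Android
/// kernels that are at least 3.13. Unknown versions are treated as unsafe.
fn should_send_ip_tos_cmsg(release: &str, is_android: bool) -> bool {
    if is_android {
        return false;
    }
    match parse_major_minor(release) {
        // Tuples compare lexicographically: major first, then minor.
        Some(version) => version >= (3, 13),
        None => false,
    }
}
```

With this policy, `should_send_ip_tos_cmsg("3.10.108", false)` is false and `should_send_ip_tos_cmsg("6.2.6", false)` is true. Skipping the cmsg when the version cannot be parsed trades a missing TOS marking for packets that are actually sent.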

link2xt commented 1 year ago

I made a fix: ~#1512~ #1516

dignifiedquire commented 1 year ago

This can be closed, given that #1516 was merged.