Closed wngr closed 3 years ago
Is the x86 thing an emulator? From the strace output it would seem that an entirely wrong syscall is being invoked.
Can this issue reproduce if you call accept4
from C? If so, consider reporting a bug against Android as well.
Is the x86 thing an emulator?
Happens both on an Emulator and a real x86 device (Zebra ET50).
I will try to whip up a minimal reproducer using the socket4 API.
Edit:
use std::mem;
use std::net::SocketAddr;
fn main() {
unsafe {
let bind_to: SocketAddr = "0.0.0.0:8080".parse().unwrap();
println!("Trying to bind to {}", bind_to);
let sock_addr =
nix::sys::socket::SockAddr::new_inet(nix::sys::socket::InetAddr::from_std(&bind_to));
let fd = libc::socket(libc::AF_INET, libc::SOCK_STREAM | libc::SOCK_CLOEXEC, 0);
let (addrp, len) = sock_addr.as_ffi_pair();
assert_eq!(libc::bind(fd, addrp, len as _), 0);
assert_eq!(libc::listen(fd, 128), 0);
let mut storage: libc::sockaddr_storage = mem::zeroed();
let mut len = mem::size_of_val(&storage) as libc::socklen_t;
if libc::accept4(
fd,
&mut storage as *mut _ as *mut _,
&mut len,
libc::SOCK_CLOEXEC,
) < 0
{
println!("error on accept: {}", nix::errno::errno())
} else {
println!("read {} from socket", len);
}
}
}
This works fine on my linux machine, but running it on an Android API 23 x86 emulator yields:
root@generic_x86:/data/local # strace ./android-socket
[..]
write(1, "Trying to bind to 0.0.0.0:8080\n", 31Trying to bind to 0.0.0.0:8080
) = 31
socket(PF_INET, SOCK_STREAM|SOCK_CLOEXEC, IPPROTO_IP) = 3
bind(3, {sa_family=AF_INET, sin_port=htons(8080), sin_addr=inet_addr("0.0.0.0")}, 16) = 0
listen(3, 128) = 0
syscall_364(0x3, 0xbf85b6c0, 0xbf85b698, 0x80000, 0x80, 0x10) = -1 (errno 38)
write(1, "error on accept: 38\n", 20error on accept: 38
) = 20
futex(0xb771722c, FUTEX_WAKE_PRIVATE, 2147483647) = 0
mprotect(0xb771a000, 4096, PROT_READ|PROT_WRITE) = 0
mprotect(0xb771a000, 4096, PROT_READ) = 0
close(0) = 0
close(1) = 0
close(2) = 0
futex(0xb770f544, FUTEX_WAKE_PRIVATE, 2147483647) = 0
mprotect(0xb771a000, 4096, PROT_READ|PROT_WRITE) = 0
mprotect(0xb771a000, 4096, PROT_READ) = 0
mprotect(0xb771a000, 4096, PROT_READ|PROT_WRITE) = 0
mprotect(0xb771a000, 4096, PROT_READ) = 0
munmap(0xb771a000, 4096) = 0
exit_group(0) = ?
+++ exited with 0 +++
This also fails when using the generic syscall with errno 38:
libc::syscall(
libc::SYS_accept4,
fd,
&mut storage as *mut _ as *mut _,
&mut len,
libc::SOCK_CLOEXEC,
)
Seems my guess on why that started happening is wrong; now I wonder why that ever worked .. :confused:
Just for reference:
cross build --target i686-linux-android --release && adb push target/i686-linux-android/release/android-socket /data/local
@de-vri-es and @rustbot ping libs
any insights to share about this?
Error: This team (libs
) cannot be pinged via this command; it may need to be added to triagebot.toml
on the master branch.
Please let @rust-lang/release
know if you're having trouble with this bot.
Hmm, maybe this has something to do with it: https://android.googlesource.com/platform/bionic.git/+/master/libc/SYSCALLS.TXT#264
# sockets for x86. These are done as an "indexed" call to socketcall syscall.
<snip>
int __accept4:socketcall:18(int, struct sockaddr*, socklen_t*, int) x86
<snip>
I've never heard of the "socketcall syscall" before, but apparently, on x86 Linux (before 4.3), you have to call socketcall(SYS_ACCEPT4, &args)
.
A bit annoying, because there is no documentation on what args should be. It "points to a block containing the actual arguments". The Linux man page specifically says this:
User programs should call the appropriate functions by their usual names. Only standard library implementors and kernel hackers need to know about socketcall().
Linux 4.3 also added the regular syscalls for x86, but that's not something we can rely on for the android target :(
So I think there are two options:
1) Drop accept4 on Android and accept the race condition on the CLOEXEC flag.
2) Figure out how to use socketcall
correctly for x86 Android in libc
.
Linux 4.3 also added the regular syscalls for x86, but that's not something we can rely on for the android target :(
Note: although the same weird syscall is required for Linux < 4.3 (which is still a tier 1 platform), it shouldn't affect the Linux target because there rust libc is just calling accept4
. from the real libc.
It would be nice if Android's seccomp filter allowed the direct syscalls on x86 too, not just on arm; that'd be worth trying to get added to Android.
But in the meantime, it looks like Android targets on 32-bit x86 will need to invoke socketcall
for all socket syscalls, rather than making the syscalls directly.
It would be nice if Android's seccomp filter allowed the direct syscalls on x86 too, not just on arm; that'd be worth trying to get added to Android.
It may not be the seccomp filter though. Older x86 kernels really do not have the accept4 syscall.
Anyway, I'm working on a PR for libc to switch to socketcall
on x86 android.
PR opened: https://github.com/rust-lang/libc/pull/2079
thank you @de-vri-es and @joshtriplett
@rustbot label -I-prioritize -E-needs-bisection
Note: back when accept4 was used on more platforms, std couldn't update to the latest libc. So it currently has the same workaround. I submitted https://github.com/rust-lang/rust/pull/82473 to switch to libc::accept4
so std
would also receive the fix (after a libc bump).
This isn't actually fixed yet, my bad for accidentally linking the issue. Might be good to re-open until it really is solved in std too.
@de-vri-es no worries, I'll reopen it
Lmao, github needs to come up with better algorithm to decide when to close issues with keywords. @JohnTitor could you reopen this issue ?
Since the libc
PR is merged, and std
now uses libc::accept4
, all that is left to fix this on the main branch is to have a libc
version bump (and a matching bump of the libc
dependency for std
).
Is that going to be on time for 1.49.0, or should someone be pinged to ask for a libc
release?
@de-vri-es Released libc 0.2.87, you can now update the libc
dependency :)
Thanks! Opened https://github.com/rust-lang/rust/pull/82731 :)
Starting with Android Oreo (8), Android started using a seccomp based filter approach to syscalls, explicitly allowing syscalls, see https://android-developers.googleblog.com/2017/07/seccomp-filter-in-android-o.html. https://android.googlesource.com/platform/bionic.git/+/master/libc/SYSCALLS.TXT enumerates the allowed syscalls. This means, that dispatching a generic syscall mechanism as introduced with this PR #78572 will result in a panic.
On top of that, I found that older versions of Android, such as Android 6, will return
Function not implemented (os error 38)
for this syscall. My tests showed that this only happens on x86, although I can't explain why.As there's no way to detect the target android API level, the safest way would probably be to always use the
accept
syscall, and remove the special handling of theaccept4
syscall. This could also be just guarded for x86 -- but would be happy to get an explanation on why it's only on that architecture (see https://github.com/tokio-rs/mio/pull/1446).I have tested this with both real devices as well as Android emulators.
Code
I tried this code:
and poked it with
telnet <android_host> 8080
.I expected to see this happen:
accept
returnsOk(_)
Instead, this happened:
[pid 10918] syscall_364(0x34, 0x9d5c9cf8, 0x9d5c9ca0, 0x80800, 0x9fda9dc8, 0x9fda9dc8 <unfinished ...> [pid 10918] <... syscall_364 resumed> ) = -1 (errno 38)
rustc --version --verbose rustc 1.49.0 (e1884a8e3 2020-12-29) binary: rustc commit-hash: e1884a8e3c3e813aada8254edfa120e85bf5ffca commit-date: 2020-12-29 host: x86_64-unknown-linux-gnu release: 1.49.0