Erlang/OTP
http://erlang.org
Apache License 2.0

gen_udp with Unix domain socket on Linux can block, leaking inet_reply messages into calling process #8989

Open mjm opened 1 month ago

mjm commented 1 month ago

Describe the bug

When using the gen_udp module with Unix domain sockets, sending a packet can fail with EINTR. The sendto implementation in inet does not seem to expect this: the driver responds with an {inet_reply, Port, Ref} message (no reply value), which sendto leaves unhandled, so it ends up in the calling process's mailbox.

To Reproduce

I've reduced this to a small reproduction in Elixir: https://gist.github.com/mjm/490abd286e526fceaeb0e373414e1214

It reproduces for me on Linux but not on macOS, so I used docker run -it elixir /bin/bash to get a Linux Elixir environment. Then you can paste the module in the gist into two iex sessions, and run UdsBlockExample.test_listen() in one, and UdsBlockExample.test_socket() in the other.

test_socket() will raise an error that it received an unexpected inet_reply message.

Expected behavior

This example code should run without error, as inet_reply messages should not leak out of these calls.

In production, this manifests as some of our GenServers suddenly receiving these unexpected messages after we switched to Unix domain sockets for reporting telemetry to statsd.

Using the new socket inet_backend also causes this to work as expected.
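
For readers who want to try the socket backend mentioned above, here is a minimal sketch in Erlang. The module name, function name, and socket path are hypothetical, and it assumes the option list shown here matches what the affected code passes to gen_udp:open/2; it is not taken from the gist.

```erlang
-module(uds_socket_backend_sketch).
-export([send_once/2]).

%% Sends one datagram to the Unix domain socket at Path (a hypothetical
%% path such as "/var/run/statsd.sock") using the socket-based backend.
-spec send_once(file:filename_all(), iodata()) -> ok | {error, term()}.
send_once(Path, Payload) ->
    %% {inet_backend, socket} has to be the first option in the list.
    Opts = [{inet_backend, socket}, local, binary, {active, false}],
    case gen_udp:open(0, Opts) of
        {ok, Socket} ->
            try
                %% For local (AF_UNIX) destinations the port argument is 0.
                gen_udp:send(Socket, {local, Path}, 0, Payload)
            after
                gen_udp:close(Socket)
            end;
        {error, _} = Error ->
            Error
    end.
```

Calling `uds_socket_backend_sketch:send_once("/var/run/statsd.sock", <<"example.metric:1|c">>)` sends a single datagram and closes the socket; a long-lived reporter would normally keep the socket open instead.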

Affected versions

In production we hit this on OTP 26.2.5 but it also reproduces on the latest OTP 27.

Additional context

The undesired messages come from this code path in the inet driver.

A comment a short bit above this suggests that EINTR should not happen for UDP, and that seems to be true, but it appears that it can happen for AF_UNIX datagram sockets, at least on Linux.

And here is where sendto is not handling this shape of message, which is what allows it to leak. The implementation of send above this has a case for handling 3-tuples, but sendto assumes that won't happen.
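
As a stopgap on affected versions, a process that calls gen_udp:send on a local socket could drop the leaked message defensively. The following is a minimal sketch, assuming the leaked message has exactly the {inet_reply, Port, Ref} shape described above; the module name and the counter are hypothetical, and this is a workaround in the caller, not the fix discussed in this issue.

```erlang
-module(stray_inet_reply_sketch).
-behaviour(gen_server).

-export([start_link/0, dropped/1]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2]).

start_link() ->
    gen_server:start_link(?MODULE, [], []).

%% Returns how many stray replies this server has dropped so far.
dropped(Pid) ->
    gen_server:call(Pid, dropped).

init([]) ->
    {ok, #{dropped => 0}}.

handle_call(dropped, _From, #{dropped := N} = State) ->
    {reply, N, State}.

handle_cast(_Msg, State) ->
    {noreply, State}.

%% Assumed shape from the report: a 3-tuple with no reply value.
%% Count it and carry on instead of treating it as an unexpected message.
handle_info({inet_reply, _Port, _Ref}, #{dropped := N} = State) ->
    {noreply, State#{dropped := N + 1}};
handle_info(_Other, State) ->
    {noreply, State}.
```

Note that this clause matches any 3-tuple tagged inet_reply, so it is only reasonable in a process that never expects such messages for other reasons. Counting the drops keeps the problem visible in metrics rather than hiding it.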

bmk commented 4 weeks ago

There is a comment in the code that explains why gen_udp has problems with this:

`/* "code" analysis is the same for both SCTP and UDP above,

So, EINTR is "not supposed" to be possible. Clearly, on a Unix domain socket, this can happen (on Linux)...

bmk commented 4 weeks ago

Should have asked this before, but what flavor and version of Linux did you test this with?

frej commented 3 weeks ago

> So, EINTR is "not supposed" to be possible.

EINTR is documented as a valid error for all of send, sendto and sendmsg if a signal arrives during the call, so the comment is wrong. Unless the VM traps it using a signalfd, that is :)

bmk commented 3 weeks ago

I mentioned the comment as an explanation of the behavior, not a justification. Regardless, I have done some testing:

On FreeBSD (14.1), OpenIndiana (Hipster 2023.10), macOS (14.4.1/23.4.0), and NetBSD (9.0), the result is 'enoent'.

I have also tested this on the following versions of Linux without being able to reproduce the issue: Ubuntu 22.04.5 (6.8.0-47-generic), Ubuntu 20.04.6 (5.4.0-196-generic), Linux Mint 21 (5.15.0-122-generic), LMDE 5 (5.10.0-33-amd64), SLES 12 (3.12.60-52.54-default), SLES 12-SP2 (4.4.74-92.35-default).

Here is a PR for testing: https://github.com/bmk/otp/tree/bmk/kernel/20241030/gen_udp_blocking_send_on_local

mjm commented 3 weeks ago

> Should have asked this before, but what flavor and version of Linux did you test this with?

In production, we're running on Google Kubernetes Engine, so the nodes are running Container-Optimized OS cos-113-18244-151-27. When I was creating the reproduction example, I was running on Docker Desktop on macOS 4.34.3 (170107). I'm not sure what version of Linux that's using on the VM it manages.

In both contexts, sysctl net.unix.max_dgram_qlen appears to be 10. I think it being so low is why this happens.

bmk commented 3 weeks ago

Aha. On my machine:

    $ sysctl net.unix.max_dgram_qlen
    net.unix.max_dgram_qlen = 512

If you can, please test my branch, and see if that solves the problem.

mjm commented 3 weeks ago

Okay, today I'll see if I can get that built in a context where I've actually had the problem.

mjm commented 3 weeks ago

I was able to build your branch in a Docker container and test it alongside both 27.1.2 and 25.3.2.15. The former reproduces the bug, while the latter does not, because the logic for handling EINTR specially doesn't exist yet in that version.

Your branch did not reproduce the problem!