mitsuhiko / systemfd

Development helper utility for helping with auto reloading for projects supporting systemd socket activation
Apache License 2.0
468 stars 17 forks source link

fix: Fix error fd 3 is not a socket on macos #17

Closed jaysonsantos closed 1 year ago

jaysonsantos commented 1 year ago

When running on macos, if you have any command that opens file descriptors before systemfd, the following error would happen

fd 3 is not a socket

One example is make -j 2 target as it opens files for its own usage.

This fixes it by creating a variable pointing to the first fd created so listenfd can start from there, instead of the fixed value of 3.

This should fix #13 and #4

mitsuhiko commented 1 year ago

This looks like a reasonable fix, but it's unclear how systemd deals with this issue. Because from what I can tell SD_LISTEN_FDS_START is hardcoded to 3 there. Are they preventing programs from opening handles before fork?

jaysonsantos commented 1 year ago

I saw that on the python library that it also fixes with 3 but, I didn't go further to check what's happening there. I'd assume that systemd would work only with the first process being the target activated by the socket and not chained

jaysonsantos commented 1 year ago

What's funny is that I can only reproduce this on macos and on linux it does not happen.

❯ /bin/cat Makefile
a:
    systemfd -s http::9890 -- python -c 'import socket; socket.fromfd(3, socket.AF_INET, socket.SOCK_STREAM)'

❯ make a -j 2
systemfd -s http::9890 -- python -c 'import socket; socket.fromfd(3, socket.AF_INET, socket.SOCK_STREAM)'
~> socket http://127.0.0.1:9890/ -> fd #6
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/Users/jayson.reis/.asdf/installs/python/3.10.9/lib/python3.10/socket.py", line 546, in fromfd
    return socket(family, type, proto, nfd)
  File "/Users/jayson.reis/.asdf/installs/python/3.10.9/lib/python3.10/socket.py", line 232, in __init__
    _socket.socket.__init__(self, family, type, proto, fileno)
OSError: [Errno 38] Socket operation on non-socket
make: *** [a] Error 1

the funny thing that on macos it only happens on the native make GNU Make 3.81 but not with gmake from homebrew GNU Make 4.4

tv42 commented 1 year ago

Now you set LISTEN_FDS_FIRST_FD, but I see nothing even attempting to make the FDs sequential. For example: if libc happens to have fd 7 still open, you can end up with 6, 8, 9. You're going to have trouble all the way until you do it right, with e.g. https://doc.rust-lang.org/std/os/unix/process/trait.CommandExt.html#tymethod.pre_exec

jaysonsantos commented 1 year ago

Hey @tv42 sorry but, I didn't get what you mean, could you rephrase or show some examples?

if libc happens to have fd 7 still open, you can end up with 6, 8, 9

Which is not the case here because the cli will open the file descriptors in a sequence, and it already inherited any fds that came from parent processes plus its own 3 for stdio, isn't it?

You're going to have trouble all the way until you do it right...

How so, do you have examples like the simple make + python script to show what you mean?

tv42 commented 1 year ago

libc opens file A, gets fd 6. libc opens file B, gets fd 7. libc closes fd 6.

your code is called. opens socket X, gets fd 6.opens socket Y, gets fd 8.

you exec the subprocess, telling it LISTEN_FDS_FIRST_FD=6, LISTEN_FDS=2.

the subprocess expects to receive fds 6 & 7, but when it inherits fds from you, your sockets are 6 & 8, 7 being a libc-internal thing that's likely CLOEXEC and not inherited.

Also, LISTEN_FDS_FIRST_FD seems completely non-standard?

mitsuhiko commented 1 year ago

It is nonstandard. I'm not sure what a good way to resolve this is though because the systemd protocol assumes that the first FD is 3, so unclear what can be done here other than to close down all fds on exec, which clearly does not happen on macos in some cases.

tv42 commented 1 year ago

I think I remember that systemfd forks+execs, doesn't exec. For that, you need a handler between fork and exec that arranges the file descriptors appropriately.

As is, people should just use systemd-socket-activate instead.

jaysonsantos commented 1 year ago

@mitsuhiko as I checked on systemd code, it forks then reorder the fds before execing but, as I can see systemfd is not meant to be a substitute for systemd tools and it works quite good on macos, i don't see the point of going all the way to replicate what systemd does on linux if it is already there for that

jaysonsantos commented 1 year ago

For further reference, here is where I found that code https://github.com/systemd/systemd/blob/363ed187309ddc5712a7f1a4959c346fc9659621/src/basic/fd-util.c#L632-L742

mitsuhiko commented 1 year ago

Maybe the correct thing to do is to just close all fds above 2.