Closed jaysonsantos closed 1 year ago
This looks like a reasonable fix, but it's unclear how systemd
deals with this issue. Because from what I can tell SD_LISTEN_FDS_START
is hardcoded to 3 there. Are they preventing programs from opening handles before fork?
I saw that on the python library that it also fixes with 3 but, I didn't go further to check what's happening there. I'd assume that systemd would work only with the first process being the target activated by the socket and not chained
What's funny is that I can only reproduce this on macos and on linux it does not happen.
❯ /bin/cat Makefile
a:
systemfd -s http::9890 -- python -c 'import socket; socket.fromfd(3, socket.AF_INET, socket.SOCK_STREAM)'
❯ make a -j 2
systemfd -s http::9890 -- python -c 'import socket; socket.fromfd(3, socket.AF_INET, socket.SOCK_STREAM)'
~> socket http://127.0.0.1:9890/ -> fd #6
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/Users/jayson.reis/.asdf/installs/python/3.10.9/lib/python3.10/socket.py", line 546, in fromfd
return socket(family, type, proto, nfd)
File "/Users/jayson.reis/.asdf/installs/python/3.10.9/lib/python3.10/socket.py", line 232, in __init__
_socket.socket.__init__(self, family, type, proto, fileno)
OSError: [Errno 38] Socket operation on non-socket
make: *** [a] Error 1
the funny thing that on macos it only happens on the native make GNU Make 3.81
but not with gmake from homebrew GNU Make 4.4
Now you set LISTEN_FDS_FIRST_FD, but I see nothing even attempting to make the FDs sequential. For example: if libc happens to have fd 7 still open, you can end up with 6, 8, 9. You're going to have trouble all the way until you do it right, with e.g. https://doc.rust-lang.org/std/os/unix/process/trait.CommandExt.html#tymethod.pre_exec
Hey @tv42 sorry but, I didn't get what you mean, could you rephrase or show some examples?
if libc happens to have fd 7 still open, you can end up with 6, 8, 9
Which is not the case here because the cli will open the file descriptors in a sequence, and it already inherited any fds that came from parent processes plus its own 3 for stdio, isn't it?
You're going to have trouble all the way until you do it right...
How so, do you have examples like the simple make + python script to show what you mean?
libc opens file A, gets fd 6. libc opens file B, gets fd 7. libc closes fd 6.
your code is called. opens socket X, gets fd 6.opens socket Y, gets fd 8.
you exec the subprocess, telling it LISTEN_FDS_FIRST_FD=6, LISTEN_FDS=2.
the subprocess expects to receive fds 6 & 7, but when it inherits fds from you, your sockets are 6 & 8, 7 being a libc-internal thing that's likely CLOEXEC and not inherited.
Also, LISTEN_FDS_FIRST_FD seems completely non-standard?
It is nonstandard. I'm not sure what a good way to resolve this is though because the systemd protocol assumes that the first FD is 3, so unclear what can be done here other than to close down all fds on exec, which clearly does not happen on macos in some cases.
I think I remember that systemfd forks+execs, doesn't exec. For that, you need a handler between fork and exec that arranges the file descriptors appropriately.
As is, people should just use systemd-socket-activate
instead.
@mitsuhiko as I checked on systemd code, it forks then reorder the fds before execing but, as I can see systemfd is not meant to be a substitute for systemd tools and it works quite good on macos, i don't see the point of going all the way to replicate what systemd does on linux if it is already there for that
For further reference, here is where I found that code https://github.com/systemd/systemd/blob/363ed187309ddc5712a7f1a4959c346fc9659621/src/basic/fd-util.c#L632-L742
Maybe the correct thing to do is to just close all fds above 2.
When running on macos, if you have any command that opens file descriptors before systemfd, the following error would happen
One example is
make -j 2 target
as it opens files for its own usage.This fixes it by creating a variable pointing to the first fd created so listenfd can start from there, instead of the fixed value of 3.
This should fix #13 and #4