saleyn / erlexec

Execute and control OS processes from Erlang/OTP
https://hexdocs.pm/erlexec/readme.html

erlexec on linux slows down when nofile limit is a large value #150

Closed sigsergv closed 2 years ago

sigsergv commented 2 years ago

When the nofile ulimit is set to a large number such as 600000, erlexec tries to close 600000 descriptors on every process execution, which considerably increases the execution time of each run call.

To reproduce, edit /etc/security/limits.conf and set the nofile limit to 600000, then log in again and try something like this:

erlexec-1.6.4$ ulimit -n
600000
erlexec-1.6.4$ erl -pa ebin/
Erlang/OTP 24 [erts-10.5] [source] [64-bit] [smp:2:2] [ds:2:2:10] [async-threads:1]

Eshell V10.5  (abort with ^G)
1> application:start(erlexec), timer:tc(exec, run, [["/bin/ls"], [sync]]).
{43908,{ok,[]}}
^C
erlexec-1.6.4$ ulimit -n 1024
erlexec-1.6.4$ erl -pa ebin/
Erlang/OTP 24 [erts-10.5] [source] [64-bit] [smp:2:2] [ds:2:2:10] [async-threads:1]

Eshell V10.5  (abort with ^G)
1> application:start(erlexec), timer:tc(exec, run, [["/bin/ls"], [sync]]).
{1719,{ok,[]}}

On systems under heavy load, each call to exec:run can take a few seconds.

In exec_impl.cpp:

        for(int i=STDERR_FILENO+1; i < max_fds; i++)
            close(i);

saleyn commented 2 years ago

It looks like the parent process needs to maintain a std::set of open file descriptors and, after the fork, close them by iterating over that set. I'll put this on my to-do list, but I'm not sure yet when I'll have time to work on the fix. Patches are welcome.
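A minimal sketch of that idea, for illustration only. The names `g_open_fds`, `tracked_open`, `tracked_close`, `close_tracked_in_child` and `child_sees_fd_closed` are hypothetical and are not taken from exec_impl.cpp; the sketch also assumes every descriptor of interest is opened through the tracking wrapper, so descriptors opened elsewhere (for example by libraries) would not be covered.

```cpp
#include <cassert>
#include <fcntl.h>
#include <set>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical registry of descriptors opened by the parent (assumption:
// all opens of interest go through tracked_open below).
static std::set<int> g_open_fds;

// Open a file and remember its descriptor.
int tracked_open(const char* path, int flags) {
    int fd = open(path, flags);
    if (fd >= 0)
        g_open_fds.insert(fd);
    return fd;
}

// Close a descriptor and forget it, but only if it was tracked.
void tracked_close(int fd) {
    if (g_open_fds.erase(fd) > 0)
        close(fd);
}

// Meant to run in the child right after fork(): closes only the descriptors
// known to be open, so the cost depends on the number of open descriptors
// rather than on the nofile limit.
void close_tracked_in_child() {
    for (int fd : g_open_fds)
        if (fd > STDERR_FILENO)
            close(fd);
}

// Small demonstration: returns true if the child observed the tracked
// descriptor as closed after calling close_tracked_in_child().
bool child_sees_fd_closed() {
    int fd = tracked_open("/dev/null", O_RDONLY);
    if (fd < 0)
        return false;
    pid_t pid = fork();
    if (pid < 0) {
        tracked_close(fd);
        return false;
    }
    if (pid == 0) {
        close_tracked_in_child();
        // fcntl(F_GETFD) fails (EBADF) on a descriptor that is closed.
        _exit(fcntl(fd, F_GETFD) == -1 ? 0 : 1);
    }
    int status = 0;
    waitpid(pid, &status, 0);
    tracked_close(fd);  // the parent's copy is unaffected by the child
    return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```

With this shape, the loop in the child runs once per tracked descriptor instead of once per possible descriptor number.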

A better solution would be to set FD_CLOEXEC on the file descriptors above STDERR_FILENO, which eliminates the need for that loop altogether. FD_CLOEXEC only takes effect at exec(2), not at fork(2), but since the fork is followed by a call to execve(2), this approach would work.
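As a rough sketch of the FD_CLOEXEC approach: the flag is set per descriptor with fcntl(2), after which the kernel closes that descriptor when execve(2) succeeds. The helper names `set_cloexec` and `has_cloexec` are made up for this example and are not part of erlexec.

```cpp
#include <cassert>
#include <fcntl.h>
#include <unistd.h>

// Set FD_CLOEXEC on fd while preserving its other descriptor flags.
// Returns true on success.
bool set_cloexec(int fd) {
    int flags = fcntl(fd, F_GETFD);
    if (flags == -1)
        return false;
    return fcntl(fd, F_SETFD, flags | FD_CLOEXEC) != -1;
}

// Report whether FD_CLOEXEC is currently set on fd.
bool has_cloexec(int fd) {
    int flags = fcntl(fd, F_GETFD);
    return flags != -1 && (flags & FD_CLOEXEC) != 0;
}
```

Where the descriptor is created by the program itself, passing O_CLOEXEC to open(2) sets the flag atomically at creation time and avoids the separate fcntl call.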

saleyn commented 2 years ago

I believe this issue is fixed in the latest commit.