msantos / epcap

Erlang packet capture interface using pcap
http://listincomprehension.com/2009/12/erlang-packet-sniffer-using-ei-and.html
BSD 3-Clause "New" or "Revised" License

rlimit on Ubuntu 22.04 #33

Closed jephthai closed 1 year ago

jephthai commented 1 year ago

I deployed a system built on Ubuntu 20.04, and everything worked great. After building a second unit on Ubuntu 22.04, I have run afoul of the setrlimit() behavior in restrict_process_rlimit.c. With the file descriptor limit set to 1, I get the error message "can't poll on packet socket: Invalid argument" when epcap enters the pcap_loop() call.

The values for rlim_cur and rlim_max are 1024 and 1048576 before epcap decides to set them to 1.

I'm not sure yet if this is a libpcap issue -- perhaps the version of libpcap that ships with Ubuntu 22.04 does something with its file descriptors that violates the RLIMIT_NOFILE rules. This is a Linux internals area that I'm not very familiar with.

Anyway, it seems I can't work around this without a code change -- I either have to define EPCAP_RLIMIT_NOFILES somehow at compile time to a value higher than 1, or remove the call to restrict_process_capture() in epcap.c.

jephthai commented 1 year ago

As a follow-up, I should clarify that I'm running this library with Elixir and using mix to download and compile dependencies. For the time being, I've found a decent workaround: set EPCAP_RLIMIT_NOFILES on the command line when compiling dependencies:

$ EPCAP_RLIMIT_NOFILES=1024 mix deps.compile

And when I go to deploy for production, I can set this value as follows:

$ MIX_ENV=prod EPCAP_RLIMIT_NOFILES=1024 mix release

msantos commented 1 year ago

@jephthai thanks for letting me know and sorry you ran into this issue!

Another workaround is to disable the process restrictions. Following your example:

$ RESTRICT_PROCESS=null mix deps.compile
$ MIX_ENV=prod RESTRICT_PROCESS=null mix release

I was able to reproduce the issue on an Ubuntu 22.04 system. The tests were passing for me because I have RESTRICT_PROCESS=seccomp set in the environment. The seccomp restrictions can be brittle between OS releases and so are disabled by default.

Looking at the system calls, as you surmised, libpcap opens 5 fds now:

$ strace _build/default/lib/epcap/priv/epcap -i ens4
...
openat(AT_FDCWD, "/dev/null", O_RDWR)   = 3
getuid()                                = 1001
getgid()                                = 1002
socket(AF_NETLINK, SOCK_RAW, NETLINK_GENERIC) = 4
ioctl(4, SIOCGIFNAME, {ifr_ifindex=0})  = -1 ENODEV (No such device)
ioctl(4, SIOCETHTOOL, 0x7fffc4288410)   = 0
close(4)                                = 0
eventfd2(0, EFD_NONBLOCK)               = 4
socket(AF_PACKET, SOCK_RAW, htons(0 /* ETH_P_??? */)) = -1 EPERM (Operation not permitted)
close(4)                                = 0
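
One plausible reading of the trace: libpcap now creates an eventfd alongside the packet socket. If both descriptors are handed to poll(), the descriptor count of 2 exceeds an RLIMIT_NOFILE of 1, and on Linux poll() fails with EINVAL when nfds is greater than the limit. That would match the "Invalid argument" in the report. The sketch below reproduces only the poll() behavior and is not taken from epcap or libpcap; the function name poll_errno_with_nofile_limit is made up for illustration.

```c
#include <errno.h>
#include <poll.h>
#include <sys/resource.h>
#include <unistd.h>

/*
 * Hypothetical helper, not part of epcap or libpcap.
 *
 * Temporarily lowers the soft RLIMIT_NOFILE to `limit`, polls two
 * descriptors that were opened beforehand, then restores the limit.
 *
 * Returns 0 if poll() succeeded, the errno value if poll() failed,
 * or -1 if the setup (pipe/getrlimit/setrlimit) failed.
 */
int poll_errno_with_nofile_limit(rlim_t limit)
{
    struct rlimit saved, lowered;
    struct pollfd fds[2];
    int pipefd[2];
    int err = 0;

    /* Open the descriptors before the limit is lowered. */
    if (pipe(pipefd) < 0)
        return -1;

    if (getrlimit(RLIMIT_NOFILE, &saved) < 0) {
        close(pipefd[0]);
        close(pipefd[1]);
        return -1;
    }

    /* Only the soft limit changes, so it can be raised again later. */
    lowered = saved;
    lowered.rlim_cur = limit;
    if (setrlimit(RLIMIT_NOFILE, &lowered) < 0) {
        close(pipefd[0]);
        close(pipefd[1]);
        return -1;
    }

    fds[0].fd = pipefd[0];
    fds[0].events = POLLIN;
    fds[0].revents = 0;
    fds[1].fd = pipefd[1];
    fds[1].events = POLLOUT;
    fds[1].revents = 0;

    /* nfds is 2: Linux rejects this with EINVAL when the limit is below 2. */
    if (poll(fds, 2, 0) < 0)
        err = errno;

    /* Restore the original limit and clean up. */
    (void)setrlimit(RLIMIT_NOFILE, &saved);
    close(pipefd[0]);
    close(pipefd[1]);

    return err;
}
```

On Linux, calling this with a limit of 1 should report EINVAL, while a limit of 2 lets the same poll() call through. The descriptors themselves were opened before the limit changed, so the failure comes from the count passed to poll(), not from opening new files.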

The highest fd can be found at runtime by reading /dev/fd. I will take a look. I may also disable the process restrictions on Linux by default:

diff --git a/c_src/Makefile b/c_src/Makefile
index 3a7f0fd..f7f77aa 100644
--- a/c_src/Makefile
+++ b/c_src/Makefile
@@ -49,7 +49,7 @@ else ifeq ($(UNAME_SYS), SunOS)
        CFLAGS += -std=c99 -D_POSIX_C_SOURCE=200112L -D__EXTENSIONS__=1
 endif

-RESTRICT_PROCESS ?= rlimit
+RESTRICT_PROCESS ?= null
 EPCAP_RLIMIT_NOFILES ?= 1

 EPCAP_CFLAGS ?= -g -Wall -fwrapv
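
For reference, finding the highest fd by reading /dev/fd, as mentioned above, could look roughly like the sketch below. It is only an illustration of the idea and not what epcap does; highest_open_fd is a hypothetical name, and it assumes /dev/fd is available (as it normally is on Linux and the BSDs).

```c
#include <dirent.h>
#include <stdlib.h>

/*
 * Hypothetical helper, not part of epcap.
 *
 * Scans /dev/fd and returns the highest open descriptor number,
 * ignoring the descriptor used for the directory scan itself.
 * Returns -1 if /dev/fd cannot be opened or holds no other entries.
 */
int highest_open_fd(void)
{
    DIR *dir;
    struct dirent *de;
    int self;
    int maxfd = -1;

    dir = opendir("/dev/fd");
    if (dir == NULL)
        return -1;

    /* The scan holds a descriptor of its own; do not count it. */
    self = dirfd(dir);

    while ((de = readdir(dir)) != NULL) {
        char *end;
        long n = strtol(de->d_name, &end, 10);

        /* Skip "." and ".." and anything that is not a plain number. */
        if (end == de->d_name || *end != '\0')
            continue;

        if (n == self)
            continue;

        if (n > maxfd)
            maxfd = (int)n;
    }

    closedir(dir);
    return maxfd;
}
```

A caller could then set RLIMIT_NOFILE to the returned value plus one after libpcap has opened its descriptors, instead of a fixed value of 1. That keeps any poll() over the already open descriptors within the limit while still preventing new descriptors from being opened.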