Open dm9bbadd4 opened 4 months ago
I don't think changing the limit is the way to go: sslh uses 2 descriptors per connection, and as you say, on a personal setup the limits should be high enough. I would rather look at why these connections are open: maybe there is a firewall somewhere that prevents proper closure?
Check what is open with lsof | grep sslh
(for reference, my own personal setup has about 100 descriptors open)
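For a quick count and a breakdown by the last lsof column (TCP state / file type), something along these lines should work; a sketch, assuming a single sslh process so pidof returns one PID:
sudo lsof -p "$(pidof sslh)" | wc -l
# group by last column to spot e.g. CLOSE_WAIT pile-ups
sudo lsof -p "$(pidof sslh)" | awk '{print $NF}' | sort | uniq -c | sort -rn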
As in a firewall running on the system? The only one I use is ufw. My setup was working fine for ages, and then updating to Ubuntu 24 LTS just caused a whole world of issues with sslh in particular. I ran the command and there are only 33 open files right now:
lsof | grep sslh
sslh 226566 sslh rtd DIR 8,34 4096 2 /
sslh 226566 sslh txt REG 8,34 549656 6423611 /usr/sbin/sslh
sslh 226566 sslh mem REG 8,34 149760 6423674 /usr/lib/x86_64-linux-gnu/libgpg-error.so.0.34.0
sslh 226566 sslh mem REG 8,34 1340976 6425542 /usr/lib/x86_64-linux-gnu/libgcrypt.so.20.4.3
sslh 226566 sslh mem REG 8,34 2125328 6423108 /usr/lib/x86_64-linux-gnu/libc.so.6
sslh 226566 sslh mem REG 8,34 755864 6428384 /usr/lib/x86_64-linux-gnu/libzstd.so.1.5.5
sslh 226566 sslh mem REG 8,34 202904 6428531 /usr/lib/x86_64-linux-gnu/liblzma.so.5.4.5
sslh 226566 sslh mem REG 8,34 137440 6424269 /usr/lib/x86_64-linux-gnu/liblz4.so.1.9.4
sslh 226566 sslh mem REG 8,34 67584 6427452 /usr/lib/x86_64-linux-gnu/libev.so.4.0.0
sslh 226566 sslh mem REG 8,34 910592 6423085 /usr/lib/x86_64-linux-gnu/libsystemd.so.0.38.0
sslh 226566 sslh mem REG 8,34 51536 6428380 /usr/lib/x86_64-linux-gnu/libcap.so.2.66
sslh 226566 sslh mem REG 8,34 51584 6434557 /usr/lib/x86_64-linux-gnu/libconfig.so.9.2.0
sslh 226566 sslh mem REG 8,34 44064 6423398 /usr/lib/x86_64-linux-gnu/libwrap.so.0.7.6
sslh 226566 sslh mem REG 8,34 625344 6430550 /usr/lib/x86_64-linux-gnu/libpcre2-8.so.0.11.2
sslh 226566 sslh mem REG 8,34 236616 6423087 /usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2
sslh 226566 sslh 0r CHR 1,3 0t0 5 /dev/null
sslh 226566 sslh 1u unix 0xffff9f384bc8e000 0t0 228018983 type=STREAM (CONNECTED)
sslh 226566 sslh 2u unix 0xffff9f384bc8e000 0t0 228018983 type=STREAM (CONNECTED)
sslh 226566 sslh 3u IPv4 227718734 0t0 TCP :https (LISTEN)
sslh 226566 sslh 4u IPv4 227718737 0t0 UDP :https
sslh 226566 sslh 5u a_inode 0,15 0 1073 [eventpoll:3,4,6,8,9,221,492,493,495,527,625,632]
sslh 226566 sslh 6u a_inode 0,15 0 1073 [eventfd:13]
sslh 226566 sslh 7u IPv4 232477516 0t0 UDP :35176
sslh 226566 sslh 8u IPv4 232873671 0t0 TCP 192.168.0.7:https->[removed] (ESTABLISHED)
sslh 226566 sslh 9u IPv4 232873674 0t0 TCP localhost:39574->localhost:441 (ESTABLISHED)
sslh 226566 sslh 221u IPv4 230410684 0t0 TCP 192.168.0.7:https->[removed] (ESTABLISHED)
sslh 226566 sslh 492u IPv4 230415619 0t0 TCP 192.168.0.7:https->[removed] (ESTABLISHED)
sslh 226566 sslh 493u IPv4 230414994 0t0 TCP 192.168.0.7:https->[removed] (ESTABLISHED)
sslh 226566 sslh 495u IPv4 230415626 0t0 TCP 192.168.0.7:https->[removed] (ESTABLISHED)
sslh 226566 sslh 527u IPv4 230415764 0t0 TCP 192.168.0.7:https->[removed] (ESTABLISHED)
sslh 226566 sslh 625u IPv4 230561327 0t0 TCP 192.168.0.7:https->[removed] (ESTABLISHED)
sslh 226566 sslh 632u IPv4 230560766 0t0 TCP 192.168.0.7:https->[removed] (ESTABLISHED)
Ok so it's... not too many open files :) Can you post the entire error message? I would expect it to contain the file and line number that generated the error.
Next time I get the error I'll post it here. It seems to happen every few days.
Error happened again. I don't know how helpful these logs will be.
lsof | grep sslh
sslh 226566 sslh 491u IPv4 235984574 0t0 TCP 192.168.0.7:https->[removed]:54539 (CLOSE_WAIT)
sslh 226566 sslh 492u IPv4 230415619 0t0 TCP 192.168.0.7:https->[removed]:60759 (ESTABLISHED)
sslh 226566 sslh 493u IPv4 230414994 0t0 TCP 192.168.0.7:https->[removed]:55735 (ESTABLISHED)
sslh 226566 sslh 494u IPv4 235984585 0t0 TCP 192.168.0.7:https->[removed]:55273 (CLOSE_WAIT)
sslh 226566 sslh 495u IPv4 230415626 0t0 TCP 192.168.0.7:https->[removed]:38757 (ESTABLISHED)
sslh 226566 sslh 496u IPv4 235984596 0t0 TCP 192.168.0.7:https->[removed]:54639 (CLOSE_WAIT)
sslh 226566 sslh 497u IPv4 235984607 0t0 TCP 192.168.0.7:https->[removed]:56965 (CLOSE_WAIT)
sslh 226566 sslh 498u IPv4 235984613 0t0 TCP 192.168.0.7:https->[removed]:38565 (CLOSE_WAIT)
sslh 226566 sslh 499u IPv4 235984629 0t0 TCP 192.168.0.7:https->[removed]:12335 (CLOSE_WAIT)
sslh 226566 sslh 500u IPv4 235984645 0t0 TCP 192.168.0.7:https->[removed]:41167 (CLOSE_WAIT)
sslh 226566 sslh 501u IPv4 235986652 0t0 TCP 192.168.0.7:https->[removed]:24169 (CLOSE_WAIT)
sslh 226566 sslh 502u IPv4 235987051 0t0 TCP 192.168.0.7:https->[removed]:64383 (CLOSE_WAIT)
sslh 226566 sslh 503u IPv4 235987069 0t0 TCP 192.168.0.7:https->[removed]:57275 (CLOSE_WAIT)
strace
write(2, "tcp-listener.c:131:accept:24:Too"..., 49) = 49
socket(AF_UNIX, SOCK_DGRAM|SOCK_CLOEXEC, 0) = -1 EMFILE (Too many open files)
openat(AT_FDCWD, "/dev/console", O_WRONLY|O_NOCTTY|O_CLOEXEC) = -1 EMFILE (Too many open files)
epoll_wait(5, [{events=EPOLLIN, data={u32=3, u64=167503724547}}], 1704, 59743) = 1
accept(3, NULL, NULL) = -1 EMFILE (Too many open files)
write(2, "tcp-listener.c:131:accept:24:Too"..., 49) = 49
socket(AF_UNIX, SOCK_DGRAM|SOCK_CLOEXEC, 0) = -1 EMFILE (Too many open files)
openat(AT_FDCWD, "/dev/console", O_WRONLY|O_NOCTTY|O_CLOEXEC) = -1 EMFILE (Too many open files)
epoll_wait(5, [{events=EPOLLIN, data={u32=3, u64=167503724547}}], 1704, 59743) = 1
accept(3, NULL, NULL) = -1 EMFILE (Too many open files)
write(2, "tcp-listener.c:131:accept:24:Too"..., 49) = 49
socket(AF_UNIX, SOCK_DGRAM|SOCK_CLOEXEC, 0) = -1 EMFILE (Too many open files)
openat(AT_FDCWD, "/dev/console", O_WRONLY|O_NOCTTY|O_CLOEXEC) = -1 EMFILE (Too many open files)
epoll_wait(5, [{events=EPOLLIN, data={u32=3, u64=167503724547}}], 1704, 59743) = 1
accept(3, NULL, NULL) = -1 EMFILE (Too many open files)
write(2, "tcp-listener.c:131:accept:24:Too"..., 49) = 49
socket(AF_UNIX, SOCK_DGRAM|SOCK_CLOEXEC, 0) = -1 EMFILE (Too many open files)
openat(AT_FDCWD, "/dev/console", O_WRONLY|O_NOCTTY|O_CLOEXEC) = -1 EMFILE (Too many open files)
epoll_wait(5, [{events=EPOLLIN, data={u32=3, u64=167503724547}}], 1704, 59743) = 1
accept(3, NULL, NULL) = -1 EMFILE (Too many open files)
write(2, "tcp-listener.c:131:accept:24:Too"..., 49) = 49
socket(AF_UNIX, SOCK_DGRAM|SOCK_CLOEXEC, 0) = -1 EMFILE (Too many open files)
openat(AT_FDCWD, "/dev/console", O_WRONLY|O_NOCTTY|O_CLOEXEC) = -1 EMFILE (Too many open files)
epoll_wait(5, [{events=EPOLLIN, data={u32=3, u64=167503724547}}], 1704, 59743) = 1
^Caccept(3, NULL, NULLstrace: Process 226566 detached
<detached ...>
Can you go back a bit further in the strace to see which file descriptors were opened last?
Either way it doesn't look quite right. Is it possible an unrelated process is eating all the fds? (Check with lsof without the grep.)
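If it is an unrelated process, a rough per-process fd count (highest first) can narrow it down; a sketch, run as root since /proc/PID/fd is only readable by the owner or root:
for p in /proc/[0-9]*; do
  echo "$(ls "$p/fd" 2>/dev/null | wc -l) $p"
done | sort -rn | head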
What does your configuration look like? How many targets? Any redirects? Just asking out of curiosity, as I ran into this message during a test, but there a message like that was to be expected.
My configuration looked like: sslh:1->nginx:2->sslh:3->nginx:4->sslh:5->nginx:6... up to nginx:40 as the final destination. And then I threw around 25 parallel connections at that construct.
I was trying to run strace writing to a log file until it errored, but it stopped writing after a while. My config is pretty small: one UDP and one TCP port (443) being listened on, and only 4 redirects, one of them to nginx. Again, this issue only started after I updated to Ubuntu Server 24 LTS.
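For reference, a typical way to keep strace attached and logging to a file until the error shows up (the PID lookup and log path here are placeholders):
# -f follows children, -tt adds timestamps so the log shows when fds were opened
sudo strace -f -tt -p "$(pidof sslh)" -o /tmp/sslh.strace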
Have a look at #468
Can you tell me details of the system and how you compiled sslh?
Give me the output of ls -lad /proc/XXXXX/fd/*, where XXXXX is the PID of the leading sslh process.
Furthermore, if you have self-compiled sslh, the output of gcc --version
When running ls -lad /proc/[SSLH-PID]/fd/*
I get ls: cannot access '/proc/393573/fd/*': No such file or directory
whether the too many open files error is happening or not.
The version of SSLH I am using is sslh-ev head-2024-07-08
and the makefile options are:
ENABLE_SANITIZER= # Enable ASAN/LSAN/UBSAN
ENABLE_REGEX=1 # Enable regex probes
USELIBCONFIG=1 # Use libconfig? (necessary to use configuration files)
USELIBWRAP?=1 # Use libwrap?
USELIBCAP=1 # Use libcap?
USESYSTEMD=1 # Make use of systemd socket activation
USELIBBSD?= # Use libbsd (needed to update process name in `ps`)
COV_TEST= # Perform test coverage?
PREFIX?=/usr
BINDIR?=$(PREFIX)/sbin
MANDIR?=$(PREFIX)/share/man/man8
gcc --version outputs:
gcc (Ubuntu 13.2.0-23ubuntu4) 13.2.0
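For reference, with the options above a from-source build of the -ev variant would look roughly like this (a sketch, assuming the stock Makefile, which provides an sslh-ev target, and the libev/libconfig development headers installed):
make sslh-ev USELIBCONFIG=1 USELIBWRAP=1 USELIBCAP=1 USESYSTEMD=1 ENABLE_REGEX=1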
Can you double-check and do the ls as root/sudo? There MUST be file handles available for each running process. I figured out that there is some (as yet unidentified) issue with newer compile chains, where each running sslh process has six additional file handles open. You already answered my question with the sslh version, as it shows you are running a self-compiled sslh. When I compiled it under Ubuntu 24.04 I had this issue. I bet you will see those file handles (maybe with other fd ids) when doing the ls right.
l--------- 1 root root 64 24. Aug 14:42 /proc/14290/fd/6 -> /lib
l--------- 1 root root 64 24. Aug 14:42 /proc/14290/fd/7 -> /usr/lib
l--------- 1 root root 64 24. Aug 14:42 /proc/14290/fd/8 -> /etc/ld.so.cache
l--------- 1 root root 64 24. Aug 14:42 /proc/14290/fd/9 -> /etc/hosts
l--------- 1 root root 64 24. Aug 14:42 /proc/14290/fd/10 -> /run/resolvconf/resolv.conf
l--------- 1 root root 64 24. Aug 14:42 /proc/14290/fd/11 -> /etc/nsswitch.conf
And that could explain your issue, if sslh has 10 or 11 handles open instead of 4 or 5! I am trying to dig into the issue, but currently I am clueless. It must have something to do with DNS lookup.
If you see those handles, a possible workaround could be compiling sslh on a system with an older gcc chain. For me it worked, for example, under Debian Bullseye with gcc (Debian 10.2.1-6) 10.2.1 20210110.
Also OK: compiling under Ubuntu 22.04 (5.15.0-119-generic) with gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0.
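A simple way to check whether a given build is affected, and whether the count creeps up over time (a sketch; adjust the interval to taste):
sudo watch -n 60 'ls /proc/$(pidof sslh)/fd | wc -l'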
Running sudo ls -lad /proc/[SSLH-PID]/fd/*
still gives the same error (presumably because the shell expands the * glob before sudo runs, as a user who cannot read that directory). Running sudo ls -la /proc/1285/fd/
shows:
total 0
dr-x------ 2 sslh sslh 10 Aug 24 23:04 .
dr-xr-xr-x 9 sslh sslh 0 Aug 24 23:04 ..
lr-x------ 1 sslh sslh 64 Aug 24 23:04 0 -> /dev/null
lrwx------ 1 sslh sslh 64 Aug 24 23:04 1 -> 'socket:[13404]'
lrwx------ 1 sslh sslh 64 Aug 24 23:04 2 -> 'socket:[13404]'
lrwx------ 1 sslh sslh 64 Aug 24 23:04 3 -> 'socket:[10883]'
lrwx------ 1 sslh sslh 64 Aug 24 23:04 4 -> 'socket:[13568]'
lrwx------ 1 sslh sslh 64 Aug 25 16:57 5 -> 'anon_inode:[eventpoll]'
lrwx------ 1 sslh sslh 64 Aug 25 16:57 6 -> 'anon_inode:[eventfd]'
lrwx------ 1 sslh sslh 64 Aug 25 16:57 7 -> 'socket:[1763451]'
lrwx------ 1 sslh sslh 64 Aug 25 16:57 8 -> 'socket:[1763454]'
lrwx------ 1 sslh sslh 64 Aug 24 23:04 9 -> 'socket:[1460918]'
I don't have another system I could compile this on, unfortunately. If you were to share your compiled version, that might help, but I don't know if that's allowed.
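As an aside, the socket:[inode] entries in a listing like the one above can be mapped back to actual connections with ss from iproute2; its -e flag prints the inode, e.g. (inode number taken from the listing above):
sudo ss -tunape | grep 'ino:13404'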
I found the cause of the file handle problem I detected. It was a side effect of landlock, so unfortunately not your problem; it only reminded me of your issue. And as you are using the -ev version, this effect would have impacted you far less strongly than it impacted the -fork users. So sorry for you, but it was worth digging in here as well.
If anyone else is experiencing this issue, my workaround is to restart sslh every day by editing the service file (Ubuntu) and setting:
[Service]
Restart=always
RuntimeMaxSec=1d
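If you go that route, a drop-in override keeps the packaged unit file intact; roughly:
sudo systemctl edit sslh    # add the [Service] lines above to the override
sudo systemctl restart sslh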
This is a new issue I've got recently after updating to Ubuntu 24. After running sslh for just a few days, the open files quickly reach the limit of 1025. I'm just running a simple home server, and it's obviously not receiving 1025 concurrent connections, so I don't know why it has so many files open. I've tried raising this limit for sslh to an arbitrary amount, but it doesn't seem to matter. Checking the soft limit for user sslh shows that it is way more than 1025:
sudo su - sslh -s /bin/bash -c "ulimit -Sn"
Output: 100000
I've modified /etc/security/limits.conf
for user sslh to get to 100000. Since the last time this happened (yesterday), I've restarted the service, so I can't strace it, but next time it happens I will update this issue with the strace.
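One caveat that may explain why raising the limit "doesn't seem to matter": limits.conf applies to PAM login sessions, not to systemd services, so the running daemon may still be at the default. The effective limit of the live process can be checked directly (PID lookup is a placeholder), and for a systemd unit it is LimitNOFILE= under [Service] that actually raises it:
sudo prlimit --pid "$(pidof sslh)" --nofile
# or read the kernel's view of the running process
sudo grep 'open files' /proc/"$(pidof sslh)"/limits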