litespeedtech / openlitespeed

Our high-performance, lightweight, open source HTTP server
https://openlitespeed.org
GNU General Public License v3.0

OpenLiteSpeed and possible file descriptor leak #188

Closed franciscopaniskaseker closed 3 years ago

franciscopaniskaseker commented 4 years ago

I have one customer whose server has reached the file descriptor limit three times. Each time I increased the limit to try to identify which user was responsible, since we host a lot of sites with very low traffic.

Today I discovered that openlitespeed, running under the nobody user, is leaking file descriptors. Here is the file descriptor count for that user over the last 12 hours:

nobody:7783
nobody:19223
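
To confirm that the count keeps growing rather than just spiking, it can help to sample one process at a fixed interval. The following is only a sketch: `sample_fds` is a hypothetical helper name, it assumes a Linux `/proc` filesystem, and it needs root (or the same user) to read another process's `fd` directory.

```shell
#!/bin/sh
# sample_fds PID INTERVAL COUNT (hypothetical helper)
# Prints "TIMESTAMP FD_COUNT" COUNT times, INTERVAL seconds apart.
sample_fds() {
    i=0
    while [ "$i" -lt "$3" ]; do
        # Each entry in /proc/PID/fd is one open descriptor.
        printf '%s %s\n' "$(date '+%Y-%m-%dT%H:%M:%S')" \
            "$(ls "/proc/$1/fd" 2>/dev/null | wc -l)"
        i=$((i + 1))
        # Skip the sleep after the final sample.
        if [ "$i" -lt "$3" ]; then
            sleep "$2"
        fi
    done
}
```

For example, `sample_fds 6366 600 72` would log one line every 10 minutes for 12 hours; a count that only ever rises is consistent with a leak.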

The openlitespeed server is the only service running under the nobody user.

In stderr.log I am seeing a lot of messages like this:

2020-07-06 10:38:57.926 [STDERR] [480752] Reached max children process limit: 10, extra: 3, current: 13, busy: 11, please increase LSAPI_CHILDREN.

That is really strange, because the server has very low traffic, so a limit of 10 children should be enough. I will increase it anyway. One of my questions is whether this problem is related to the file descriptor leak.

In error.log I only see some MySQL database errors, which cannot be related to the file descriptor leak.

Some server details: CentOS Linux release 7.8.2003 (Core), openlitespeed-1.6.13-1.el7.x86_64

I can update openlitespeed; however, the changelog of the newest version does not seem to contain anything related to my problem ( https://openlitespeed.org/release-log/version-1-6-x/ ).

How can I debug this problem?

litespeedtech commented 4 years ago

This is the first time I have seen such an issue. If you create a ticket, our support team can take a look and check whether this is normal or not. Thanks, David

franciscopaniskaseker commented 4 years ago

It happened again today, and again it was the nobody user.

[root@mail home]# users=$(awk -F: '{ print $1}' /etc/passwd)
[root@mail home]# for user in $users; do  echo -n "${user}:"; lsof -u $user | wc -l; done
nobody:100201
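
Note that `lsof -u USER | wc -l` also counts the header line and non-descriptor entries such as `cwd`, `txt`, and memory-mapped files, so it tends to overstate the number of open descriptors. A rough alternative is to count the entries in `/proc/PID/fd` for each of the user's processes. This is a sketch, not a definitive tool: `count_fds` and `fds_by_user` are hypothetical helper names, and it assumes Linux `/proc`, `pgrep` from procps, and root privileges.

```shell
#!/bin/sh
# count_fds PID (hypothetical helper)
# Prints the number of open descriptors for one process.
count_fds() {
    ls "/proc/$1/fd" 2>/dev/null | wc -l
}

# fds_by_user USER (hypothetical helper)
# Prints "PID COUNT COMMAND" for each process of USER, busiest first.
fds_by_user() {
    for pid in $(pgrep -u "$1"); do
        printf '%s %s %s\n' "$pid" "$(count_fds "$pid")" \
            "$(cat "/proc/$pid/comm" 2>/dev/null)"
    done | sort -k2,2nr
}
```

Running `fds_by_user nobody` would then show which individual process holds the descriptors, instead of a single per-user total.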

I opened a ticket about this (a long time ago), but got no answer.

It may be a problem with one of the sites; however, openlitespeed should not go down because of a single site's fault. All sites are getting 503 errors due to a "too many open files" error.

franciscopaniskaseker commented 4 years ago

Output from both openlitespeed threads running under the nobody user:

[root@mail fd]# ls -l | wc -l
49999
[root@mail fd]# pwd
/proc/6366/fd
[root@mail fd]# cd /proc/6367/fd/
[root@mail fd]# ls -l | wc -l
50000
[root@mail fd]# 
[root@mail fd]# ls -l /proc/6367/fd/15766 
lrwx------. 1 nobody nobody 64 Sep  8 11:02 /proc/6367/fd/15766 -> socket:[757348056]
[root@mail fd]# ls -l /proc/6367/fd/1692
lrwx------. 1 nobody nobody 64 Sep  8 11:02 /proc/6367/fd/1692 -> socket:[741016970]
[root@mail fd]# 
[root@mail fd]# lsof -n -l +D '/usr/local/lsws/cachedata' | wc -l
13
[root@mail fd]# 
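
Since the two descriptors inspected above are both sockets, it may be worth checking whether sockets account for most of the total. The sketch below groups the link targets in `/proc/PID/fd` by kind; `fd_types` is a hypothetical helper name, and it assumes Linux `/proc`, a `sed` that supports `-E`, and root privileges.

```shell
#!/bin/sh
# fd_types PID (hypothetical helper)
# Prints a count of open descriptors per kind: socket, pipe, anon_inode, file.
fd_types() {
    for fd in /proc/"$1"/fd/*; do
        # Each entry is a symlink such as "socket:[757348056]" or a file path.
        readlink "$fd"
    done 2>/dev/null |
        sed -E -e 's/^(socket|pipe|anon_inode):.*/\1/' -e 's#^/.*#file#' |
        sort | uniq -c | sort -rn
}
```

For example, `fd_types 6367` would show at a glance whether the roughly 50000 descriptors are almost all sockets, which would point at connections that are never closed rather than at cache files.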

Same problem here:

franciscopaniskaseker commented 4 years ago

It happened again today, and I can see the same nobody user holding most of the file descriptors.

litespeedtech commented 3 years ago

Hi,

Does it still happen with the latest release, 1.6.19 or 1.7.8? I may need to look into this.

litespeedtech commented 3 years ago

Please reopen this issue if you still have this problem.

franciscopaniskaseker commented 3 years ago

It seems to be fixed since 1.6.19.

litespeedtech commented 3 years ago

Thanks for the confirmation.