nigoroll opened 3 years ago
The vsmw code makes the assertion, common in Varnish, that there is enough memory available, but also that there are enough disk space and file descriptors available.
I think these assertions are reasonable, unless we were willing to add a lot of new error handling, which I think we should not do.
So while I consider it reasonable to require administrators to configure and monitor their systems appropriately with respect to memory and disk space, I see an issue with the assertions on available file descriptors: the administrator cannot do anything about them. I think the root cause of this issue is that we accept new connections until we run out of file descriptors.
So at this point I think that the only way to handle this issue (and probably related ones which might be lurking in the code) is to reserve some amount of file descriptors.
I am not sure what the best way to implement such a limit would be. The only idea I have so far is to ensure that the number of client connections plus the total number of backend connections stays below getrlimit(RLIMIT_NOFILE) minus a value/percentage of file descriptors configured as reserved for other uses.
(edit: update to better parameter names, clarification)
From discussion with @Dridi and @bsdphk:
https://pubs.opengroup.org/onlinepubs/9699919799/functions/V2_chap02.html#tag_15_14
All functions that open one or more file descriptors shall, unless specified otherwise, atomically allocate the lowest numbered available (that is, not already open in the calling process) file descriptor at the time of each allocation. Where a single function allocates two file descriptors (for example, pipe() or socketpair()), the allocations may be independent and therefore applications should not expect them to have adjacent values or depend on which has the higher value.
The proposed parameter is max_fd_client, defined relative to RLIMIT_NOFILE. We could also have max_fd_backend with the same semantics when we need it, but we would start with max_fd_client only.
These parameters would ensure that any file descriptor above the maximum of both limits would be available for internal housekeeping, vmods and any other code besides client and backend core code.
These limits would not be a guarantee: RLIMIT_NOFILE could change, and the limit would be adjusted with some delay. Except for the question of whether we should impose a delay before we get close to the limit, this ticket is ready for implementation. In the first iteration, the acceptor would just treat any file descriptor above the max as an error, close it, and pace as with other accept errors.
Under high load, with hundreds of thousands of requests serializing on a small number of objects for which the backend returned an error after a long response time, the following panic was observed, triggered by exhaustion of available file descriptors. The panic output has been deliberately shortened to avoid publishing identifiable information, but I am convinced it contains sufficient information to identify and fix the bug.