timotheecour opened this issue 3 years ago
The problems referenced can't be properly solved with something like getOpenFiles. An attempt to do so would be unreliable, as there is no way to prevent file descriptors from being allocated by other threads between check and allocation. Additionally, from a standard library perspective, such a routine would have relatively few correct uses, and a fair number of incorrect uses.
> The problems referenced can't be properly solved with something like getOpenFiles

I disagree: an application can use getOpenFiles to gracefully throttle/rate-limit connections and adapt dynamically to traffic and system resource usage. For example, it can refuse a request if number of in-flight connections * max fds per connection + getOpenFiles.len > fdlimit, or apply other internal application logic.
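As a sketch of that rate-limiting check: the names below (`MAX_FDS_PER_CONNECTION`, `open_fd_count`, `should_accept_request`) are hypothetical stand-ins for the proposed API, and the descriptor count assumes a Linux /proc filesystem:

```python
import os
import resource

# Hypothetical budget, not part of any real API: the most descriptors a
# single connection is expected to hold at once.
MAX_FDS_PER_CONNECTION = 4

def open_fd_count() -> int:
    # Stand-in for getOpenFiles.len on Linux: count entries in /proc/self/fd.
    return len(os.listdir("/proc/self/fd"))

def should_accept_request(in_flight_connections: int) -> bool:
    # Refuse new work when accepting it could push the process past its
    # soft descriptor limit, following the formula quoted above.
    soft_limit, _hard_limit = resource.getrlimit(resource.RLIMIT_NOFILE)
    projected = in_flight_connections * MAX_FDS_PER_CONNECTION + open_fd_count()
    return projected <= soft_limit
```

The point of checking before accepting, rather than after, is that a refused request can be answered with a clean "busy" response instead of failing mid-transaction.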
> The proper solution in this case is to handle the case where the file descriptor limit has been reached.
You have to do that anyway, but it typically means aborting in-flight connections, and that's worse than rate limiting as limits approach, or refusing new connections, as illustrated in https://github.com/nim-lang/Nim/issues/18161#issuecomment-855295197. Fewer chances of broken transactions and data corruption.
> silently handling file descriptor exhaustion. If I am administering a web service, I want to know when system resources are being exhausted.
Not sure how that relates to this RFC; you can use getOpenFiles for server health reporting.
> I disagree: an application can use getOpenFiles to gracefully throttle/rate-limit connections and adapt dynamically to traffic and system resource usage. For example, it can refuse a request if number of in-flight connections * max fds per connection + getOpenFiles.len > fdlimit, or apply other internal application logic.
Great, but that isn't the problem in any of those referenced issues. It's a feature, and not a very practical one.
In order for such a feature to work:

- Each in-flight request would need to use approximately the same number of file descriptors
The result of these limitations would be an overly complex feature that provides little functionality and is tiresome to use. If my application is running out of file descriptors, I either need to stand up a new server, or fix a bug. That's it. This is speaking from experience as someone who has developed backend services for a living.
More on-topic, the problem in those issues is that the async framework isn't handling reaching file descriptor limits correctly. It just raises an exception outside the running async routines, rather than redirecting it into them (or at least, that's the issue that shouldAcceptConnections tried to fix).
> You have to do that anyway, but it typically means aborting in-flight connections, and that's worse than rate limiting as limits approach, or refusing new connections, as illustrated in nim-lang/Nim#18161 (comment). Fewer chances of broken transactions and data corruption.
> Each in-flight request would need to use approximately the same number of file descriptors
If all you can come up with is a minimum number of descriptors, that's already very good; the average doesn't matter. And the minimum can easily be 1 if you don't know, or cannot know, the real figure.
Btw, I downvoted because bugfixes should not imply new APIs. It helps if an API is only used internally; otherwise the surface area of the Nim stdlib gets bigger and bigger. Not sustainable.
> It helps if an API is only used internally
This is precisely why we have lib/std/private/: it allows the compiler/stdlib/tools/etc to have their own private modules that mature/improve over time without fear of breaking backward compatibility. If/when a module is deemed useful enough for more exposure, it can migrate to std/dist/stdx as needed.
So, I re-iterate this RFC with lib/std/private/osutils.nim (i.e., private for now), which can be used inside shouldAcceptRequest.
> So, I re-iterate this RFC with lib/std/private/osutils.nim (i.e., private for now), which can be used inside shouldAcceptRequest.
Good, but then you don't need an RFC at all. ;-) More importantly, however, I don't know if the shouldAcceptRequest logic should stay, since it's only us two who value the mechanism.
As @mrhdias mentioned in https://github.com/nim-lang/Nim/pull/18205#issuecomment-859721437, python's psutil module has the ingredients for that on all supported OSs (see https://psutil.readthedocs.io/en/latest/#psutil.Process.num_fds: num_fds for unix-like platforms, num_handles for windows):
```python
import psutil
p = psutil.Process()
print(p.num_fds())
```
on linux, it uses the same approach as in this RFC, via:

```python
return len(os.listdir("%s/%s/fd" % (self._procfs_path, self.pid)))
```
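The same /proc-based count can be reproduced with nothing but the standard library; a minimal sketch, assuming a Linux /proc filesystem:

```python
import os

def count_open_fds() -> int:
    # Same approach as psutil's Linux num_fds(): count the entries in
    # /proc/self/fd. The directory scan itself briefly opens one extra
    # descriptor, which is included in the count; since that overhead is
    # identical on every call, differences between calls stay meaningful.
    return len(os.listdir("/proc/self/fd"))

# Opening one file raises the count by exactly one.
f = open("/dev/null")
with_file = count_open_fds()
f.close()
assert with_file == count_open_fds() + 1
```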
On other OSs, IIUC, it avoids shelling out; on OSX in particular it avoids shelling out to lsof -p $pid -F nt -X (whether shelling out is actually a problem is unclear; at least performance-wise it seems fine, refs https://github.com/nim-lang/Nim/pull/18205#issuecomment-855522836).
psutil has a lot of generally useful APIs, and yes, we don't want to expose something in the stdlib until it matures a bit, so std/private/psutils or std/private/osutils would be the place to start.
proposal

cross-platform getOpenFiles to retrieve the opened files for a given process.

example use case

this can be used to implement a robust activeDescriptors and shouldAcceptRequest, as follows; in fact this is what I implemented in https://github.com/timotheecour/Nim/pull/750, for linux + osx.

linux

list /proc/self/fd using C API calls to opendir, readdir, closedir (instead of ls -l /proc/<pid>/fd), eg https://stackoverflow.com/questions/1121383/counting-the-number-of-files-in-a-directory-using-c?noredirect=1&lq=1

windows

people more familiar with windows could expand on that; GetProcessHandleCount

osx

lsof
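A hedged sketch of what the Linux branch of the proposed getOpenFiles could look like; the Python helper name get_open_files is hypothetical, and a real stdlib version would use the opendir/readdir/closedir C calls directly rather than going through Python:

```python
import os

def get_open_files(pid: str = "self") -> list:
    # Hypothetical sketch of the proposed getOpenFiles, Linux branch only:
    # resolve each /proc/<pid>/fd symlink to the path (or pipe/socket
    # description) it points at.
    fd_dir = f"/proc/{pid}/fd"
    opened = []
    for fd in os.listdir(fd_dir):
        try:
            opened.append(os.readlink(os.path.join(fd_dir, fd)))
        except OSError:
            # The descriptor may have closed between listdir and readlink.
            pass
    return opened
```

activeDescriptors then falls out as the length of that list, and shouldAcceptRequest as a comparison of that length against the process's RLIMIT_NOFILE soft limit.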