nim-lang / RFCs

A repository for your Nim proposals.
135 stars 26 forks source link

`iterator getOpenFiles*(p: Pid): FileEntry` #382

Open timotheecour opened 3 years ago

timotheecour commented 3 years ago

proposal

cross-platform getOpenFiles to retrieve the opened files for a given process

# in std/os or std/osproc or std/osutils:
type OpenFileOpts = enum ... # what fields to retrieve
type FileEntry = object
  fd: int
  path: string
  mode: Mode # w,r,rw
  fileType: FileType # character special, regular, etc

iterator getOpenFiles*(p = getCurrentProcessId(), opts = initOpenFileOpts()): FileEntry

example use case

this can be used to implement a robust activeDescriptors and shouldAcceptRequest, as follows:

proc shouldAcceptRequest(): bool =
  let tot = count(getOpenFiles())
  result = tot + maxDescriptorsPerRequet < getNumOpenFilesMax()

in fact this is what i implemented in https://github.com/timotheecour/Nim/pull/750, for linux

osx

lsof -p $pid -F nt -X

linux

list /proc/self/fd using C api calls to opendir, readdir, closedir (instead of ls -l /proc/<pid>/fd), eg https://stackoverflow.com/questions/1121383/counting-the-number-of-files-in-a-directory-using-c?noredirect=1&lq=1

windows

people more familiar with windows could expand on that

lsof

Varriount commented 3 years ago

The problems referenced can't be properly solved with something like getOpenFiles. An attempt to do so would be unreliable, as there is no way to prevent file descriptors from being allocated by other threads between check and allocation. Additionally, from a standard library perspective, such a routine would have relatively few correct uses, and a fair number of incorrect uses.

https://en.wikipedia.org/wiki/Time-of-check_to_time-of-use

timotheecour commented 3 years ago

The problems referenced can't be properly solved with something like getOpenFiles

I disagree, an application can use getOpenFiles to gracefully throttle/rate limit connections and adapt dynamically to traffic and system resource usage. For example, it can refuse a request if number of in-flight connections * max fd per connection + getOpenFiles.len > fdlmiit, or other internal application logic.

The proper solution in this case is to handle the case where the file descriptor limit has been reached.

you have to do that anyways, but it typically means aborting in-flight connections, and it's worse than rate limiting when approaching limits or refusing new connections, as illustrated in https://github.com/nim-lang/Nim/issues/18161#issuecomment-855295197. Less chances of broken transactions and data corruptions.

silently handling file descriptor exhaustion. If I am administrating a web service, I want to know when system resources are being exhausted.

not sure how it relates to this RFC, you can use getOpenFiles for server health reporting

Varriount commented 3 years ago

I disagree, an application can use getOpenFiles to gracefully throttle/rate limit connections and adapt dynamically to traffic and system resource usage. For example, it can refuse a request if number of in-flight connections * max fd per connection + getOpenFiles.len > fdlmiit, or other internal application logic.

Great, but that isn't the problem in any of those referenced issues. It's a feature, and not a very practical one.
In order for such a feature to work:

The result of these limitations would be an overly complex feature that provides little functionality and is tiresome to use. If my application is running out of file descriptors, I either need to stand up a new server, or fix a bug. That's it. This is speaking from experience as someone who has developed backend services for a living.

More on-topic, the problem in those issues is that the async framework isn't handling reaching file descriptor limits correctly. It just raises an exception outside the running async routines, rather than redirecting it into them (or at least, that's the issue that shouldAcceptConnections tried to fix).

you have to do that anyways, but it typically means aborting in-flight connections, and it's worse than rate limiting when approaching limits or refusing new connections, as illustrated in nim-lang/Nim#18161 (comment). Less chances of broken transactions and data corruptions.

Araq commented 3 years ago

Each in-flight request would need to use approximately the same number of file descriptors

If all you can come up is a minimum of descriptors that's already very good, the average doesn't matter. And the minimum can easily be 1 if you don't know or you cannot know.

Araq commented 3 years ago

Btw I downvoted because bugfixes should not imply new APIs. It helps if an API is only used internally, otherwise the surface area of the Nim stdlib gets bigger and bigger. Not sustainable.

timotheecour commented 3 years ago

It helps if an API is only used internally,

this is precisely why we have lib/std/private/, it allows compiler/stdlib/tools/etc to have its own private modules that mature/improve over time without fear of breaking backward compatibility. If/when it's deemed useful enough for more exposure, it can migrate to std/dist/stdx as needed.

so, I re-iterate this RFC with lib/std/private/osutils.nim (ie, private for now), which can be used inside shouldAcceptRequest

Araq commented 3 years ago

so, I re-iterate this RFC with lib/std/private/osutils.nim (ie, private for now), which can be used inside shouldAcceptRequest

Good but then you don't need an RFC at all. ;-) More importantly however, I don't know if the shouldAcceptRequest logic should stay for it's only us two who value the mechanism.

timotheecour commented 3 years ago

as @mrhdias mentioned in https://github.com/nim-lang/Nim/pull/18205#issuecomment-859721437, python's psutil module has the ingredients for that, on all supported OSs (see https://psutil.readthedocs.io/en/latest/#psutil.Process.num_fds num_fds for unix-like platforms and num_handles for windows):

import psutil
p = psutil.Process()
print(p.num_fds())

on linux, it uses the same approach as in this RFC, via:

return len(os.listdir("%s/%s/fd" % (self._procfs_path, self.pid)))

on other OS, IIUC avoids shelling out on OSX to lsof -p $pid -F nt -X (whether shelling out is actually a problem is unclear, at least performance wise seems fine, refs https://github.com/nim-lang/Nim/pull/18205#issuecomment-855522836)

psutil has a lot of generally useful APIs, and yes, we don't want to expose something to stdlib until it matures a bit, so std/private/psutils or std/private/osutils would be the place to start.

Varriount commented 3 years ago

As I stated here, that psutils isn't part of the Python standard library, likely because retrieving a list of all the file descriptors from a given process is a quite rare requirement.