dtrugman / pfs

Parsing the Linux procfs
Apache License 2.0
99 stars 33 forks source link

Automatically getting sockets for only single process #38

Closed rm5248 closed 11 months ago

rm5248 commented 11 months ago

When using task::get_net, all of the network sockets that exist are returned, not just the ones for the particular process that we are looking at. I'm not sure if this is intentional or not, since /proc/<task>/net is the same as /proc/net. All of the other getters for the task class only get data for that particular task.

It would be nice if pfs could automatically filter what it returns for each particular process, so that you only get the sockets for the particular task that you are looking at. I do this manually, by checking all of the inode numbers and checking if the socket is used by this process or not based off of the inode number.

dtrugman commented 11 months ago

As you mentioned yourself, this is how Linux implements it. When looking at files under /proc/<task>/net you see all the sockets in the same network namespace.

I understand it might be convenient for a user to get all the sockets for a process, like netstat, but the goal of the library is first and foremost imitate the /proc filesystem as closely as possible.

Another significant reason for not adding something like this to the library is that even when implemented perfectly from the user-space, socket enumeration for a process is a racy operation. Since getting all the networks sockets and all the fds is not an atomic operation, things are bound to be missing. The library should not be held responsible for something like that, but rather the user.

At the moment I'm leaning towards adding the implementation as a "sample" tool, so that users can easily re-implement it in their project, while making sure the core library is clean and not opinionated, just like the Linux procfs itself.

Created #40 as a possible resolution.

WDYT?

rm5248 commented 11 months ago

Are you intending to provide "exactly what the kernel provides" or "here's a convenient interface"? The latter can clearly be implemented using the former, the question is does this belong in the library or not? In my mind there's more value in doing both.

My opinion is that it should be part of the library, but perhaps not as part of the task class. Some higher-level class(pfs::LogicalProcess?) could implement these higher-level operations. If there is no higher-level class, at least some utility methods would probably be useful in order to collect various pieces of information(all of the sockets of a process being important for this example). That way the core of the library still does exactly what the kernel provides, but with some common operations defined so as a user you don't need to find and/or re-implement these every time.

Some thoughts:

dtrugman commented 11 months ago

Frankly, I added a similar proposal to my previous comment, but decided to remove it later. Mostly because, as I mentioned, there are many possible permutations to what a user might try and achieve.

If you think of netstat, lsof and ss, the amount of feature flags is just crazy, and not without a reason. I am wondering if it possible to add this functionality without maintaining a huge list of scenario-specific "modes".

I understand that this might be useful for users, and I'm trying to balance both here. Let me try and play with it. I'll try and create a new set of utilities, that would be orthogonal to the procfs class (i.e. not part of the core functionality just like you mentioned), providing convenient interfaces, and update the PR I created. We can talk about the results and find something that works. K?

rm5248 commented 11 months ago

That sounds good to me.

I did think about this a bit over the past few days, have you thought about doing a fluent API? In my particular use-case, I am just looking for the processes that have open sockets and where those sockets are connected to. So it could look something like the following(note: these examples assume there is a higher-level Process class defined here that wraps the pfs::task that would provide some useful getters such as the sockets):

vector<Process> procs = ProcessFinder().withOpenSockets().find();

for( Process proc : procs ){
    for( Socket sock : proc.sockets() ){
         // do something here...
    }
}

But a more complex use-case could be done too. Here are some ideas on what that could look like:

// All processes of a certain EXE:
ProcessFinder().withExeName("/path/to/exe").find();

// All processes that have a certain file open
ProcessFinder().withOpenFile("/tmp/foobar").find();

// Processes of a certain name that have open files
ProcessFinder().withExeName("/path/to/exe").withOpenFile("/tmp/foobar").find();

// Processes that are owned by a particular user and have more than 4096kB of resident memory
// Use a lambda as a method to help filter the processes if there is not already a selector defined
ProcessFinder().ownedByUser(1000).with([](Process proc){ 
    if(proc.proc().get_status().vm_rss > 4096 ) return true; 
    return false; 
} ).find(); 
dtrugman commented 11 months ago

I've spend quite some time, trying to find a solution I'm happy with. Unfortunately, every solution I considered, was still very easy to misuse, and I'll try to explain why.

I considered two approaches: 1) adding an orthogonal utilities class and 2) adding a process / nework finder with a functional builder syntax. Both suffer from the same problems.

First, I'll say that the functional builder style is very nice:

auto procs = proc_finder().with_open_file("/file/in/use").find();

But, it becomes very error prone and very fast.

Problem no. 1: Inconsistency/performance issues due to lack of caching

Let's consider the following example:

auto procs = proc_finder().with_open_file("/file1").with_open_file("/file2");

It looks very innocent, but, in fact, it is not. Every call to with_open_file reads and parses the open FDs again, hence, we can get a huge performance penalty, and face very inconsistent and racy results. The core reason for that, is that all pfs classes are static utilities and do not handle any type of caching. Even a simple operation as reading all the open FDs of a process will call readlink again and again and again.

Problem no. 2: Values are not captured

It becomes even worse when we're considering network enumeration. Let's say, I want to find all the processes with a listening TCP socket. A very nice syntax could be:

auto procs = proc_finder().with_listening_tcp_socket();

But, if you try and list all the sockets of the resulting procs, you won't find any listening socket, because it was already closed. This kind of issues will lead to a debugging hell.

Problem no. 3: Number of permutations

Another problem is the huge number of permutations related to network connections:

Bottom line

I am a big fan of APIs that are "easy to use, and very hard to misuse".

I will try and figure out a nice and generic solution to implementing "cached" versions of "task" and "net", but I don't think there's going to be an easy solution for that, and I'm not sure if users would need that or not. If I ever implement those, adding a proc_finder builder pattern will be easy to implement and a pleasure to use.

In the meantime, I will do the following, which should make it much easier for users to search for sockets:

  1. I will add a method to get all the inode numbers for a task's open FDs
  2. Change the net interface to accept filters. These would allow users to find only sockets matching a precondition. One of those filters can be an "inodes filter", and another could be a socket state, i.e. listening. This way users can search for sockets in a specific state owned by a specific process, while solving caching, consistency and permutation issues.
dtrugman commented 11 months ago

Updated the pull request #40. Feel free to let me know what you think :)

dtrugman commented 11 months ago

Closing after it seems like we reached an agreement over #40.