giampaolo / psutil

Cross-platform lib for process and system monitoring in Python
BSD 3-Clause "New" or "Revised" License

[Linux] Flag to get connections from all network namespaces #1611

Open davereikher opened 4 years ago

davereikher commented 4 years ago

Currently the psutil.net_connections() function on Linux returns only connections found in the same network namespace as the Python process using psutil. That is because in the psutil.Connections.process_inet() method, psutil reads from /proc/net, which is a symlink to /proc/[PID]/net where [PID] is the pid of the python process. According to the network namespaces man page:

Network namespaces provide isolation of the system resources associated with networking: network devices, IPv4 and IPv6 protocol stacks, IP routing tables, firewall rules, the /proc/net directory (which is a symbolic link to /proc/PID/net), the /sys/class/net directory, various files under /proc/sys/net, port numbers (sockets), and so on. In addition, network namespaces isolate the UNIX domain abstract socket namespace (see unix(7)).

I propose adding a boolean keyword argument all_namespaces to net_connections() so that the user can choose between retrieving connections from the current network namespace only or from all network namespaces. If people agree, I am happy to submit a pull request :)

giampaolo commented 4 years ago

I'm not sure I understand what this is for (this is the first time I've heard about network namespaces). What's the use case? What would the PR look like? In general, I'd say I'm against an argument which makes sense on one platform only.

davereikher commented 4 years ago

I stumbled upon this issue while contributing to the Rust clone of psutil. I was working on adding a feature for listing the connections per process, which is currently missing there, and for debugging I used the output of psutil for comparison. I noticed that the Rust code listed pids and associated connections which were missing from the psutil output. I was also unaware of network namespaces back then, so I did some digging. Network namespaces are a means in Linux of giving different processes a different view of the network stack. You could, for example, run two different processes with different routing tables, network interfaces, etc. Currently psutil for Linux lists only connections of processes running in the same network namespace as the Python process, for the reason stated in my post above. To show this, you can try the following experiment:

Docker runs processes in its containers in separate network namespaces (I think one namespace per container). Ideally, psutil should show all connections for all processes in the system, so it should show the connections made from your computer to the outside world, even if they originate from within a Docker container (ultimately it boils down to some pid having an associated open connection, which psutil should register). If you SSH somewhere from a terminal outside of a Docker container, psutil will show that connection. However, if you SSH somewhere from within a Docker container, the connection will be invisible to psutil when psutil runs outside the container. It is, however, visible when I run my Rust code, and it's easy to verify that it in fact exists.

To find the process holding the container's open ssh connection, run docker inspect <container id> | grep -i pid outside the container, which gives you the pid of the containerized bash terminal. Then run cat /proc/<pid>/net/tcp, which contains all tcp4 connections in the container's network namespace. Take the inode from there (there should be a single line), then run ls -la /proc/*/fd | grep -B 20 <inode> to find the pid of the containerized ssh process. Go to the corresponding entry in /proc and you'll see a socket file with this inode in the fd folder, meaning that the process has an open connection, yet it won't show up in psutil.
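The manual steps above can be approximated in Python. This is only a sketch, not psutil's code: decode_ipv4, parse_proc_net_tcp and pids_holding_inode are hypothetical helper names, the procfs parameter is an assumption added so the functions can be pointed at a fake tree, and only the IPv4 tcp table is handled.

```python
import os
import socket
import sys


def decode_ipv4(hex_addr, byteorder=sys.byteorder):
    """Turn a field such as '0100007F:0016' into an (ip, port) tuple."""
    ip_hex, port_hex = hex_addr.split(":")
    raw = bytes.fromhex(ip_hex)
    # The kernel prints the address as a native-endian 32-bit integer, so on
    # little-endian machines the bytes appear reversed.
    if byteorder == "little":
        raw = raw[::-1]
    return socket.inet_ntop(socket.AF_INET, raw), int(port_hex, 16)


def parse_proc_net_tcp(text, byteorder=sys.byteorder):
    """Yield (local, remote, inode) for each row of a /proc/<pid>/net/tcp dump."""
    for line in text.splitlines()[1:]:  # the first line is the column header
        fields = line.split()
        if len(fields) < 10:
            continue
        local = decode_ipv4(fields[1], byteorder)
        remote = decode_ipv4(fields[2], byteorder)
        yield local, remote, int(fields[9])  # tenth column is the socket inode


def pids_holding_inode(inode, procfs="/proc"):
    """Return the PIDs that have a socket with this inode among their fds."""
    target = "socket:[%d]" % inode
    pids = []
    for name in os.listdir(procfs):
        if not name.isdigit():
            continue
        fd_dir = os.path.join(procfs, name, "fd")
        try:
            fds = os.listdir(fd_dir)
        except OSError:  # process exited, or we lack permission
            continue
        for fd in fds:
            try:
                if os.readlink(os.path.join(fd_dir, fd)) == target:
                    pids.append(int(name))
                    break
            except OSError:
                continue
    return sorted(pids)
```

On a real system you would feed parse_proc_net_tcp the contents of /proc/<pid>/net/tcp for the containerized pid and pass each resulting inode to pids_holding_inode, typically as root so that other users' fd directories are readable.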

The solution, and thus the pull request, is pretty simple: for TCP, for example, instead of listing connections from /proc/net/tcp we list them from /proc/<pid>/net/tcp for each process, because processes can be located in different namespaces. This covers all existing namespaces.
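A hedged sketch of that idea follows. Instead of reading /proc/<pid>/net/tcp once per process, it groups PIDs by the namespace identity exposed through the /proc/<pid>/ns/net link and reads the table once per namespace. This is an illustration, not the actual patch: group_pids_by_netns and read_tcp_tables are hypothetical names, and the procfs parameter is an assumption added for testability. Reading the ns/net link of another user's process generally requires root.

```python
import os
from collections import defaultdict


def group_pids_by_netns(procfs="/proc"):
    """Map each network namespace id (e.g. 'net:[4026531992]') to its PIDs."""
    groups = defaultdict(list)
    for name in os.listdir(procfs):
        if not name.isdigit():
            continue
        try:
            ns_id = os.readlink(os.path.join(procfs, name, "ns", "net"))
        except OSError:  # process gone, or insufficient privileges
            continue
        groups[ns_id].append(int(name))
    return {ns: sorted(pids) for ns, pids in groups.items()}


def read_tcp_tables(procfs="/proc"):
    """Return {namespace id: raw text of that namespace's tcp table}."""
    tables = {}
    for ns_id, pids in group_pids_by_netns(procfs).items():
        for pid in pids:  # any live PID in the namespace exposes the same table
            try:
                with open(os.path.join(procfs, str(pid), "net", "tcp")) as f:
                    tables[ns_id] = f.read()
                break
            except OSError:
                continue
    return tables
```

Deduplicating by namespace avoids reporting the same connection once for every process that shares the namespace.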

Regarding platforms, I'm not sure about the others, and I don't know whether there is an equivalent of network namespaces on Windows, for example. But if a Linux-specific flag is the concern, we can simply fix this for Linux by default without introducing the flag at all. I believe this is a real issue, since it's not limited to Docker containers but affects any process running in a separate network namespace. This behaviour could be very frustrating for a developer using psutil, especially since I don't believe it is documented.

What do you think?

giampaolo commented 4 years ago

psutil should show all connections for all processes in the system, so it should show the connections made from your computer to the outside world, even if they originate from within a Docker container

I'm not sure I agree with this. I see Docker as no different from VirtualBox or VMWare. If psutil is run in a container / virtual-OS / guest-OS or whatever, then I would expect that the connections of the main "host" OS are not listed.

Note: this can partially be tweaked by setting psutil.PROCFS_PATH to a location different than /proc (e.g. a network mount).

I noticed that the rust code listed pids and associated connections which were missing from the psutil output

psutil.net_connections() does report the PIDs; the only limitation is that some of them are set to None unless running as root (and netstat command has the same limitation BTW).

davereikher commented 4 years ago

I'm not sure I agree with this. I see Docker as no different from VirtualBox or VMWare. If psutil is run in a container / virtual-OS / guest-OS or whatever, then I would expect that the connections of the main "host" OS are not listed.

But if psutil is run outside of, e.g., VirtualBox, wouldn't it be reasonable to assume it would show that there is a connection associated with VirtualBox? Please correct me if I'm wrong; I don't have Windows handy so I can't verify it right now, but I think this might be the case there. If you run VMWare on Windows, connect to the internet from within VMWare and run psutil alongside it, outside of VMWare, wouldn't it detect that connection? If the answer is yes, this behaviour is inconsistent between OSes. I agree that psutil running from within a container should not be aware of anything happening outside the container, but I think that psutil running alongside VMWare/VirtualBox/Docker should be aware of connections made by them.

psutil.net_connections() does report the PIDs; the only limitation is that some of them are set to None unless running as root (and netstat command has the same limitation BTW).

Are you sure? I ran all my tests as root, and a connection made from a Docker container is completely invisible to psutil, while it does appear in /proc/<pid>/net/tcp and /proc/<pid>/fd of that pid. Also, psutil's code accesses /proc/net and not /proc/<pid>/net, which in fact lists only connections in the network namespace of the current Python process (it's a symlink to /proc/<pid of current process>/net), so by definition it can't detect connections outside of its network namespace.

In any case, if you disagree with the change it should at least be mentioned in the documentation, since it might cause confusion for Linux users.

davereikher commented 4 years ago

By the way, I used Docker just as a convenient tool to create a separate network namespace in order to demonstrate this behaviour. I could've manually created a new network namespace and run a process inside it without using Docker at all and psutil run in another network namespace would not detect connections coming from that process. This would result in a situation where the pid of that process is detected by psutil, but the connections the process with that pid makes are hidden from it.

giampaolo commented 4 years ago

Assuming host = the "real" OS and guest = docker, I suppose psutil running on host will list the (internet) connections opened by guest. Have you checked if this is the case? If not and your patch fixes that then I'd say we should do it.

giampaolo commented 4 years ago

...on the other hand, I'd say psutil running on "guest" should NOT list the connections opened by "host".

davereikher commented 4 years ago

Assuming host = the "real" OS and guest = docker, I suppose psutil running on host will list the (internet) connections opened by guest. Have you checked if this is the case? If not and your patch fixes that then I'd say we should do it.

Yes, on Linux, psutil running on the host doesn't list connections opened by the guest and my patch aims to fix that.

...on the other hand, I'd say psutil running on "guest" should NOT list the connections opened by "host".

Agreed.

giampaolo commented 4 years ago

Yes, on Linux, psutil running on the host doesn't list connections opened by the guest and my patch aims to fix that.

OK, let's proceed with a PR then. I only have one doubt: the PIDs of such connections. But let's see the patch first so I can have something to look at.

davereikher commented 4 years ago

I only have one doubt: the PIDs of such connections.

Well, it eventually boils down to a process (not necessarily Docker; it can be any small process) that has a connection associated with it. Because that process is in a separate network namespace, psutil doesn't report the connections associated with its PID. After the patch, the PID of such a connection would be the PID of that small process.

I did some experimenting: processes that run inside Docker are actually visible outside of Docker. For example, you can run sleep 67 inside any Docker container and then run ps -ef | grep "sleep 67" outside the container, and you'll see the process. psutil would also see that process; however, it would not find the connections associated with it, since Docker makes sure those processes run in a separate network namespace. So after the patch, the PIDs associated with the newly discovered connections will be those of the containerized processes.

In the case of VirtualBox/VMWare and the likes, I'm not sure how they manage connections to the outside world, so I cannot say, but that's not the main issue here, which is just the unjustified invisibility of other network namespaces.

OK, let's proceed with a PR then.

I'm writing a test to demonstrate the behavior, so once that's done I'll submit a WIP PR.

OSP123 commented 4 years ago

@davereikher Any progress on the PR? Currently running into the same exact problem on Ubuntu.

davereikher commented 4 years ago

@OSP123 If this functionality is critical you can use commit dfd814a. I tested that it's working by opening an ssh connection from within a docker container and making sure psutil detects it from outside of the container. I still need to add a test or two to finish the PR though.

OSP123 commented 4 years ago

We found a workaround for now, but we would prefer using an actual fix from psutil. We can refactor once the code is merged. Thanks!

mariuspod commented 3 years ago

@OSP123 I've just run into a similar problem when trying to read, from a Docker host, the connections of a process started inside a Docker container:

import psutil

for proc in psutil.process_iter(['pid', 'name', 'username']):
    print(proc.connections())

This returns an empty list for a process running in a Docker container. Running the same process on the host returns the connection tuples.

Maybe that's related?

Torxed commented 2 years ago

This goes for psutil.net_if_addrs() too.

import psutil, pprint

pprint.pprint(psutil.net_if_addrs())

{
    'enp4s0': [...],
    'enp9s0': [...],
    'lo': [...],
    'tap-test': [...],
    'wlan0': [...]
}

But it lacks the ability to retrieve the interfaces of another namespace:

# ip netns exec test ip addr

1: lo: <LOOPBACK> mtu 65536 qdisc noop state DOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: tap0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether d6:ae:3d:07:c6:ec brd ff:ff:ff:ff:ff:ff

I would expect psutil to either show all interfaces or support something like psutil.net_if_addrs(namespace="test"). I get that this is a "Linux only" thing in the example above, but Windows has similar functionality that could pose a problem too, for reference:

To get interface information, you currently have to either avoid psutil or spawn a separate process for each namespace to gather the information you need, rather than being able to "jump around" from a single PID and gather it.
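For illustration, the "separate process per namespace" workaround could look roughly like the sketch below. It is not a psutil feature: addrs_from_ip_json and net_if_addrs_in_namespace are hypothetical names, the approach assumes iproute2 with JSON output support (ip -json) is installed and that the caller is root, and the JSON keys used (ifname, addr_info, family, local) reflect my understanding of that output and should be treated as an assumption.

```python
import json
import subprocess


def addrs_from_ip_json(text):
    """Convert `ip -json addr show` output to {ifname: [(family, address), ...]}."""
    result = {}
    for iface in json.loads(text):
        result[iface["ifname"]] = [
            (info.get("family"), info.get("local"))
            for info in iface.get("addr_info", [])
        ]
    return result


def net_if_addrs_in_namespace(namespace):
    """Spawn `ip` inside a named network namespace and parse its output."""
    completed = subprocess.run(
        ["ip", "netns", "exec", namespace, "ip", "-json", "addr", "show"],
        check=True,
        capture_output=True,
        text=True,
    )
    return addrs_from_ip_json(completed.stdout)
```

Calling net_if_addrs_in_namespace("test") would correspond to the ip netns exec test ip addr command shown above, at the cost of one child process per namespace.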