Support network namespaces

dkerr64 commented 3 years ago

I would find it very useful to monitor interfaces in a specific network namespace. For example, if I have multiple containers, I could run vnstat on the host (possibly as root) and monitor traffic inside the container(s). From the host command line I can already run ip netns exec <netns> cat /sys/class/net/eth0/statistics/rx_bytes etc.

So, if vnstat could accept a network namespace along with the interface name that would be cool...

--iface eth0 --netns <netns>

Thanks!

vergoh commented 3 years ago

I'll have to investigate a little bit more how these network namespaces behave from the host point of view and especially how to query a list with all the relevant interfaces if the host doesn't have direct visibility. Running ip commands isn't an option. Just to be clear, since I don't have any network namespaces in use (that I'm aware of) in the systems I regularly use, this request may not get touched any time soon.

If it's Docker that you are using, it does appear to create, on the fly, new container-specific (and possibly network-namespace-specific) interfaces which vnStat can monitor from the host side.

dkerr64 commented 3 years ago

I cited the ip command just as illustration, I wouldn't expect you to use those. I'm guessing (but have not looked in your source to confirm) that you are simply reading from the network statistics "files" maintained by the kernel.

My use case is LXC (but I suspect it is very similar to Docker). An LXC container runs in its own namespace but has various options to set up networking. If configured to use veth, then a linked virtual device vethXXXXXX is created on the host, to which the container's eth0 (or whatever) is connected. The veth device is in the host network namespace (so vnstat could easily monitor that); eth0 is in the container's network namespace.

But LXC also supports attaching a physical device to the container. So let's say you have a real NIC eth2 on the host: LXC can move that into the container's namespace. As soon as that is done, it cannot be seen from the host namespace. Another scenario is WireGuard VPN, where you can configure a VPN on the host and then move e.g. the wg0 interface into the container's namespace... see here. The container has no idea that the interface is actually a VPN, and once moved it cannot be seen from the host namespace.

As I said, I have not read through the vnstat source, but switching a C thread into another network namespace appears to be fairly easy. Of course, how it might work for vnstat needs some discovery; a separate thread might need to be created for each network namespace that vnstat monitors.

dkerr64 commented 3 years ago

Available network namespaces on a host can be found by ls /var/run/netns. That directory is not populated by default, but anyone messing around with network namespaces will have created the netns inside that directory (pointing to /proc/<PID>/ns/net). I'm not sure vnstat should go to a lot of effort to list all available interfaces inside all network namespaces. The user should be required to provide the name as a parameter. But /var/run/netns is the place to start -- the ip command requires it, so if it is a prerequisite for ip, then it can be one for vnstat as well.

A design issue may be dealing with multiple interfaces of the same name. eth0 can exist in multiple namespaces and all are different. So in vnstat's database the interface name would need to be associated with a namespace.

And to make matters worse... a namespace name could change on every reboot. Bad practice would be to use the PID of the container as the network namespace name in /var/run/netns, because after every reboot the PID for the container could be different, thus messing up the vnstat database history. vnstat cannot be expected to handle this, so when a namespace is set up in /var/run/netns the user really needs to use their own specified name rather than the PID. This is a user responsibility, not vnstat's, but probably something to be documented.

dkerr64 commented 3 years ago

I did a little bit of playing around...

I added -n / --netns and modified --iflist. This is a pure hack (--netns must come before --iflist) but illustrates what is possible. I created an LXC container, set up a WireGuard VPN on the host and moved the wg2 interface into the container; LXC does the move for me because I configured:

lxc.net.1.type = phys
lxc.net.1.link = wg2

So inside the container I have eth0 and wg2; outside on the host I have numerous interfaces. The modification to --iflist is simple... find my current network namespace and save that, switch to the container namespace, list the interfaces, switch back and list interfaces again, and presto...

# ./vnstat --netns test-lxc --iflist
Available interfaces: eth0 (1000 Mbit) wg2
Available interfaces: veth3RRALJ (10000 Mbit) vethK1SVKX (10000 Mbit) wg0 eth1.100 (10000 Mbit) veth9HP75L (10000 Mbit) eth0 (1000 Mbit) wg3 vethPFM51J (10000 Mbit) lxcbr0 eth2 (10000 Mbit) eth1.300 (10000 Mbit) eth1 (10000 Mbit) vethK2BAI3 (10000 Mbit) eth1.200 (10000 Mbit)
#

Now that is the easy part. The hard part will be modifying vnstat to recognize that interfaces belong to a namespace... everywhere vnstat manages an interface by name, it needs to have the namespace attached to it. So eth0 on the host is not the same as, say, test-lxc/eth0.

Here is the code...

--- vnstat-2.6-orig/src/vnstat_func.c   2021-03-04 17:27:54.514183000 -0500
+++ vnstat-2.6/src/vnstat_func.c    2021-03-04 17:14:05.939237670 -0500
@@ -10,6 +10,8 @@
 #include "cfg.h"
 #include "cfgoutput.h"
 #include "vnstat_func.h"
+#define __USE_GNU
+#include <sched.h>

 void initparams(PARAMS *p)
 {
@@ -41,6 +43,7 @@
    p->xmlmode = 'a';
    p->databegin[0] = '\0';
    p->dataend[0] = '\0';
+   p->netns[0] = '\0';
 }

 void showhelp(PARAMS *p)
@@ -160,6 +163,20 @@
                printf("Error: Interface for %s missing.\n", argv[currentarg]);
                exit(EXIT_FAILURE);
            }
+        } else if ((strcmp(argv[currentarg], "-n") == 0) || (strcmp(argv[currentarg], "--netns") == 0)) {
+           if (currentarg + 1 < argc) {
+               if (strlen(argv[currentarg + 1]) > 63) {
+                   printf("Error: Network namespace name is limited to 63 characters.\n");
+                   exit(EXIT_FAILURE);
+               }
+               strncpy_nt(p->netns, argv[currentarg + 1], 64);
+               if (debug)
+                   printf("Used network namespace: \"%s\"\n", p->netns);
+               currentarg++;
+           } else {
+               printf("Error: Network namespace for %s missing.\n", argv[currentarg]);
+               exit(EXIT_FAILURE);
+           }
        } else if (strcmp(argv[currentarg], "--config") == 0) {
            /* config has already been parsed earlier so nothing to do here */
            currentarg++;
@@ -461,6 +478,32 @@
                p->query = atoi(argv[currentarg + 1]);
                currentarg++;
            }
+           if (strlen(p->netns)) {
+               char netnspath[PATH_MAX];
+               int mynetns = 0;
+               int reqnetns = 0;
+               sprintf(netnspath, "/proc/%d/ns/net",getpid());
+               mynetns = open(netnspath, O_RDONLY);
+               if (mynetns < 0) {
+                   printf("Error: Failed to find my own network namespace.\n");
+                   exit(EXIT_FAILURE);
+               }
+               sprintf(netnspath, "/var/run/netns/%s",p->netns);
+               reqnetns = open(netnspath, O_RDONLY);
+               if (reqnetns < 0) {
+                   printf("Error: Failed to find network namespace FD for %s.\n", netnspath);
+                   exit(EXIT_FAILURE);
+               }
+               if (setns(reqnetns, CLONE_NEWNET)) {
+                   printf("Error: Failed to set network namespace for %s.\n", netnspath);
+                   exit(EXIT_FAILURE);
+               }
+               showiflist(p->query);
+               if (setns(mynetns, CLONE_NEWNET)) {
+                   printf("Error: Failed to return to my own network namespace.\n");
+                   exit(EXIT_FAILURE);
+               }
+           }
            showiflist(p->query);
            exit(EXIT_SUCCESS);
        } else if (strcmp(argv[currentarg], "--dbiflist") == 0) {
diff -Naur vnstat-2.6-orig/src/vnstat_func.h vnstat-2.6/src/vnstat_func.h
--- vnstat-2.6-orig/src/vnstat_func.h   2021-03-04 17:27:54.514183000 -0500
+++ vnstat-2.6/src/vnstat_func.h    2021-03-04 08:44:01.466247307 -0500
@@ -10,6 +10,7 @@
    char interface[32], alias[32], newifname[32], filename[512];
    char definterface[32], cfgfile[512], *ifacelist, jsonmode, xmlmode;
    char databegin[18], dataend[18];
+   char netns[64];
 } PARAMS;

 void initparams(PARAMS *p);

Notes...

The netns max length of 64 is arbitrary; I do not know if there is an established standard.
The file descriptor for vnstat's own network namespace can probably be found once and saved.
The file descriptor for the container should not be cached... it will change if the container is stopped/started.
And this is probably very Linux specific. I do not know how namespaces work on other OS platforms.
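To make the file descriptor notes concrete, here is a minimal sketch of the switch-and-return step as a standalone helper. The name netns_run_in and its return codes are made up for illustration and are not part of vnStat. It assumes Linux with glibc, and that the process has enough privilege (CAP_SYS_ADMIN, typically root) for setns() to succeed.

```c
#define _GNU_SOURCE /* must come before any include so <sched.h> declares setns() */
#include <fcntl.h>
#include <sched.h>
#include <unistd.h>

/*
 * Hypothetical helper (not vnStat code): run fn(arg) inside the network
 * namespace referenced by nspath, then switch back.
 *
 * Returns  0 on success,
 *         -1 if the target namespace could not be entered (fn is not called),
 *         -2 if fn ran but the original namespace could not be restored.
 */
int netns_run_in(const char *nspath, void (*fn)(void *), void *arg)
{
    /* Our own namespace: opened once and kept for the life of the process. */
    static int home_fd = -1;
    int target_fd;
    int rc;

    if (home_fd < 0) {
        home_fd = open("/proc/self/ns/net", O_RDONLY | O_CLOEXEC);
        if (home_fd < 0)
            return -1;
    }

    /* The target is re-opened on every call: a restarted container gets a
     * new namespace, so a cached descriptor could point at a stale one. */
    target_fd = open(nspath, O_RDONLY | O_CLOEXEC);
    if (target_fd < 0)
        return -1;

    rc = setns(target_fd, CLONE_NEWNET);
    close(target_fd);
    if (rc != 0)
        return -1;

    fn(arg);

    if (setns(home_fd, CLONE_NEWNET) != 0)
        return -2;

    return 0;
}
```

A caller would probably want to treat -2 as fatal, since every later interface query would otherwise silently run in the wrong namespace. setns() only affects the calling thread, which is fine for a single-threaded program.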

Enjoy!

vergoh commented 3 years ago

Thanks for the example. You may also have noticed that vnStat doesn't use threads in its implementation so it would either have to constantly jump between network namespaces or threading be implemented. As for the maximum length of a namespace, I suspect at least some part of the implementation must be in the kernel somewhere and that's likely to be where the maximum length is also defined.

dkerr64 commented 3 years ago

Forgive me if I use this issue just to document continuing thoughts on this topic. Hopefully it will be helpful.

On further research it looks like we cannot assume that network namespaces are always found in /var/run/netns. It appears, for example, that Docker uses /var/run/docker/netns. So I think the --netns parameter should accept a full pathname, for example --netns /var/run/docker/netns/example. If a full path is not provided, a default directory would be used. I would suggest /var/run/netns as the default, as that is what the system ip command uses and documents.
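A rough sketch of that lookup rule, purely as illustration: the helper name netns_resolve_path and the NETNS_DEFAULT_DIR macro are hypothetical, and the rule "a leading / means full path" is my assumption about how the argument would be interpreted.

```c
#include <stdio.h>
#include <string.h>

/* Directory the ip command uses for named network namespaces. */
#define NETNS_DEFAULT_DIR "/var/run/netns"

/*
 * Hypothetical helper (not vnStat code): expand a --netns argument into the
 * path that should be opened. An argument starting with '/' is used as is,
 * anything else is looked up under NETNS_DEFAULT_DIR.
 *
 * Returns 0 on success, -1 if the argument is empty or the result does not
 * fit into out.
 */
int netns_resolve_path(const char *arg, char *out, size_t outlen)
{
    int n;

    if (arg == NULL || arg[0] == '\0' || out == NULL || outlen == 0)
        return -1;

    if (arg[0] == '/')
        n = snprintf(out, outlen, "%s", arg);
    else
        n = snprintf(out, outlen, "%s/%s", NETNS_DEFAULT_DIR, arg);

    /* snprintf reports the length it wanted to write; reject truncation. */
    if (n < 0 || (size_t)n >= outlen)
        return -1;

    return 0;
}
```

With this, --netns example would resolve to /var/run/netns/example, while --netns /var/run/docker/netns/example would be passed through unchanged.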

dkerr64 commented 3 years ago

... vnStat doesn't use threads in its implementation so it would either have to constantly jump between network namespaces or threading be implemented.

We don't need to introduce threads; we can just switch between namespaces as required. There should be minimal overhead (unless using really old kernels).
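One detail worth keeping in mind when switching: as far as I understand, /proc/net/dev follows the network namespace of the thread reading it, whereas /sys/class/net reflects the namespace that sysfs was mounted from (which is why ip netns exec remounts /sys). So after a setns() the counters would have to come from /proc/net/dev. Below is a minimal sketch of pulling the byte counters for one interface out of that format. read_dev_counters is a hypothetical name and this is not vnStat's own parser.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not vnStat code): scan a stream in /proc/net/dev
 * format for ifname and store its received and transmitted byte counters.
 *
 * Returns 0 if the interface was found and parsed, -1 otherwise.
 */
int read_dev_counters(FILE *fp, const char *ifname,
                      unsigned long long *rx, unsigned long long *tx)
{
    char line[512];

    while (fgets(line, sizeof line, fp) != NULL) {
        char *name = line;
        char *colon = strchr(line, ':');

        /* The two header lines contain no colon; skip them. */
        if (colon == NULL)
            continue;

        /* Interface names are right-aligned, so strip leading blanks. */
        while (*name == ' ' || *name == '\t')
            name++;
        *colon = '\0';

        if (strcmp(name, ifname) != 0)
            continue;

        /* Receive side has 8 columns: take bytes, skip the other 7,
         * then the first transmit column is bytes again. */
        if (sscanf(colon + 1, "%llu %*s %*s %*s %*s %*s %*s %*s %llu",
                   rx, tx) == 2)
            return 0;
        return -1;
    }

    return -1;
}
```

In practice the stream would be fopen("/proc/net/dev", "r") opened after the process has switched into the target namespace, and closed before switching back.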

dkerr64 commented 3 years ago

I think the biggest design decision is how to store namespace information internally to vnstat and in the database. I can think of two options... create a new field, or extend the name of an interface to include a path. So, option 1...

netns = '/var/run/netns/example'
interface = 'eth0'

Or, option 2...

interface = '/var/run/netns/example/eth0'

In both cases, if no network namespace is specified then vnstat would operate as it does today... namely netns = undefined or null and interface = 'eth0'.

Whatever is chosen has implications on what is returned in XML or JSON from the database. And any design should be backward compatible with existing programs that pull data from the database. Although we could assume that all existing users of vnstat database are not namespace "aware".

I personally prefer option 2. Thus any program that iterates over interfaces in the JSON or XML is guaranteed a unique field... if option 1 is picked then there could be multiple eth0 entries in the array, which I think is a problem. If an application is not namespace aware, and a user adds namespace-specific interfaces to the vnstat database, I think it is better for the application to fail on an interface field that holds a namespace path than to potentially process two eth0 objects unaware that they are different. And there is an API version field that can be incremented to warn programmers that a change has taken place.
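For what it's worth, option 2 stays easy to take apart again, because Linux does not allow a / inside an interface name: everything after the last / is the interface and everything before it is the namespace path. A minimal sketch, with split_qualified_ifname being a hypothetical helper rather than anything that exists in vnStat:

```c
#include <string.h>

/*
 * Hypothetical helper (not vnStat code): split an option 2 style name such
 * as "/var/run/netns/example/eth0" into namespace path and interface name.
 * A name without any '/' is a plain host interface and yields an empty
 * namespace string.
 *
 * Returns 0 on success, -1 on bad input or if a part does not fit.
 */
int split_qualified_ifname(const char *qualified,
                           char *netns, size_t netnslen,
                           char *iface, size_t ifacelen)
{
    const char *slash;
    const char *name;
    size_t nslen;

    if (qualified == NULL || netns == NULL || iface == NULL ||
        netnslen == 0 || ifacelen == 0)
        return -1;

    slash = strrchr(qualified, '/');
    if (slash == NULL) {
        nslen = 0;          /* no namespace part: host interface */
        name = qualified;
    } else {
        nslen = (size_t)(slash - qualified);
        name = slash + 1;
    }

    /* Reject an empty interface name and anything that would overflow. */
    if (name[0] == '\0' || nslen >= netnslen || strlen(name) >= ifacelen)
        return -1;

    memcpy(netns, qualified, nslen);
    netns[nslen] = '\0';
    memcpy(iface, name, strlen(name) + 1);

    return 0;
}
```

One consequence of option 2 is that fixed-size buffers such as the char interface[32] in PARAMS would be too small to hold a full namespace path plus interface name, so those sizes would need revisiting.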

Just my 2 cents.

vergoh commented 3 years ago

Looks like one rather easy way of creating network namespaces without much risk of affecting the existing networking is to have docker-compose create a shared network between the configured containers. That brings content visible under /var/run/docker/netns. However, another possible issue is that this directory (or more exactly /var/run/docker) requires root permissions to access, which is something the vnStat daemon in most cases doesn't have.

vergoh / vnstat

Support network namespaces #191