traviscross / mtr

Official repository for mtr, a network diagnostic tool
http://www.bitwizard.nl/mtr/
GNU General Public License v2.0
2.6k stars 334 forks source link

Adjust capability handling #487

Closed flu0r1ne closed 9 months ago

flu0r1ne commented 9 months ago

This PR addresses issues related to capability handling in mtr-packet, as detailed in issue #485. It completely resolves Bug 1 and partially resolves Bug 2 by ensuring that CAP_NET_RAW and CAP_NET_ADMIN are retained in the permitted set. These capabilities are elevated to the effective set when necessary. This eliminates "permission denied" errors encountered when specifying a local interface using the --interface flag or setting a packet mark with the -M option. To the best of my knowledge, these issues are specific to Linux.

The commits have been tested on Linux version 6.1.51.

flu0r1ne commented 9 months ago

I am still working on cross-platform testing this PR. Please wait until I test it further before merging . Code review would be appreciated.

rewolff commented 9 months ago

As for "code review"... Just done that: This is the RIGHT way to do this kind of stuff: Just place a call to "just do it" in the code, and hide the ifdefs in the implementation of that call. There is way too many ifdefs in the main program flow. (e.g. for ipv6. I would prefer it if the main code just reads:

if (IPV6_ENABLED) create_packet_ipv6 ();
else                          create_packet_ipv4 ();

and then the IPV6_ENABLED is always 0 if ipv6 is not available, and returns config.ipv6_enabled if it IS available.

rewolff commented 9 months ago

About the cross platform testing: I created a virtualbox machine with freebsd on my workstation. Doing the same should give you the capability to test on freebsd....

flu0r1ne commented 9 months ago

I manually tested the changes on both FreeBSD 13.2 and Linux 6.1, and also ran them against the automated Linux tests in PR #488.

First, I tested the code on Linux 6.1. I specifically bound mtr to a device through which traffic would not usually be routed. For example, the default gateway would normally send packets to a different device. The test traced the expected network path.

./bootstrap.sh
./configure
make
MTR_PACKET="./mtr-packet" sudo -E ./mtr -t -I wlp5s0 1.1.1.1

Next, I configured the code on FreeBSD 13.2. To the best of my knowledge, SO_BINDTODEVICE, SO_MARK, and libcap are only supported on Linux. Therefore, FreeBSD can effectively serve as a stand-in for other non-Linux hosts, since these execution paths will be omitted.

./bootstrap.sh
./configure --without-gtk --without-jansson
make
MTR_PACKET="./mtr-packet" sudo -E ./mtr 1.1.1.1
MTR_PACKET="./mtr-packet" sudo -E ./mtr -I vtnet0 1.1.1.1
MTR_PACKET="./mtr-packet" sudo -E ./mtr -I lo1 127.0.0.1

The test again traced the expected path.

Finally, I validated the changes against the automated Linux tests from PR #488. The test_interface_binding test now succeeds.

git switch -c adjust-capability-handling-test
git merge adjust-capability-handling
git merge linux-tests
make clean
make
sudo python ./test/linux/netemtests.py

This confirms that the modifications are working as expected across different platforms and scenarios.

flu0r1ne commented 9 months ago

Regarding the use of ifdefs, I concur that they can make the code more difficult to read and reason about. However, I believe most of them are necessary.

For HAVE_LIBCAP, which is undefined if Autoconf doesn't detect it, I can make it toggleable by incorporating an if statement within the ifdef, as follows:

#ifdef HAVE_LIBCAP
    if (HAVE_LIBCAP) {
        drop_excess_capabilities();
    }
#endif

Alternatively, to make it even more concise:

#if (HAVE_LIBCAP == 1)
    drop_excess_capabilities();
#endif

Since drop_excess_capabilities will only be used when HAVE_LIBCAP is defined, it seems reasonable to continue conditionally compiling it out when HAVE_LIBCAP is not present to avoid warnings.

SO_BINDTODEVICE and SO_MARK are not managed by Autoconf, so I don't believe your comment applies to these ifdefs.

As for improving clarity, I can introduce an enum to manage types instead of using ifdefs in construct_unix.c. The enum would define a common interface that elevates privileges on platforms with LIBCAP and acts as a no-op on platforms without it. I would then map these enum values to the actual capabilities in cases where the system supports them. Here is the current code:

/*
    This defines a common interface which elevates privileges on
    platforms with LIBCAP and acts as a NOOP on platforms without
    it.
*/
#ifdef HAVE_LIBCAP

typedef cap_value_t mayadd_cap_value_t;
#define MAYADD_CAP_NET_RAW CAP_NET_RAW
#define MAYADD_CAP_NET_ADMIN CAP_NET_ADMIN

#else /* ifdef HAVE_LIBCAP */

typedef int mayadd_cap_value_t;
#define MAYADD_CAP_NET_RAW ((mayadd_cap_value_t) 0)
#define MAYADD_CAP_NET_ADMIN ((mayadd_cap_value_t) 0)

#endif /* ifdef HAVE_LIBCAP */

I hope this addresses your concerns and I would appreciate any further feedback. I plan to make one additional change related to error handling. After receiving your follow-up comments, I'll make any necessary revisions, rerun the manual tests, and then the code should be ready for merging.

rewolff commented 9 months ago

I was not criticizing you for your use of ifdefs. You were doing it mostly right in the parts that I scanned. But your example can be used as an example:

I would like to have the main code read:

  drop_excess_capabilities();  // this evaluates to a noop if we don't have LIBCAP. 

Then in the implementation of "drop_excess_capabilities" there would be the ifdef HAVE_LIBCAP. Or where the

20 years ago, that would have meant a call to an empty function. 20 years ago I would've said: if it makes the code more readable, it is worth the extra/empty function call. Nowadays the compiler will remove the call during the linking stage.

Don't optimize for run-time, optimize for readabilty. Early nineties I worked on a project optimized for runtime: We knew in advance timing was going to be tricky. Turns out the optimized code (written before my involvement in the project) "didn't work". I rewrote for readability: All code got about 3x slower. Then I optimized just the ONE function that was taking most of the time (80% of available time after optimization). And then it worked. And within the real-time constraints that we had. The readability aspect of the code is WAY more important than the runtime.