traviscross / mtr

Official repository for mtr, a network diagnostic tool
http://www.bitwizard.nl/mtr/
GNU General Public License v2.0
2.6k stars 334 forks source link

Proposal: Refactor Address Selection Logic for Enhanced Compatibility and Testability #491

Open flu0r1ne opened 9 months ago

flu0r1ne commented 9 months ago

Recently, I examined this project to address two bugs related to Linux-specific features, namely packet marking and routing-policy-compliant address selection (see issue #485). While I've implemented two fixes — one for capability handling and another for packet marking in mtr address selection on Linux. The latter is not yet comprehensive. There remain scenarios where the address selection logic may fail and that's what I am addressing here. Although I can't allocate the time to fully implement this right now, I'd like to document my findings to give other contributors a head start.

Modern Linux systems offer robust routing capabilities, capable of directing traffic based on various properties such as packet mark, protocol, source port, destination port, uid (usable with user namespaces), TOS (Type of Service), and input interface. This complexity can disrupt address selection logic in older applications that rely on IP, UDP, or TCP sockets to identify the source IP address. To maintain compatibility with modern Linux routing, mtr needs to accurately emulate the exact request made to the kernel to route the packet.

For instance, mtr currently fails to properly route according to protocol rules:

sudo ip rule add from all ipproto icmp table 99
mtr -t 1.1.1.1 # Fails to consult table 99

I identify three potential strategies for correct implementation:

  1. Socket Probing: The existing strategy in mtr involves setting up a socket to probe the current IP address. While this may be effective on other operating systems like NetBSD, FreeBSD, and Solaris, the Linux implementation would need to mirror the kernel's sending options as closely as possible. This could include setting the appropriate protocol options (IPPROTO_ICMP, IPPROTO_UDP, IPPROTO_TCP), as well as configuring SO_MARK, IP_TOS, and SO_BINDTODEVICE. The source and destination ports should also be set correctly for stream protocols.

  2. AF_NETLINK Interface: Utilizing the RTM_GETROUTE interface within AF_NETLINK could be an effective method for identifying the source address on Linux. This API is specifically designed for fetching the IP address of an interface, and it allows explicit specification of attributes like the input interface, packet mark, TOS value, and others.

  3. eBPF: Another possibility involves sending the packet without explicitly specifying the address (unless provided by the user via command line). The actual address used could then be identified through eBPF (Extended Berkeley Packet Filter).

Among these options, I recommend investigating strategies 2 and 3. While option 1 may be familiar to systems engineers due to its use of well-known APIs, option 2 offers more transparent control flow. Option 3 could simplify implementation and maintenance by requiring only one set of routing logic. Moreover, if the routing logic were ever to go awry, the reported IP address would still be accurate, as it would be sourced directly from the kernel.

Currently, the packet routing logic resides in two places within the codebase. The net_reopen function in net.c is responsible for identifying the source address. When neither an address nor an interface is specified, this function calls net_find_local_address for the to identify the source address. A separate address resolution logic exists in mtr-packet's resolve_probe_addresses within probe.c. When the address isn't explicitly defined in mtr, the task of address resolution is handed over to mtr-packet. However, to the best of my knowledge, mtr-packet never actually resolves the source address, making this code seemingly redundant.

https://github.com/traviscross/mtr/tree/62bd718e83697dcb9b346b13478da37e52b81d10/ui/net.c#L685-L734

https://github.com/traviscross/mtr/tree/62bd718e83697dcb9b346b13478da37e52b81d10/packet/probe.c#L314-L376

For optimal implementation, I propose relocating the address selection logic entirely to mtr-packet for several reasons:

  1. It would accommodate scenarios where the route is modified while mtr is running, as the address would be resolved with each probe sent.
  2. The route selection could be unit-tested more conveniently, given that the mtr-packet command interface permits straightforward testing.
  3. It would eliminate the need for dual maintenance of address resolution logic, making mtr-packet a robust, stand-alone utility.

As the source address in mtr is both displayed in the UI and provided to mtr-packet for probe routing, transferring the routing logic to mtr-packet necessitates a mechanism for feeding the IP address information back to mtr. One viable approach is to extend the send-probe feature in mtr-packet with an option to report the IP address.

Consider introducing a report-local-ip argument to mtr-packet. When set to true, this would append the local IP address used for routing the packet to all send-probe responses. This change would be backward-compatible and could be parsed by mtr to display the address in the UI.

0 send-probe ip-4 1.1.1.1 report-local-ip true
0 reply ip-4 1.1.1.1 round-trip-time 43817 local-ip-4 10.68.178.11
1 send-probe ip-6 2606:4700:4700::1111 report-local-ip true
1 reply ip-6 2606:4700:4700::1111 round-trip-time 26304 local-ip-6 fc00:bbbb:bbbb:bb01::4:a736
2 send-probe ip-4 1.1.1.1 report-local-ip true
2 no-reply local-ip-4 10.68.178.11
3 send-probe ip-4 1.1.1.1 report-local-ip true
3 ttl-expired 1.1.1.1 local-ip-4 10.68.178.11
4 send-probe ip-4 1.1.1.1 report-local-ip false
4 reply ip-4 1.1.1.1 round-trip-time 43817

One downside to this approach is that mtr would need to await the response before displaying the source IP. However, this trade-off seems reasonable in the broader context.

Irrespective of the chosen implementation strategy, extensive testing of this routing feature is imperative. The netem library I authored should serve as a solid foundation for programmatic testing, although it may require minor adjustments to accommodate more complex rules.

rewolff commented 9 months ago

Wouldn't it be better to just have mtr-packet report the local IP whenever it changes? Both mtr and mtr-packet start out by setting the "last-local-ip" to -1 or 255.255.255.255 and mtr-packet sends a local-ip-changed message when it changes (either in response to a probe request or without a reference number) and mtr displays the -1 as "---" and changes that to the reported IP when it gets an update.

rewolff commented 9 months ago

For routing: I think mtr should actually USE the system routing tables as much as possible. We should try to avoid writing code to mimick the kernel's behaviour as much as possible.

Suppose the user has a routing problem that makes it look as if packets should go out on eth0 (which would work) but in fact they go out on eth1. User starts mtr to see what's going on and the "it looks as if" also fools mtr. So now mtr sends out probes on eth0 as well, hiding the actual problem.

Sooooo.... when I hear you talk about emulating the routing.... I really don't like that. It should be avoided as much as possible.

flu0r1ne commented 9 months ago

Regarding the initial comment, I appreciate the idea of utilizing asynchronous "ip-changed" notifications. A potential implementation could resemble the following:

config report-local-ip true
0 send-probe ip-4 1.1.1.1 report-local-ip true
0 local-ip-4 10.68.178.11
0 reply ip-4 1.1.1.1 round-trip-time 43817

The advantages of this approach are twofold:

  1. mtr-packet can report the local IP without waiting for a complete Round-Trip Time (RTT).
  2. There's no need to append a local-ip field to every command response.

I initially refrained from suggesting this due to its departure from the one-to-one correspondence between messages and responses. Introducing last-local-ip seems to add complexity, especially given that mtr-packet makes no assumptions regarding the sequence of responses or replies. As the chosen local IP is intrinsically tied to the sent packet, explicitly associating each packet with a local IP appears to be the most straightforward policy.

flu0r1ne commented 9 months ago

Regarding the second comment, I completely agree. When I refer to "emulating the kernel's routing behavior," I mean "selecting the appropriate routing table in accordance with the kernel's route policy." One complicating factor is that properties of network traffic can influence which routing table gets selected. For instance, as demonstrated in the previous example, the protocol in use can prompt the kernel to opt for an alternative routing table.

Currently, when opening a UDP socket with a destination address, you are essentially announcing, "mtr-packet will send UDP traffic without a TOS value, without binding to an interface, and without a packet mark. What is the correct source IP to use in this situation?" Subsequently, these parameters may change during data transmission, leading the kernel to choose a different routing table. As a result, the IP address may be spoofed, often causing the peer server to remain unresponsive. In essence, the correct route within the routing table is selected, but this doesn't necessarily guarantee that the correct routing table itself will be chosen. This remains a core issue requiring further examination.

As for the recent modifications to the kernel routing policy, they are backward-compatible. If a user does not engage in policy-routing, the standard tables will be consulted. However, if the user does enable policy routing, they would reasonably expect their traffic to be routed according to that specific policy — something commonly leveraged in scenarios involving security or network virtualization.

rewolff commented 8 months ago

Our "wishlist" would be to submit the ICMP, UDP or TCP packet on its way out and then modify the TTL. That would be the ideal scenario. We've been doing it differently for ages, so I'm guessing that isn't possible.

From a software-engineering point-of-view: I really dislike trying to emulate the kernel in a bunch of situations. If the kernel gets "complicated" then the current machine is NOT the place to run mtr on.

If your local machine happens to be a "complicated endpoint that happens to have several interfaces" for different situations (as opposed to a router), then debugging that becomes difficult. But debugging that becomes impossible if we TRY to emulate the kernel and fail. So in this case I would prefer to fail early as opposed to getting a long way and failing later. And with "fail early" I mean: Just do the simple stuff that everybody understands, and not pretend to do the complicated stuff that we'll get wrong anyway.

Summary: I'll refuse patches that start reading kernel firewall and TOS rules.