traviscross / mtr

Official repository for mtr, a network diagnostic tool
http://www.bitwizard.nl/mtr/
GNU General Public License v2.0

--address does not work #232

Closed totoCZ closed 1 week ago

totoCZ commented 6 years ago

Hi,

see http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=882331

[root@10g-frb-kvm-1-195-181-170-12.cdn77.com /root]# diff old new | grep 213.198
< bind(3, {sa_family=AF_INET, sin_port=htons(0), sin_addr=inet_addr("213.198.94.174")}, 16) = 0
< getsockname(3, {sa_family=AF_INET, sin_port=htons(255), sin_addr=inet_addr("213.198.94.174")}, [16]) = 0
< recvfrom(6, "E\0\0`x^\0\0\377\1\332Y\325\306^\251\325\306^\256\v\0\364\356\0\21\0\0E\0\0@"..., 4470, 0, {sa_family=AF_INET, sin_port=htons(0), sin_addr=inet_addr("213.198.94.169")}, [16]) = 96
< write(1, "1. 213.198.94.169\33[6;240H", 25) = 25

The call to bind() is missing in 0.92.

old http://termbin.com/5dhs new http://termbin.com/kaao

Reproduced with latest git version.

totoCZ commented 6 years ago

Interestingly in TCP mode it works. The old one doesn't have any problems with either option.

rewolff commented 6 years ago

I have added an extra IP address to my workstation. I then ran mtr and the source address was correctly the default IP address and when I then added --address with the second address the source address was correctly changed. See also Robert's comment.

09:12:03.565781 IP 192.168.xxx.47 > 8.8.8.8: ICMP echo request, id 21268, seq 61824, length 44
09:12:03.677038 IP 192.168.xxx.47 > 8.8.8.8: ICMP echo request, id 21268, seq 62080, length 44
09:12:03.788359 IP 192.168.xxx.47 > 8.8.8.8: ICMP echo request, id 21268, seq 62336, length 44
09:12:03.899689 IP 192.168.xxx.47 > 8.8.8.8: ICMP echo request, id 21268, seq 62592, length 44
09:12:04.011114 IP 192.168.xxx.47 > 8.8.8.8: ICMP echo request, id 21268, seq 62848, length 44
09:12:04.122422 IP 192.168.xxx.47 > 8.8.8.8: ICMP echo request, id 21268, seq 63104, length 44
09:12:04.233754 IP 192.168.xxx.47 > 8.8.8.8: ICMP echo request, id 21268, seq 63360, length 44
...
09:12:47.822254 IP 192.168.xxx.234 > 8.8.8.8: ICMP echo request, id 24340, seq 59520, length 44
09:12:47.922464 IP 192.168.xxx.234 > 8.8.8.8: ICMP echo request, id 24340, seq 59776, length 44
09:12:48.022703 IP 192.168.xxx.234 > 8.8.8.8: ICMP echo request, id 24340, seq 60032, length 44
09:12:48.122871 IP 192.168.xxx.234 > 8.8.8.8: ICMP echo request, id 24340, seq 60288, length 44
09:12:48.223067 IP 192.168.xxx.234 > 8.8.8.8: ICMP echo request, id 24340, seq 60544, length 44
09:12:48.323256 IP 192.168.xxx.234 > 8.8.8.8: ICMP echo request, id 24340, seq 60800, length 44
09:12:48.423445 IP 192.168.xxx.234 > 8.8.8.8: ICMP echo request, id 24340, seq 61056, length 44

totoCZ commented 6 years ago

[root@10g-frb-kvm-1-195-181-170-12.cdn77.com /root]# tcpdump icmp -n
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
08:25:44.126987 IP 24.91.114.1 > 5.56.18.202: ICMP echo reply, id 61827, seq 64446, length 64
08:25:44.138795 IP 104.186.149.49 > 5.56.18.202: ICMP echo reply, id 61827, seq 64446, length 64
08:25:44.139696 IP 104.1.152.38 > 5.56.18.202: ICMP echo reply, id 61827, seq 64446, length 64
08:25:44.155285 IP 73.0.0.1 > 5.56.18.202: ICMP echo reply, id 61827, seq 64446, length 64
08:25:45.580667 IP 185.152.64.149 > 5.56.18.202: ICMP echo request, id 62948, seq 61619, length 24
08:25:45.580711 IP 5.56.18.202 > 185.152.64.149: ICMP echo reply, id 62948, seq 61619, length 24
08:25:46.000543 IP 5.56.18.202 > 188.62.39.158: ICMP echo request, id 21856, seq 31869, length 64
08:25:46.000631 IP 5.56.18.202 > 194.209.91.200: ICMP echo request, id 21856, seq 31869, length 64
08:25:46.000663 IP 5.56.18.202 > 90.182.205.65: ICMP echo request, id 21856, seq 31869, length 64
08:25:46.000691 IP 5.56.18.202 > 178.196.15.119: ICMP echo request, id 21856, seq 31869, length 64
08:25:46.000711 IP 5.56.18.202 > 138.223.69.2: ICMP echo request, id 21856, seq 31869, length 64
08:25:46.000726 IP 5.56.18.202 > 185.78.129.1: ICMP echo request, id 21856, seq 31869, length 64
08:25:46.000743 IP 5.56.18.202 > 90.176.79.44: ICMP echo request, id 21856, seq 31869, length 64

The above is captured while mtr --report -a 213.198.94.174 ic.cz -c 1 is running.

[root@10g-frb-kvm-1-195-181-170-12.cdn77.com /root]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 00:30:48:de:f5:14 brd ff:ff:ff:ff:ff:ff
    inet 5.56.18.202/30 brd 5.56.18.203 scope global eth0
       valid_lft forever preferred_lft forever
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master br0 state UP group default qlen 1000
    link/ether 0c:c4:7a:bd:1d:50 brd ff:ff:ff:ff:ff:ff
4: eth2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master br2 state UP group default qlen 1000
    link/ether 00:30:48:de:f5:15 brd ff:ff:ff:ff:ff:ff
5: br0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
    link/ether 0c:c4:7a:bd:1d:50 brd ff:ff:ff:ff:ff:ff
    inet 195.181.170.12/28 brd 195.181.170.15 scope global br0
       valid_lft forever preferred_lft forever
6: br0.27@br0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
    link/ether 0c:c4:7a:bd:1d:50 brd ff:ff:ff:ff:ff:ff
    inet 195.16.162.202/30 brd 195.16.162.203 scope global br0.27
       valid_lft forever preferred_lft forever
7: br0.20@br0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
    link/ether 0c:c4:7a:bd:1d:50 brd ff:ff:ff:ff:ff:ff
    inet 37.77.42.7/31 brd 255.255.255.255 scope global br0.20
       valid_lft forever preferred_lft forever
8: br0.30@br0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
    link/ether 0c:c4:7a:bd:1d:50 brd ff:ff:ff:ff:ff:ff
    inet 213.198.94.174/29 brd 213.198.94.175 scope global br0.30
       valid_lft forever preferred_lft forever
9: eth2.303@eth2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master br1 state UP group default
    link/ether 00:30:48:de:f5:15 brd ff:ff:ff:ff:ff:ff
10: br1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
    link/ether 00:30:48:de:f5:15 brd ff:ff:ff:ff:ff:ff
    inet 192.168.53.1/24 brd 192.168.53.255 scope global br1
       valid_lft forever preferred_lft forever
11: br2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
    link/ether 00:30:48:de:f5:15 brd ff:ff:ff:ff:ff:ff
12: vnet0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master br0 state UNKNOWN group default qlen 500
    link/ether fe:54:00:9c:84:2a brd ff:ff:ff:ff:ff:ff
13: vnet1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master br2 state UNKNOWN group default qlen 500
    link/ether fe:54:00:9c:84:32 brd ff:ff:ff:ff:ff:ff
14: vnet2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master br0 state UNKNOWN group default qlen 500
    link/ether fe:54:00:eb:a5:1d brd ff:ff:ff:ff:ff:ff
15: vnet3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master br1 state UNKNOWN group default qlen 500
    link/ether fe:54:00:f6:8e:4d brd ff:ff:ff:ff:ff:ff
[root@10g-frb-kvm-1-195-181-170-12.cdn77.com /root]#

rewolff commented 6 years ago

You have a very complex system.

MTR submits the packet to the kernel with: "Could you please send this onto the internet with source address 213.198.94.174". Then, as Robert explained, the system takes over and routes it out the interface that it considers best. But things like address translation also take place. Because the 213 IP address is associated with the bridge (why would that have an IP address?!), I think it's likely that you also have network address translation going on to prevent those IP addresses from going out on the internet (eth0 interface).

totoCZ commented 6 years ago

So I guess there are no plans to make it behave like 0.87, where it just works? I guess we'll have to stick with that one, since this setup exists on many servers. Or force TCP explicitly for everything :-/

totoCZ commented 6 years ago

Actually, do you think it'd work with --interface, or is it the same logic? I didn't try that yet as it's not in the testing repo in Debian. That could fix it for us if it forces the interface correctly.

// tested: does not work


totoCZ commented 6 years ago

btw, the reason this setup is like that is KVM: br0 bridges eth1, vnet0 and vnet2 (the main public interface + our VMs). On top of that interface we run tagged VLANs, which need to have IP addresses set (each for a different upstream provider, for network path debugging). There is no NAT used. It's probably true we could use e.g. eth1.30, but the bridge setup seemed more straightforward at the time. Nothing in the Debian docs says you can't have IPs on a bridge (https://wiki.debian.org/BridgeNetworkConnections#Configuring_bridging_in_.2Fetc.2Fnetwork.2Finterfaces).

rewolff commented 6 years ago

Well.... the thing is: on first examination, it works here; there seems to be something odd with your setup.

So you're running the 213.xxx IP address in a VM on 5.xxx machine, and the 213.xxx IP address is indeed routed to your 5.xxx machine? Correct?

Sometimes things are changed and they look good from the point-of-view of the intention of the patch, but in the end, there are unintended consequences.

Is there a simpler setup that I can reproduce to recreate your issue?

IF I can recreate a setup that triggers the behaviour, then I can run git bisect to find the commit that started all this. That is very helpful in finding what caused this. Or you can read up on git bisect yourself and tell me what commit it is. Once you get going git bisect is really fast.

totoCZ commented 6 years ago

Nope, the VMs are more or less unrelated here, they just use the bridge on this multi-purpose machine. It's a little confusing, I know.

[root@10g-frb-kvm-1-195-181-170-12.cdn77.com /root]# ip r
default via 195.181.170.14 dev br0 # default = best path
5.56.18.200/30 dev eth0 proto kernel scope link src 5.56.18.202 # upstream
37.77.42.6/31 dev br0.20 proto kernel scope link src 37.77.42.7 # upstream
192.168.53.0/24 dev br1 proto kernel scope link src 192.168.53.1 # internal lan (unrelated)
195.16.162.200/30 dev br0.27 proto kernel scope link src 195.16.162.202 # upstream
195.181.170.0/28 dev br0 proto kernel scope link src 195.181.170.12 # br0 on eth1 (default)
213.198.94.168/29 dev br0.30 proto kernel scope link src 213.198.94.174 # upstream

99% of things (and the VMs) will just use br0 most of the time. Here we're interested in selecting a particular upstream ISP to traceroute from instead of the default route which is on our own network and always takes the best path. Hope that clears it up a little.

5d26cb0c0500b85f71a43194f090bb97064f71cf is the first bad commit

rewolff commented 6 years ago

So..... I still need you to point me to the exact commit that caused the difference in behaviour so that hopefully I can say: oh, yes, that was unintended. OR I need a way to reproduce it. So what in your setup causes this behaviour?

totoCZ commented 6 years ago

i did link https://github.com/traviscross/mtr/commit/5d26cb0c0500b85f71a43194f090bb97064f71cf :-)

rewolff commented 6 years ago

Ooops. Missed that. Sorry. This is a big patch, in principle the networking code should have just been moved from one place to another, but if the behaviour has changed,.... apparently not.

Have you succeeded in reproducing it with a less complex case than your complex server?

totoCZ commented 6 years ago

I've played a little bit with the code and it's very likely this line https://github.com/traviscross/mtr/commit/5d26cb0c0500b85f71a43194f090bb97064f71cf#diff-7e6cfcfb56933e6f4317cd89daec3f9eL780

https://sourcecodebrowser.com/mtr/0.80/net_8c.html#aebc241caf5aa720751c630bc422b5cee
if ( bind( sendsock, sourcesockaddr, len ) == -1 ) {

If there's no explicit bind() anywhere, it defaults to br0, which would explain the behavior (and the missing bind() in the newer strace).

just having 2 interfaces, each one with a different IP should be enough to replicate it.

adampav commented 3 years ago

Hi,

First of all, @rewolff, thank you for your efforts. I am having the same issue on a multihomed host with more than 2 interfaces on two different networks, i.e. despite setting the source address to a value from an interface in network 2, the ICMP probes are sent out from the interface that belongs to network 1. In contrast, TCP probes behave normally. I also tried mtr version 0.94 built from source with the --interface option. Still the same.

@TomHetmer IIUC your solution addresses this problem => having ICMP exiting the appropriate interface based on source address?

Thanks! A.

totoCZ commented 3 years ago

Yes. In short, the IPv4 code uses the IPPROTO_RAW socket type and no bind() (that's not supported), while IPv6 uses the IPPROTO_ICMP type with bind(). My commit basically copy-pasted the v6 socket code to v4. So it should work if you compile from my branch or just use the old 0.87.
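
For readers following along, here is a minimal sketch of the bind()-before-send pattern being discussed. It is not mtr's code: the helper names are hypothetical, and it uses an unprivileged SOCK_DGRAM socket so it can run without root, whereas mtr's ICMP probes use other socket kinds. It only shows that an explicit bind() fixes the local address the kernel reports for the socket, which is the call seen in the old strace.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper (not mtr's code): open an IPv4 datagram socket and
 * bind it to the given source address, port 0, before anything is sent.
 * Returns the socket descriptor, or -1 on failure.
 */
int open_bound_socket(const char *source_ip)
{
    struct sockaddr_in src;
    memset(&src, 0, sizeof(src));
    src.sin_family = AF_INET;
    src.sin_port = htons(0);            /* let the kernel pick the port */
    if (inet_pton(AF_INET, source_ip, &src.sin_addr) != 1)
        return -1;                      /* not a valid IPv4 address string */

    int sock = socket(AF_INET, SOCK_DGRAM, 0);
    if (sock == -1)
        return -1;

    /* The explicit bind() pins the local (source) address of the socket. */
    if (bind(sock, (struct sockaddr *)&src, sizeof(src)) == -1) {
        close(sock);
        return -1;
    }
    return sock;
}

/* Write the address the socket is bound to into `out`; returns 0 on success. */
int bound_address(int sock, char *out, socklen_t out_len)
{
    struct sockaddr_in local;
    socklen_t len = sizeof(local);
    if (getsockname(sock, (struct sockaddr *)&local, &len) == -1)
        return -1;
    return inet_ntop(AF_INET, &local.sin_addr, out, out_len) ? 0 : -1;
}
```

Calling open_bound_socket("213.198.94.174") on a host that owns that address, followed by bound_address(), would correspond to the bind()/getsockname() pair in the strace diff at the top of this issue. Whether the packet then leaves through the matching interface still depends on the host's routing rules.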

adampav commented 3 years ago

Thank you very much. will give it a shot.

idallen commented 3 years ago

mtr 0.94 (also mtr 0.93 under Ubuntu 20.04 LTS): same problem for me. Both --interface and --address do not work unless I also specify --tcp. My machine has three network interfaces. I can't get mtr to use anything other than the default. A tcpdump shows that the ICMP packets are incorrectly going out my default interface using the source address of the --interface or --address that I gave on the command line. If I add --tcp, things work fine.
P.S. traceroute works fine using its --interface or --source options.
P.P.S. Tom Hetmer's version (of mtr-packet) fixes the problem.

rewolff commented 3 years ago

Tom, Didn't I accept your pull request?

alarig commented 3 years ago

Same issue here on 0.94, but the patch from @TomHetmer doesn’t seem to work anymore, I have compilation errors:

make[1]: Entering directory '/var/tmp/portage/net-analyzer/mtr-0.94-r1/work/mtr-0.94'
x86_64-pc-linux-gnu-gcc -DHAVE_CONFIG_H -I.        -O2 -pipe -march=native -mtune=native -Wall -Wno-pointer-sign -c -o ui/mtr-asn.o `test -f 'ui/asn.c' || echo './'`ui/asn.c
x86_64-pc-linux-gnu-gcc -DHAVE_CONFIG_H -I.        -O2 -pipe -march=native -mtune=native -Wall -Wno-pointer-sign -c -o ui/mtr-curses.o `test -f 'ui/curses.c' || echo './'`ui/curses.c
x86_64-pc-linux-gnu-gcc -DHAVE_CONFIG_H -I.     -O2 -pipe -march=native -mtune=native -Wall -Wno-pointer-sign -c -o packet/packet.o packet/packet.c
x86_64-pc-linux-gnu-gcc -DHAVE_CONFIG_H -I.     -O2 -pipe -march=native -mtune=native -Wall -Wno-pointer-sign -c -o packet/cmdparse.o packet/cmdparse.c
x86_64-pc-linux-gnu-gcc -DHAVE_CONFIG_H -I.     -O2 -pipe -march=native -mtune=native -Wall -Wno-pointer-sign -c -o packet/command.o packet/command.c
x86_64-pc-linux-gnu-gcc -DHAVE_CONFIG_H -I.     -O2 -pipe -march=native -mtune=native -Wall -Wno-pointer-sign -c -o packet/probe.o packet/probe.c
x86_64-pc-linux-gnu-gcc -DHAVE_CONFIG_H -I.     -O2 -pipe -march=native -mtune=native -Wall -Wno-pointer-sign -c -o packet/timeval.o packet/timeval.c
In file included from /usr/include/string.h:519,
                 from packet/probe.c:31:
In function ‘strncat’,
    inlined from ‘respond_to_probe’ at packet/probe.c:296:9:
/usr/include/bits/string_fortified.h:137:10: warning: ‘__builtin___strncat_chk’ output may be truncated copying between 0 and 4095 bytes from a string of length 4095 [-Wstringop-truncation]
  137 |   return __builtin___strncat_chk (__dest, __src, __len, __bos (__dest));
      |          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
x86_64-pc-linux-gnu-gcc -DHAVE_CONFIG_H -I.     -O2 -pipe -march=native -mtune=native -Wall -Wno-pointer-sign -c -o packet/sockaddr.o packet/sockaddr.c
x86_64-pc-linux-gnu-gcc -DHAVE_CONFIG_H -I.     -O2 -pipe -march=native -mtune=native -Wall -Wno-pointer-sign -c -o packet/construct_unix.o packet/construct_unix.c
x86_64-pc-linux-gnu-gcc -DHAVE_CONFIG_H -I.     -O2 -pipe -march=native -mtune=native -Wall -Wno-pointer-sign -c -o packet/deconstruct_unix.o packet/deconstruct_unix.c
packet/construct_unix.c: In function ‘construct_ip4_packet’:
packet/construct_unix.c:513:47: error: ‘const struct net_state_platform_t’ has no member named ‘icmp4_send_socket’; did you mean ‘icmp6_send_socket’?
  513 |             send_socket = net_state->platform.icmp4_send_socket;
      |                                               ^~~~~~~~~~~~~~~~~
      |                                               icmp6_send_socket
packet/construct_unix.c:524:47: error: ‘const struct net_state_platform_t’ has no member named ‘udp4_send_socket’; did you mean ‘udp6_send_socket’?
  524 |             send_socket = net_state->platform.udp4_send_socket;
      |                                               ^~~~~~~~~~~~~~~~
      |                                               udp6_send_socket
make[1]: *** [Makefile:919: packet/construct_unix.o] Error 1
make[1]: *** Waiting for unfinished jobs....
make[1]: Leaving directory '/var/tmp/portage/net-analyzer/mtr-0.94-r1/work/mtr-0.94'
make: *** [Makefile:672: all] Error 2

Edit: My bad, I included the commit 666ca78e3ea73a57698f9649348c6926c6e12d96 and it works

Edit2: It fixes the issue here:

edge01-terrahost ~ # ip r g 85.167.125.99
85.167.125.99 via 10.0.4.21 dev gre4 src 10.0.4.20 uid 0 
    cache expires 554sec mtu 1476 
edge01-terrahost ~ # ip r g 85.167.125.99 from 185.181.60.143
85.167.125.99 from 185.181.60.143 via 185.181.60.1 dev eno1 table 56655 uid 0 
    cache 
edge01-terrahost ~ # mtr -bzwe -a 185.181.60.143 85.167.125.99
Start: 2020-11-14T22:07:17+0100
HOST: edge01-terrahost.no.swordarmor.fr                     Loss%   Snt   Last   Avg  Best  Wrst StDev
  1. AS2119   ti0043a400-5712.bb.online.no (85.167.125.99)   0.0%    10   40.2  29.9  12.9  48.3  12.3
edge01-terrahost ~ # mtr -bzwe 85.167.125.99
Start: 2020-11-14T22:07:54+0100
HOST: edge01-terrahost.no.swordarmor.fr                     Loss%   Snt   Last   Avg  Best  Wrst StDev
  1. AS2119   ti0043a400-5712.bb.online.no (85.167.125.99)   0.0%    10   50.4  50.9  45.2  56.2   3.1
edge01-terrahost ~ # genlop -t mtr
 * net-analyzer/mtr

     Mon Sep  9 14:16:28 2019 >>> net-analyzer/mtr-0.87
       merge time: 15 seconds.

     Sun Nov  8 00:16:36 2020 >>> net-analyzer/mtr-0.94
       merge time: 34 seconds.

edge01-terrahost ~ # emaint sync -r SwordArMor && eix-update 
# […]
edge01-terrahost ~ # emerge -vaA1 net-analyzer/mtr
# […]
# I installed the patched version here
edge01-terrahost ~ # mtr -bzwe -a 185.181.60.143 85.167.125.99
Start: 2020-11-14T22:42:53+0100
HOST: edge01-terrahost.no.swordarmor.fr                                  Loss%   Snt   Last   Avg  Best  Wrst StDev
  1. AS56655  distro2.dc2.vl1515.k10.terrahost.no (185.181.60.1)          0.0%    10    0.4   0.6   0.4   1.0   0.2
  2. AS56655  mxcore2.dc2.xe-0-0-0-1.k10.terrahost.no (185.125.170.106)   0.0%    10    0.4   2.8   0.2  24.8   7.7
  3. AS56655  mxcore1.dc2.et-0-0-2.k10.terrahost.no (185.125.170.130)     0.0%    10    0.4   0.7   0.3   3.6   1.0
  4. AS29695  141.0.104.145.static.lyse.net (141.0.104.145)               0.0%    10    3.0   3.2   2.9   3.7   0.3
  5. AS29695  62.213-167-114.customer.lyse.net (213.167.114.62)           0.0%    10    3.5   3.3   3.1   3.5   0.1
  6. AS29695  56.81-166-123.customer.lyse.net (81.166.123.56)             0.0%    10    3.4   4.1   3.4   5.0   0.6
  7. AS29695  219.213-167-114.customer.lyse.net (213.167.114.219)         0.0%    10    2.7   2.8   2.6   3.0   0.1
  8. AS2119   ti0001b400-ae9-0.ti.telenor.net (148.122.8.53)              0.0%    10    6.9   5.9   2.5  12.6   3.5
  9. AS2119   ti0016c360-ae16-0.ti.telenor.net (146.172.105.50)           0.0%    10   61.8  60.9  46.3  69.7   7.4
 10. AS2119   ti0023c400-ae0-0.ti.telenor.net (146.172.20.234)            0.0%    10   58.7  54.8  37.9  72.8   9.9
 11. AS2119   ti0013c400-ae22-0.ti.telenor.net (146.172.21.61)            0.0%    10   64.4  59.3  40.6  81.5  13.2
 12. AS2119   ti0043a400-ae0-0.ti.telenor.net (146.172.102.162)           0.0%    10   65.1  65.8  45.5  85.8  13.1
 13. AS2119   ti0043a400-5712.bb.online.no (85.167.125.99)                0.0%    10   65.4  76.2  63.9  92.0   8.6
edge01-terrahost ~ # 
edge01-terrahost ~ # 
edge01-terrahost ~ # 
edge01-terrahost ~ # ip addr show gre4
24: gre4@NONE: <POINTOPOINT,NOARP,UP,LOWER_UP> mtu 1476 qdisc noqueue state UNKNOWN group default qlen 1000
    link/gre 185.181.60.143 peer 85.167.125.99
    inet 10.0.4.20/31 scope global gre4
       valid_lft forever preferred_lft forever
    inet6 2a0e:f42:fffe:1::2e/127 scope global 
       valid_lft forever preferred_lft forever
    inet6 fe80::200:5efe:b9b5:3c8f/64 scope link 
       valid_lft forever preferred_lft forever
edge01-terrahost ~ # ip addr show eno1
2: eno1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 74:46:a0:90:81:4d brd ff:ff:ff:ff:ff:ff
    inet 185.181.60.143/24 brd 185.181.60.255 scope global eno1
       valid_lft forever preferred_lft forever
    inet6 2a03:94e0:ffff:185:181:60:0:143/118 scope global 
       valid_lft forever preferred_lft forever
    inet6 fe80::7646:a0ff:fe90:814d/64 scope link 
       valid_lft forever preferred_lft forever
edge01-terrahost ~ # ip -4 ru sh
0:  from all lookup local
32765:  from 185.181.60.0/24 lookup 56655
32766:  from all lookup main
32767:  from all lookup default

acoul commented 3 years ago

Greetings,

I would like to confirm this issue on my 32-bit gentoo-rolling-current system connected to three different adsl providers on 192.168.(1,2,3).0/24 respectively

mtr-0.94 is unable to honor the -a option while mtr-0.87 works just fine

traceroute-2.1.0 also works fine using the -s option (https://sourceforge.net/projects/traceroute/files/traceroute/)

I would be happy to test any suggested code in order to help resolve this issue

Edit:

$ ip rule ls
0: from all lookup local
32760: from all to 192.168.2.0/24 lookup HOL
32761: from 192.168.2.0/24 lookup HOL
32762: from all to 192.168.1.66/24 lookup FORTH
32763: from 192.168.1.66/24 lookup FORTH
32764: from all to 192.168.3.66/24 lookup WIND
32765: from 192.168.3.66/24 lookup WIND
32766: from all lookup main
32767: from all lookup default

alarig commented 3 years ago

If you use gentoo, you can use the package from my overlay (SwordArMor), I included the commit fixing the issue.

acoul commented 3 years ago

hello alarig & thank you for the patch

I did test your patch against a vanilla mtr-0.94

compiling, I get a few "output may be truncated copying between ..." warnings but it does compile & most importantly it WORKS as it should, meaning I am now able to properly trace using the appropriate (-a) internal network for each of my three different upstream providers.

I have no noticeable issues with ./mtr upc.cz -e mentioned here: https://github.com/traviscross/mtr/pull/371#issuecomment-699670567 tested on two different servers with same provider, one with a vanilla mtr-0.94 & the other with a patched one

on ./mtr upc.cz -u indeed "multiple hosts on each line start jumping up and down instead of staying at the fixed place" which is something I can live with

other than the above, I don't see any other issues with the ipv4-sockets patch

idallen commented 3 years ago

Tom, Didn't I accept your pull request?

Is Tom Hetmer's fix incorporated into an official MTR release yet? I'm still running his patched version of mtr-packet and would like to run an official version.

acoul commented 2 years ago

Greetings,

I just made a fresh clone & build of the mtr repository today, and everything about the --address option looks fine. Perhaps this issue may now rest as resolved?

As a gentoo user I still need this patch, which does not look destructive & could perhaps also be included upstream.

idallen commented 2 years ago

No, the issue is not resolved when compiled on my Ubuntu 20.04.3 Linux system. The Tom Hetmer version of mtr-packet works for both root and ordinary users, with or without using setcap or setuid, and the "fresh clone and build" version only works for normal users if it has no special privileges. The fresh version doesn't work for root and it doesn't work for anyone if given privileges with setcap.

idallen commented 2 years ago

I'm still hoping to see Tom Hetmer's fix to mtr-packet incorporated into an official MTR release.

bewing commented 2 years ago

acoul on ./mtr upc.cz -u indeed "multiple hosts on each line start jumping up and down instead of staying at the fixed place" which is something I can live with

#429 should resolve this issue.

bottiger1 commented 1 year ago

Edit

Deleted previous statement, I believe rovo89's update fixes this issue.

It was not working for me because my compiled version of MTR was using the older version of mtr-packet installed by Ubuntu 22.04, I had to uninstall the OS version and now -I works.

As a suggestion maybe someone can add ip route and parse the results if -a was used but -I was not specified. Then you can automatically parse the correct interface from the output. But I don't know if every distro has ip route or if the output will be guaranteed to be the same format.

ip route get 1.1.1.1 from 1.2.3.4
1.1.1.1 from 1.2.3.4 dev tunnel1 table TEST uid 1000

@rewolff if you are ok with that then I could make a pull request that extracts the interface automatically like this if -a is specified but -I is not:

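char command[256];
char interface[32] = "";
/* Needs <stdio.h> and <string.h>. \K discards the "dev " prefix, so grep prints only the name. */
snprintf(command, sizeof(command), "ip route get %s from %s | grep -oP 'dev \\K[^ ]+'", dst, src);
FILE *file = popen(command, "r");
if (file != NULL) {
    if (fgets(interface, sizeof(interface), file) != NULL)
        interface[strcspn(interface, "\n")] = '\0'; /* strip the trailing newline */
    pclose(file);
}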
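
As a possible alternative that avoids depending on the ip tool and its output format, here is a sketch using getifaddrs() to find which interface carries the address given with -a. The helper name is hypothetical, and this answers a narrower question than ip route get: it reports the interface that owns the address, not the interface the routing tables would choose for a given destination, so the two can differ on hosts with tunnels or policy routing.

```c
#include <arpa/inet.h>
#include <ifaddrs.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>

/*
 * Hypothetical helper: find the interface that carries the IPv4 address
 * `source_ip` and copy its name into `ifname`.
 * Returns 0 if an interface was found, -1 otherwise.
 */
int ifname_for_address(const char *source_ip, char *ifname, size_t ifname_len)
{
    struct in_addr wanted;
    if (inet_pton(AF_INET, source_ip, &wanted) != 1)
        return -1;                      /* not a valid IPv4 address string */

    struct ifaddrs *list = NULL;
    if (getifaddrs(&list) == -1)
        return -1;

    int result = -1;
    for (struct ifaddrs *ifa = list; ifa != NULL; ifa = ifa->ifa_next) {
        /* Skip entries without an address and non-IPv4 entries. */
        if (ifa->ifa_addr == NULL || ifa->ifa_addr->sa_family != AF_INET)
            continue;
        const struct sockaddr_in *sin =
            (const struct sockaddr_in *)ifa->ifa_addr;
        if (sin->sin_addr.s_addr == wanted.s_addr) {
            snprintf(ifname, ifname_len, "%s", ifa->ifa_name);
            result = 0;
            break;
        }
    }
    freeifaddrs(list);
    return result;
}
```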

darless commented 1 year ago

I can confirm the issue is present in the latest commit: ab6f80fc8d8cd9ec05c0d997bd31098e9d93bf4d. I can confirm that downgrading to v0.87 also fixes the issue. I can also confirm that the code in the Gentoo overlay provided by @alarig fixes the issue for me. The patch in the overlay: https://github.com/gentoo-mirror/SwordArMor/blob/master/net-analyzer/mtr/files/mtr-0.94-ipv4-sockets.patch

From a regression point of view, adding an interface option does not directly fix the issue as passing in the interface was not required in v0.87, thus the patch for ipv4 sockets is the appropriate fix, where how we call it does not need to change.

@alarig Do you plan on creating a pull request with your fix, if not then do you mind if I do it so that it can be used in the official release?

rewolff commented 1 year ago

That patch removes a function called "byte_swap". I do remember that BSD-like OSes had something funny with the packet headers in raw mode that required a different byte order than Linux.

As "arch Linux" is bound to run on a Linux kernel, simply removing the swap code will work for Linux, but a proper fix will leave the swap in place for BSD-like OSes that need it.
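
To illustrate the kind of swap being described, here is a small sketch. The function name and the host_order flag are hypothetical and not taken from mtr; the assumption, per the recollection above, is that some BSD-like raw sockets expect a length field in host byte order while Linux expects network byte order.

```c
#include <arpa/inet.h>
#include <stdint.h>

/*
 * Hypothetical helper: return the value to store in a 16-bit header length
 * field. If `host_order` is non-zero the length is stored as-is (the
 * BSD-style case assumed here); otherwise it is converted to network order.
 */
uint16_t header_length_field(uint16_t length, int host_order)
{
    return host_order ? length : htons(length);
}
```

Keeping the decision in one place like this would let the Linux path and the BSD path share the rest of the packet construction code.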

nickyang777 commented 1 year ago

I can confirm the issue is present in the latest commit: ab6f80f. I can confirm that downgrading to v0.87 also fixes the issue. I can also confirm that the code in the Gentoo overlay provided by @alarig fixes the issue for me. The patch in the overlay: https://github.com/gentoo-mirror/SwordArMor/blob/master/net-analyzer/mtr/files/mtr-0.94-ipv4-sockets.patch

From a regression point of view, adding an interface option does not directly fix the issue as passing in the interface was not required in v0.87, thus the patch for ipv4 sockets is the appropriate fix, where how we call it does not need to change.

@alarig Do you plan on creating a pull request with your fix, if not then do you mind if I do it so that it can be used in the official release?

@rewolff Do you plan to incorporate the alarig's fix into the official MTR release? It's very useful for me.

alarig commented 1 year ago

Sorry for the lag @darless, it wasn't originally my code, but nobody opened the MR in their name so… too late for them :p

yvs2014 commented 1 year ago

That patch removes a function called "byte_swap" I do remember that BSD-like OSes had something funny with the packet headers in raw mode that required different byte order than Linux.

As "arch Linux" is bound to run on a Linux kernel, simply removing the swap code will work for Linux, but a proper fix will leave the swap in place for BSD-like OSes that need it.

True: with this commit, UDP mode doesn't work on freebsd-13.2 and openbsd-7.3, and works on netbsd-9.3. On fbsd-13.2, for example:

% uname -sr
FreeBSD 13.2-RELEASE-p2

% git clone https://github.com/traviscross/mtr.git && cd mtr && ./bootstrap.sh && ./configure && make
...
% sudo ./mtr -u localhost; echo $?
...
mtr: Address not available
1

% git rev-parse HEAD
74d312d7e67d002e184b37c7f278597ab06bf8e7

% git checkout HEAD~2; make
% sudo ./mtr -u localhost; echo $?
...
0

rewolff commented 1 year ago

Yves, can you test:

diff --git a/packet/construct_unix.c b/packet/construct_unix.c
index e09d705..7285efa 100644
--- a/packet/construct_unix.c
+++ b/packet/construct_unix.c
@@ -207,7 +207,8 @@ int construct_udp4_packet(
     memset(udp, 0, sizeof(struct UDPHeader));

     set_udp_ports(udp, probe, param);
-    udp->length = htons(udp_size);
+    udp->length = net_state->platform.ip_length_host_order ?
+      udp_size:htons(udp_size);

     /* calculate udp checksum */
     struct UDPPseudoHeader udph = {

?

yvs2014 commented 1 year ago

made and run with this correction

% make
make  all-am
  CC       packet/construct_unix.o
packet/construct_unix.c:248:9: warning: unused variable 'udp_socket' [-Wunused-variable]
    int udp_socket = net_state->platform.udp6_send_socket;
        ^
1 warning generated.
  CCLD     mtr-packet

% sudo ./mtr -u localhost
...
mtr: Address not available

just in case:

% sudo ./mtr -u -4 localhost
...
mtr: Address not available

rewolff commented 1 year ago

Slightly "expected" as the error for wrong length is something like "invalid argument" and we're getting "address not available".

yvs2014 commented 1 year ago

I haven't looked into code what caused that error, can confirm that reverting it to HEAD~2 makes 'mtr -u' work

rewolff commented 1 year ago

I've installed a VM with freeBSD. (I have -RELEASE, not RELEASE-p2).

OK. There is a bind call failing. I've disabled the bind call and it works again. It's on line 623 of packet/construct_unix.c.

flu0r1ne commented 11 months ago

I believe some of the issues discussed in this thread are related to those I brought up in #485. I recently contributed a change to fix the -I option (see #487) and am currently investigating issues related to address binding. If this feature was malfunctioning on Linux, it might now be operational. However, the address selection logic still fails in certain scenarios. One effective workaround I've found is using -I in conjunction with -a (with a release after #487). This bypasses the problematic selection logic.

For example, to reach 1.1.1.1 over wlp5s0, which has the local address 192.168.1.221, you can use the following command:

MTR_PACKET="./mtr-packet" sudo -E ./mtr -t -I wlp5s0 -a 192.168.1.221 1.1.1.1

yvs2014 commented 11 months ago

I've installed a VM with freeBSD. (I have -RELEASE, not RELEASE-p2). OK. There is a bind call failing. I've disabled the bind call and it works again. its on line 623 of packet/construct_unix.c

Not sure, if I'm not mistaken:

% uname -sr; pwd; git log | head -1
FreeBSD 13.2-RELEASE-p3
~/tmp/mtr
commit 6e659b821ad3636f36e52fb1b97588a1d9966139

% sudo ./mtr -c1 -u 127.0.0.1; echo $?
...
mtr: Address not available
1
% git checkout ad48183 && make
% sudo ./mtr -c1 -u 127.0.0.1; echo $?
...
0

rewolff commented 11 months ago

I didn't mean to say I fixed it, I just localized the problem. A friend of mine used to "port" programs by just deleting lines of code that proved problematic, either during compilation or by throwing an error at runtime. Anyway, I don't think that's the correct strategy.

So after localizing the problem I for sure hope I didn't just commit that work-in-progress.

totoCZ commented 1 week ago

Closing: reported in 2017, mtr patched in 2020, merged in 2023.