tohojo / flent

The FLExible Network Tester.
https://flent.org

-6 Option ignored for hosts with IPv4 and IPv6 address #241

Closed j-breyer closed 2 years ago

j-breyer commented 2 years ago

In my current setup, I have two hosts defined in /etc/hosts, both capable of communicating via IPv4 and IPv6. Configuration:

mn.h1 uses 10.0.6.1 and a::601
mn.h2 uses 10.0.14.2 and a::e02

On mn.h2, netserver is listening on both addresses, and irtt server is started listening on all addresses.

Running sudo flent rrul -6 -p all_scaled --local-bind mn.h1 -l 60 -H mn.h2 -o rrul6.pdf -v yields different results depending on which addresses are defined:

Both hosts have their IPv4 address enabled: the verbose output of flent is free of errors and warnings; however, the irtt server on mn.h2 shows:

[10.0.6.1:56747] [OpenClose] open-close connection
[10.0.6.1:34502] [NewConn] new connection, token=e6582ec0dc3b7161
[10.0.6.1:34942] [NewConn] new connection, token=5898db5f137d0b66
[10.0.6.1:42243] [NewConn] new connection, token=c412d2bb05eefd1d

This indicates that irtt is using IPv4 despite the -6 argument.

Removing the IPv4 resolution for mn.h1: (partial) verbose output of flent:

which: Found irtt executable at /usr/bin/irtt
UDP RTT test: Cannot use irtt runner (Irtt connection check failed: Error: dial udp4 [a::601]:0->10.0.14.2:2112: address a::601: non-IPv4 address
). Using netperf UDP_RR

Removing the IPv4 from mn.h2, but keeping it on mn.h1: (partial) verbose output of flent:

which: Found irtt executable at /usr/bin/irtt
UDP RTT test: Cannot use irtt runner (Irtt connection check failed: Error: dial udp6 10.0.6.1:0->[a::e02]:2112: bind: invalid argument
). Using netperf UDP_RR

Removing both IPv4 entries for mn.h1 and mn.h2: the verbose output of flent shows no errors, and the irtt server shows:

[[a::601]:55765] [OpenClose] open-close connection
[[a::601]:37744] [NewConn] new connection, token=df67b3a6d5892659
[[a::601]:38572] [NewConn] new connection, token=8326d482ae57f712
[[a::601]:55775] [NewConn] new connection, token=366ddbe19c620d6f

which is what I would expect every time, since the -6 argument was passed.

I have not tried this with many tests, but it occurs at least with rrul and rrul_cs8.

tohojo commented 2 years ago

Hmm, could you please post a full debug log (from --log-file), and the output of this command (with both address families defined in your /etc/hosts):

python -c 'import socket; print(socket.getaddrinfo("mn.h1", None, socket.AF_UNSPEC, socket.SOCK_STREAM))'

heistp commented 2 years ago

Fwiw I tried irtt with a similar setup, using hosts defined in /etc/hosts with both IPv4 and IPv6 addresses:

10.72.0.230 apu2a_6
a::601 apu2a_6
10.72.0.233 apu2c_6
a::603 apu2c_6

All of the following worked as expected:

sysadmin@apu2a:~$ irtt client -n apu2c_6
[Connecting] connecting to apu2c_6
[10.72.0.233:2112] [Connected] connection established
[10.72.0.233:2112] [NoTest] skipping test at user request
sysadmin@apu2a:~$ irtt client -6 -n apu2c_6
[Connecting] connecting to apu2c_6
[[a::603]:2112] [Connected] connection established
[[a::603]:2112] [NoTest] skipping test at user request
sysadmin@apu2a:~$ irtt client --local=apu2a_6 -n apu2c_6
[Connecting] connecting to apu2c_6
[10.72.0.233:2112] [Connected] connection established
[10.72.0.233:2112] [NoTest] skipping test at user request
sysadmin@apu2a:~$ irtt client --local=apu2a_6 -6 -n apu2c_6
[Connecting] connecting to apu2c_6
[[a::603]:2112] [Connected] connection established
[[a::603]:2112] [NoTest] skipping test at user request

Do those commands work for you outside of flent from your mn.h1 to mn.h2? I also wonder what happens if you don't use --local-bind.

Lastly, I realized we'll have to bracket the local bind address as well if it's an IPv6 literal, but that's not affecting you here because you're using a hostname.
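To illustrate the bracketing point, here is a minimal Python sketch. The helper name `bracket_if_ipv6_literal` is hypothetical and is not taken from flent or irtt; it only shows one way to tell an IPv6 literal apart from an IPv4 literal or a hostname using the standard `ipaddress` module.

```python
import ipaddress


def bracket_if_ipv6_literal(addr):
    """Return addr wrapped in [] if it parses as an IPv6 literal, else unchanged.

    Hypothetical helper for illustration only; not part of flent or irtt.
    """
    try:
        parsed = ipaddress.ip_address(addr)
    except ValueError:
        # Not an IP literal at all (e.g. a hostname such as mn.h1),
        # or already bracketed, so leave it alone.
        return addr
    if parsed.version == 6:
        return "[{}]".format(addr)
    return addr


print(bracket_if_ipv6_literal("a::601"))    # IPv6 literal gets brackets
print(bracket_if_ipv6_literal("10.0.6.1"))  # IPv4 literal is unchanged
print(bracket_if_ipv6_literal("mn.h1"))     # hostname is unchanged
```

A hostname is passed through untouched, which matches the case in this report where --local-bind is given a name rather than a literal.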

j-breyer commented 2 years ago

> Hmm, could you please post a full debug log (from --log-file),

Running the exact same command as posted with --log-file appended and scrolling a bit through the file, I found the following lines interesting:

2021-10-01 08:51:02,954 [flent.runners] DEBUG: Forked /usr/bin/netperf as pid 415309
2021-10-01 08:51:02,956 [flent.runners] DEBUG: TCP download EF: Starting watchdog with timeout 75
2021-10-01 08:51:02,957 [flent.runners] DEBUG: Started TimerRunner idx 9 ('Watchdog [TCP download EF]')
2021-10-01 08:51:02,958 [flent.runners] DEBUG: Started NetperfDemoRunner idx 9 ('TCP download EF')
2021-10-01 08:51:02,959 [flent.runners] DEBUG: Forking to run command /usr/bin/irtt client -o - --fill=rand -Q -d 70s -i 0.2s  --dscp=0xb8 --local=mn.h1  mn.h2
2021-10-01 08:51:02,963 [flent.runners] DEBUG: Forked /usr/bin/irtt as pid 415312
2021-10-01 08:51:02,967 [flent.runners] DEBUG: Started IrttRunner idx 13 ('Ping (ms) UDP EF :: child 0')
2021-10-01 08:51:02,968 [flent.runners] DEBUG: Forking to run command /usr/bin/irtt client -o - --fill=rand -Q -d 70s -i 0.2s  --dscp=0x20 --local=mn.h1  mn.h2
2021-10-01 08:51:02,972 [flent.runners] DEBUG: Forked /usr/bin/irtt as pid 415314
2021-10-01 08:51:02,975 [flent.runners] DEBUG: Started IrttRunner idx 14 ('Ping (ms) UDP BK :: child 0')
2021-10-01 08:51:02,976 [flent.runners] DEBUG: Forking to run command /usr/bin/irtt client -o - --fill=rand -Q -d 70s -i 0.2s   --local=mn.h1  mn.h2
2021-10-01 08:51:02,980 [flent.runners] DEBUG: Forked /usr/bin/irtt as pid 415323
2021-10-01 08:51:02,983 [flent.runners] DEBUG: Started IrttRunner idx 15 ('Ping (ms) UDP BE :: child 0')
2021-10-01 08:51:02,983 [flent.runners] DEBUG: Forking to run command /usr/bin/fping6  -D -p 200 -c 350 -t 140000  -I mn.h1 mn.h2

It seems like the -6 argument is not passed to the irtt command. See the full log file attached: rrul-2021-10-01T085102.606290.log

> output of this command (with both address families defined in your /etc/hosts):

[(10, 1, 6, '', ('a::601', 0, 0, 0)), (2, 1, 6, '', ('10.0.6.1', 0))]
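For readers unfamiliar with the tuple layout, the following Python sketch picks the addresses of one family out of a getaddrinfo() result like the one above. The numeric family values 10 (AF_INET6) and 2 (AF_INET) are the Linux values and are an assumption here; the function name `addresses_for_family` is made up for illustration.

```python
# getaddrinfo() output as posted above, copied verbatim.
ADDRINFO = [
    (10, 1, 6, '', ('a::601', 0, 0, 0)),
    (2, 1, 6, '', ('10.0.6.1', 0)),
]

# Assumption: Linux numbering of address families. Other platforms may
# differ, so real code should compare against socket.AF_INET / AF_INET6.
LINUX_AF_INET = 2
LINUX_AF_INET6 = 10


def addresses_for_family(addrinfo, family):
    """Return the address strings from addrinfo entries matching family."""
    return [
        sockaddr[0]
        for fam, _socktype, _proto, _canonname, sockaddr in addrinfo
        if fam == family
    ]


print(addresses_for_family(ADDRINFO, LINUX_AF_INET6))
print(addresses_for_family(ADDRINFO, LINUX_AF_INET))
```

So the system resolver returns both families for mn.h1, with the IPv6 entry listed first.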

> Do those commands work for you outside of flent from your mn.h1 to mn.h2?

Yes, with no problems at all; I get the same respective output as you posted.

> I also wonder what happens if you don't use --local-bind.

The irtt connection no longer fails, as there's no mismatch in address families. However, as long as mn.h2 has an IPv4 address defined, irtt uses IPv4 to connect, disregarding the -6 parameter.

tohojo commented 2 years ago

j-breyer writes:

> I also wonder what happens if you don't use --local-bind.
>
> irtt connection no longer fails, as there's no mismatch in address families. However, as long as mn.h2 has an IPv4 defined, it uses IPv4 to connect, disregarding the -6 param

Hmm, I guess maybe the Go runtime has its own idea as to what constitutes the right way to look up a hostname? I'll add the -6 arg for irtt, then...
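As a rough sketch of what passing the address family through could look like, the Python snippet below builds an irtt client argument list. This is not flent's actual runner code: the function name `build_irtt_client_args` and its parameters are hypothetical, and only flags that appear elsewhere in this thread (-o, -Q, -6, --local) are used.

```python
def build_irtt_client_args(host, local_bind=None, ip_version=None):
    """Build an irtt client argument list (illustrative only, not flent's code)."""
    args = ["irtt", "client", "-o", "-", "-Q"]
    if ip_version == 6:
        # Force IPv6 explicitly instead of relying on how irtt resolves the name.
        args.append("-6")
    if local_bind is not None:
        args.append("--local={}".format(local_bind))
    # The target host goes last, after all flags.
    args.append(host)
    return args


print(" ".join(build_irtt_client_args("mn.h2", local_bind="mn.h1", ip_version=6)))
```

With ip_version left unset, the list matches the shape of the commands in the debug log above, which is the case where the address family was being dropped.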

heistp commented 2 years ago

I never noticed this, actually. The standard net.Dial(), for "historical reasons", just prefers IPv4 instead of relying on the local resolver. This prompted a "Wow, okay" from B. Fitzpatrick. (https://github.com/golang/go/issues/20911).

It looks like one can get around that by using the Lookup* methods first, but that is a hassle that will require some experimentation to make sure the behavior is right (e.g. with and without cgo, for unspecified listener addresses, etc.), for something that should ideally "just work".

Anyway, thanks for working around that. :/