openthread / ot-br-posix

OpenThread Border Router, a Thread border router for POSIX-based platforms.
https://openthread.io/
BSD 3-Clause "New" or "Revised" License
420 stars 235 forks source link

Ot-ctl's ping doesn't seem to receive all the ping replies sent to it #530

Closed JaneFromSilabs closed 4 years ago

JaneFromSilabs commented 4 years ago

Describe the bug A clear and concise description of what the bug is. While I am able to see the pings making it to the raspi, I don't see the response showing up in the ot-ctl interactive interface. I am able to ping if I extend the path by using the soc's ping instead. This is a little difficult to explain so bear with me. It seems to me as though ot-ctl isn't showing it received a ping response even though it should have received one. Note that I can receive these ping responses without issue if I use the raspi's ping instead.

To Reproduce Information to reproduce the behavior, including: Enter the interactive mode of the otbr and ping an address that lives out past the ethernet interface.

  1. Git commit id - commit a69a9d5c82d25b6d3c6ddd5ac29d80676b34465f
  2. IEEE 802.15.4 hardware platform - BRD4161A (Silabs)
  3. Build steps - As documented
  4. Network topology - 4 node network with 2 OTBR and 2 SoC running ot-cli-ftd in the following configuration:

[soc1] o)) ((o [raspi1/otbr1] <------ethernet-----> [raspi2/otbr2] o)) ((o [soc2]

Expected behavior A clear and concise description of what you expected to happen. I expect ot-ctl to show that I've received a valid ping response

Console/log output If applicable, add console/log output to help explain your problem. So here are a couple of working pings from the raspi1's ping:

pi@rpi1:~ $ ping 2001:db8:1:ffff:0:0:ac10:1c9
PING 2001:db8:1:ffff:0:0:ac10:1c9(2001:db8:1:ffff::ac10:1c9) 56 data bytes
64 bytes from 2001:db8:1:ffff::ac10:1c9: icmp_seq=1 ttl=63 time=0.422 ms
64 bytes from 2001:db8:1:ffff::ac10:1c9: icmp_seq=2 ttl=63 time=0.345 ms
^C
--- 2001:db8:1:ffff:0:0:ac10:1c9 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1062ms
rtt min/avg/max/mdev = 0.345/0.383/0.422/0.043 ms
pi@rpi1:~ $ ping 2001:33:0:0:e1e:ce85:c34e:21db
PING 2001:33:0:0:e1e:ce85:c34e:21db(2001:33::e1e:ce85:c34e:21db) 56 data bytes
64 bytes from 2001:33::e1e:ce85:c34e:21db: icmp_seq=1 ttl=63 time=54.2 ms
^C
--- 2001:33:0:0:e1e:ce85:c34e:21db ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 54.232/54.232/54.232/0.000 ms

I'm sending a ping from the ot-ctl interface of a border router to the global ip of the soc that is behind another BR. So from raspi1, I'm doing this:

pi@rpi1:~ $ sudo ot-ctl
\>
\> ping 2001:33:0:0:e1e:ce85:c34e:21db`

This ping makes its merry way to the eth0 of raspi1 and is forwarded along (by an added static route) to raspi2. raspi2 sees the packet as well as the response, which it send to raspi1. raspi1 receives the packet from the eth0 interface, but it seems to just get consumed there so that the ot-ctl interface never shows it received a response. If I trace the wpan0 of raspi1, I only see the req. If I trace the eth0 of raspi1, I see both req and resp

raspi1 routing table:

pi@rpi1:~ $ ip -6 route show
2001:22::/64 dev wpan0 proto kernel metric 256  pref medium
2001:33::/64 via fe80::ba27:ebff:fec3:751e dev eth0 metric 1024  pref medium
2001:db8:1:ffff::/96 dev nat64 metric 1024  pref medium
fd00:df4:6000::/64 dev wpan0 proto kernel metric 256  pref medium
fdaa:bb:1::2 dev nat64 proto kernel metric 256  pref medium
fdde:ad00:dead:beef::/64 dev wpan0 proto kernel metric 256  pref medium
fdde:ad03::/64 dev wpan0 proto kernel metric 256  pref medium
fe80::/64 dev eth0 proto kernel metric 256  pref medium
fe80::/64 dev nat64 proto kernel metric 256  pref medium
fe80::/64 dev wpan0 proto kernel metric 256  pref medium

Here is the eth0 when I'm pinging from the ot-ctl prompt:

21:51:00.053404 IP6 (hlim 63, next-header ICMPv6 (58) payload length: 16) 2001:22::a215:6044:acc9:2a02 > 2001:33::e1e:ce85:c34e:21db: [icmp6 sum ok] ICMP6, echo request, seq 48
        0x0000:  b827 ebc3 751e b827 ebc2 c264 86dd 6000  .'..u..'...d..`.
        0x0010:  0000 0010 3a3f 2001 0022 0000 0000 a215  ....:?..."......
        0x0020:  6044 acc9 2a02 2001 0033 0000 0000 0e1e  `D..*....3......
        0x0030:  ce85 c34e 21db 8000 63e1 0030 0030 0187  ...N!...c..0.0..
        0x0040:  3ea2 0000 0000                           >.....
21:51:00.094731 IP6 (hlim 63, next-header ICMPv6 (58) payload length: 16) 2001:33::e1e:ce85:c34e:21db > 2001:22::a215:6044:acc9:2a02: [icmp6 sum ok] ICMP6, echo reply, seq 48
        0x0000:  b827 ebc2 c264 b827 ebc3 751e 86dd 6000  .'...d.'..u...`.
        0x0010:  0000 0010 3a3f 2001 0033 0000 0000 0e1e  ....:?...3......
        0x0020:  ce85 c34e 21db 2001 0022 0000 0000 a215  ...N!...."......
        0x0030:  6044 acc9 2a02 8100 62e1 0030 0030 0187  `D..*...b..0.0..
        0x0040:  3ea2 0000 0000                           >.....

The ping from raspi1 thru raspi2 to soc2 works just fine as long as I'm using the raspi ping and not the ping from ot-ctl.

pi@rpi1:~ $ ping 2001:33:0:0:e1e:ce85:c34e:21db PING 2001:33:0:0:e1e:ce85:c34e:21db(2001:33::e1e:ce85:c34e:21db) 56 data bytes 64 bytes from 2001:33::e1e:ce85:c34e:21db: icmp_seq=1 ttl=63 time=52.9 ms 64 bytes from 2001:33::e1e:ce85:c34e:21db: icmp_seq=2 ttl=63 time=52.6 ms 64 bytes from 2001:33::e1e:ce85:c34e:21db: icmp_seq=3 ttl=63 time=46.7 ms

13:05:27.980038 IP6 (flowlabel 0x33ac9, hlim 64, next-header ICMPv6 (58) payload length: 64) 2001:22::a215:6044:acc9:2a02 > 2001:33::e1e:ce85:c34e:21db: [icmp6 sum ok] ICMP6, echo request, seq 3
        0x0000:  b827 ebc3 751e b827 ebc2 c264 86dd 6003  .'..u..'...d..`.
        0x0010:  3ac9 0040 3a40 2001 0022 0000 0000 a215  :..@:@..."......
        0x0020:  6044 acc9 2a02 2001 0033 0000 0000 0e1e  `D..*....3......
        0x0030:  ce85 c34e 21db 8000 116f 11bb 0003 87b7  ...N!....o......
        0x0040:  225f ddf3 0e00 0809 0a0b 0c0d 0e0f 1011  "_..............
        0x0050:  1213 1415 1617 1819 1a1b 1c1d 1e1f 2021  ...............!
        0x0060:  2223 2425 2627 2829 2a2b 2c2d 2e2f 3031  "#$%&'()*+,-./01
        0x0070:  3233 3435 3637                           234567
13:05:28.026685 IP6 (hlim 63, next-header ICMPv6 (58) payload length: 64) 2001:33::e1e:ce85:c34e:21db > 2001:22::a215:6044:acc9:2a02: [icmp6 sum ok] ICMP6, echo reply, seq 3
        0x0000:  b827 ebc2 c264 b827 ebc3 751e 86dd 6000  .'...d.'..u...`.
        0x0010:  0000 0040 3a3f 2001 0033 0000 0000 0e1e  ...@:?...3......
        0x0020:  ce85 c34e 21db 2001 0022 0000 0000 a215  ...N!...."......
        0x0030:  6044 acc9 2a02 8100 106f 11bb 0003 87b7  `D..*....o......
        0x0040:  225f ddf3 0e00 0809 0a0b 0c0d 0e0f 1011  "_..............
        0x0050:  1213 1415 1617 1819 1a1b 1c1d 1e1f 2021  ...............!
        0x0060:  2223 2425 2627 2829 2a2b 2c2d 2e2f 3031  "#$%&'()*+,-./01
        0x0070:  3233 3435 3637                           234567

And here is the reverse scenario, ping from soc2 to raspi1:

\> ping  2001:22::a215:6044:acc9:2a02
Done
\>
\> 16 bytes from 2001:22:0:0:a215:6044:acc9:2a02: icmp_seq=101 hlim=63 time=40ms

The same is true of NAT64: from raspi1: > ping 2001:db8:1:ffff:0:0:ac10:1ca Done > ping 2001:db8:1:ffff:0:0:ac10:1c9 Done

from soc1: > ping 2001:db8:1:ffff:0:0:ac10:1c9 Done > > 16 bytes from 2001:db8:1:ffff:0:0:ac10:1c9: icmp_seq=62 hlim=62 time=42ms

Here is raspi1's information: pi@rpi1:~ $ ip addr show 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000 link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 inet 127.0.0.1/8 scope host lo valid_lft forever preferred_lft forever inet6 ::1/128 scope host valid_lft forever preferred_lft forever 2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000 link/ether b8:27:eb:c2:c2:64 brd ff:ff:ff:ff:ff:ff inet 172.16.1.201/24 brd 172.16.1.255 scope global eth0 valid_lft forever preferred_lft forever inet 172.16.1.63/24 brd 172.16.1.255 scope global secondary eth0 valid_lft forever preferred_lft forever inet6 fe80::ba27:ebff:fec2:c264/64 scope link valid_lft forever preferred_lft forever 3: wlan0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast state DOWN group default qlen 1000 link/ether 46:cb:bd:c9:94:26 brd ff:ff:ff:ff:ff:ff 4: wpan0: <POINTOPOINT,MULTICAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN group default qlen 500 link/void inet6 2001:22::a215:6044:acc9:2a02/64 scope global valid_lft forever preferred_lft forever inet6 fdde:ad03::c7ed:e129:1bf1:1eb4/64 scope global valid_lft forever preferred_lft forever inet6 fdde:ad00:dead:beef:1b89:882c:1450:b79b/64 scope global valid_lft forever preferred_lft forever inet6 fd00:df4:6000::ff:fe00:fc30/64 scope global valid_lft forever preferred_lft forever inet6 fd00:df4:6000::ff:fe00:fc00/64 scope global valid_lft forever preferred_lft forever inet6 fd00:df4:6000::ff:fe00:1800/64 scope global valid_lft forever preferred_lft forever inet6 fd00:df4:6000:0:e4f1:f97f:ef39:9382/64 scope global valid_lft forever preferred_lft forever inet6 fe80::7003:587c:5fca:11af/64 scope link valid_lft forever preferred_lft forever 5: nat64: <POINTOPOINT,MULTICAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 500 link/none inet 192.168.255.1/32 scope global nat64 valid_lft forever preferred_lft forever inet6 fdaa:bb:1::2/128 scope global valid_lft forever preferred_lft forever inet6 fe80::8904:9c1c:9192:3e74/64 scope link flags 800 valid_lft forever preferred_lft forever

Here is raspi2: pi@rpi2:~ $ ip addr show 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000 link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 inet 127.0.0.1/8 scope host lo valid_lft forever preferred_lft forever inet6 ::1/128 scope host valid_lft forever preferred_lft forever 2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000 link/ether b8:27:eb:c3:75:1e brd ff:ff:ff:ff:ff:ff inet 172.16.1.202/24 brd 172.16.1.255 scope global eth0 valid_lft forever preferred_lft forever inet6 fe80::ba27:ebff:fec3:751e/64 scope link valid_lft forever preferred_lft forever 3: wlan0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast state DOWN group default qlen 1000 link/ether 52:41:3b:f5:b0:80 brd ff:ff:ff:ff:ff:ff 4: wpan0: <POINTOPOINT,MULTICAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN group default qlen 500 link/void inet6 2001:33::5a91:ea23:4ae0:64ff/64 scope global valid_lft forever preferred_lft forever inet6 fd00:df4:7000::ff:fe00:fc33/64 scope global valid_lft forever preferred_lft forever inet6 fd00:df4:7000::ff:fe00:fc00/64 scope global valid_lft forever preferred_lft forever inet6 fd00:df4:7000::ff:fe00:b400/64 scope global valid_lft forever preferred_lft forever inet6 fd00:df4:7000:0:b3ef:7128:d90c:9fe5/64 scope global valid_lft forever preferred_lft forever inet6 fe80::f02d:cbe6:b731:275b/64 scope link valid_lft forever preferred_lft forever 5: nat64: <POINTOPOINT,MULTICAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 500 link/none inet 192.168.255.1/32 scope global nat64 valid_lft forever preferred_lft forever inet6 fdaa:bb:1::2/128 scope global valid_lft forever preferred_lft forever inet6 fe80::3795:178b:499b:4cb3/64 scope link flags 800 valid_lft forever preferred_lft forever

soc1: > ipaddr 2001:22:0:0:a215:6044:acc9:2a02 fdde:ad03:0:0:c7ed:e129:1bf1:1eb4 fdde:ad00:dead:beef:1b89:882c:1450:b79b fd00:df4:6000:0:0:ff:fe00:fc30 fd00:df4:6000:0:0:ff:fe00:fc00 fd00:df4:6000:0:0:ff:fe00:1800 fd00:df4:6000:0:e4f1:f97f:ef39:9382 fe80:0:0:0:7003:587c:5fca:11af Done

soc2 > ipaddr fd00:df4:7000:0:0:ff:fe00:ec00 2001:33:0:0:e1e:ce85:c34e:21db fd00:df4:7000:0:5f5f:3b15:d17f:54d4 fe80:0:0:0:1c3f:f822:afba:c210 Done

gjc13 commented 4 years ago

This is the expected behavior. The kernel won't forward those packets to otbr-agent if it's sent to the interface's own address.

When the kernel receives an ICMP packet sent to 2001:22::a215:6044:acc9:2a02, it finds this address belongs to itself. Then it would simply ask for user-space applications to handle it (for instance ping). Only when there is a need to further forward this packet will the kernel send it to the underlying tun driver (packets for soc1). Then otbr-agent will receive this packet.

JaneFromSilabs commented 4 years ago

Thank you for the response, @gjc13.

I see why this is problematic now. From this low level analysis this is expected, but I think it might not be expected behavior from a user perspective. However, since the goal of this application is not to provide ping services, I completely understand the choice not to address the issue. I'll close this out.