openthread / ot-br-posix

OpenThread Border Router, a Thread border router for POSIX-based platforms.
https://openthread.io/
BSD 3-Clause "New" or "Revised" License

Thread Network is unreachable - OTBR running in docker container #1789

Closed saramonteiro closed 1 year ago

saramonteiro commented 1 year ago

Describe the bug

I followed the instructions from https://openthread.io/guides/border-router/docker/run to build an OTBR in a Docker container on a Raspberry Pi 4 running Raspberry Pi OS Lite (64-bit). My RCP is an NXP dongle based on the K32061 MCU.

I used the following command to create the container:

    docker run --sysctl "net.ipv6.conf.all.disable_ipv6=0 net.ipv4.conf.all.forwarding=1 net.ipv6.conf.all.forwarding=1" -p 8080:80 --dns=127.0.0.1 -it --volume /dev/ttyACM0:/dev/ttyACM0 --privileged openthread/otbr --radio-url spinel+hdlc+uart:///dev/ttyACM0

(only did the necessary changes for my USB port and baudrate)

After creating the container and setting up the Thread network (forming and starting it), I get the advertisement logs shown in the next snippet:

Here is a snippet of the logs from the docker container running OTBR (It looks like the advertisement is ok)

```
Mar 14 09:50:10 9a5213ad09fa otbr-agent[191]: 00:06:49.335 [I] BorderRouter--: Discovering infraif NAT64 prefix
Mar 14 09:50:10 9a5213ad09fa otbr-agent[191]: 00:06:49.335 [I] BorderRouter--: NAT64 prefix timer scheduled in 300 seconds
Mar 14 09:50:10 9a5213ad09fa otbr-agent[191]: 00:06:49.336 [I] Platform------: Handling host address response for ipv4only.arpa
Mar 14 09:50:10 9a5213ad09fa otbr-agent[191]: 00:06:49.336 [I] BorderRouter--: Infraif NAT64 prefix: none
Mar 14 09:50:10 9a5213ad09fa otbr-agent[191]: 00:06:49.337 [I] BorderRouter--: Start evaluating routing policy, scheduled in 3354 milliseconds
Mar 14 09:50:13 9a5213ad09fa otbr-agent[191]: 00:06:52.691 [I] BorderRouter--: Evaluating routing policy
Mar 14 09:50:13 9a5213ad09fa otbr-agent[191]: 00:06:52.691 [I] BorderRouter--: Evaluating NAT64 prefix
Mar 14 09:50:13 9a5213ad09fa otbr-agent[191]: 00:06:52.692 [I] BorderRouter--: RouterAdvert: Added PIO for fd77:1cb6:583c:ff1::/64 (valid=1800, preferred=1800)
Mar 14 09:50:13 9a5213ad09fa otbr-agent[191]: 00:06:52.692 [I] BorderRouter--: RouterAdvert: Added RIO for fd3a:60df:275a:1::/64 (lifetime=1800)
Mar 14 09:50:13 9a5213ad09fa otbr-agent[191]: 00:06:52.693 [I] BorderRouter--: Sent Router Advertisement on infra netif 5
Mar 14 09:50:13 9a5213ad09fa otbr-agent[191]: 00:06:52.693 [I] BorderRouter--: Start evaluating routing policy, scheduled in 302467 milliseconds
Mar 14 09:50:13 9a5213ad09fa otbr-agent[191]: 00:06:52.693 [I] BorderRouter--: Received Router Advertisement from fe80:0:0:0:42:acff:fe11:2 on infra netif 5
Mar 14 09:50:23 9a5213ad09fa otbr-agent[191]: 00:07:02.402 [I] Mle-----------: Send Advertisement (ff02:0:0:0:0:0:0:1)
Mar 14 09:50:23 9a5213ad09fa otbr-agent[191]: 00:07:02.415 [I] MeshForwarder-: Sent IPv6 UDP msg, len:90, chksum:d09c, ecn:no, to:0xffff, sec:no, prio:net
Mar 14 09:50:23 9a5213ad09fa otbr-agent[191]: 00:07:02.415 [I] MeshForwarder-: src:[fe80:0:0:0:a8ef:5ce9:de34:9a06]:19788
Mar 14 09:50:23 9a5213ad09fa otbr-agent[191]: 00:07:02.415 [I] MeshForwarder-: dst:[ff02:0:0:0:0:0:0:1]:19788
Mar 14 09:50:50 9a5213ad09fa otbr-agent[191]: 00:07:29.132 [I] Mle-----------: Send Advertisement (ff02:0:0:0:0:0:0:1)
Mar 14 09:50:50 9a5213ad09fa otbr-agent[191]: 00:07:29.144 [I] MeshForwarder-: Sent IPv6 UDP msg, len:90, chksum:e0c8, ecn:no, to:0xffff, sec:no, prio:net
Mar 14 09:50:50 9a5213ad09fa otbr-agent[191]: 00:07:29.144 [I] MeshForwarder-: src:[fe80:0:0:0:a8ef:5ce9:de34:9a06]:19788
Mar 14 09:50:50 9a5213ad09fa otbr-agent[191]: 00:07:29.144 [I] MeshForwarder-: dst:[ff02:0:0:0:0:0:0:1]:19788
```

But I can't ping from the host machine (the Raspberry Pi) to an OMR IP address on the Thread network. I first suspected something was wrong with my OTBR because commissioning with chip-tool (from Matter) failed with the "Thread Network is unreachable" error. When I then tried to ping, I realized the Thread IPv6 network was unreachable from the host.

To Reproduce

Just follow the standard tutorial using the setup described above.

Expected behavior

I expected to be able to ping from the host machine to the OMR address of the Thread network (wpan) inside the Docker container.

Console/log output

Here is the route table from OTBR:

```
# ip -6 route show
fd19:c442:63c5:fe3b::/64 dev wpan0 proto kernel metric 256 pref medium
fd3a:60df:275a:1::/64 dev wpan0 proto kernel metric 256 pref medium
fd77:1cb6:583c:ff1::/64 dev eth0 proto kernel metric 256 expires 1736sec pref medium
fe80::/64 dev eth0 proto kernel metric 256 pref medium
fe80::/64 dev wpan0 proto kernel metric 256 pref medium
```

Here is the route table from the host (the Raspberry Pi running the Docker container and chip-tool):

```
pi@raspberrypi:~ $ ip -6 route show
::1 dev lo proto kernel metric 256 pref medium
fe80::/64 dev eth0 proto kernel metric 256 pref medium
fe80::/64 dev wlan0 proto kernel metric 256 pref medium
fe80::/64 dev veth48467df proto kernel metric 256 pref medium
fe80::/64 dev docker0 proto kernel metric 256 pref medium
```

Here are the ot-ctl network details from OTBR:

```
# sudo ot-ctl netdata show
Prefixes:
fd3a:60df:275a:1::/64 paos low 4c00
Routes:
fd3a:60df:275a:2:0:0::/96 sn low 4c00
fd77:1cb6:583c:ff1::/64 s med 4c00
Services:
44970 01 6a000500000e10 s 4c00
44970 5d fd19c44263c5fe3b0d5432a3ea0c525dd11f s 4c00
Done
# sudo ot-ctl ipaddr
fd19:c442:63c5:fe3b:0:ff:fe00:fc11
fd3a:60df:275a:1:377a:bd47:532d:a383
fd19:c442:63c5:fe3b:0:ff:fe00:fc10
fd19:c442:63c5:fe3b:0:ff:fe00:fc38
fd19:c442:63c5:fe3b:0:ff:fe00:fc00
fd19:c442:63c5:fe3b:0:ff:fe00:4c00
fd19:c442:63c5:fe3b:d54:32a3:ea0c:525d
fe80:0:0:0:a8ef:5ce9:de34:9a06
Done
```
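Comparing the two route tables shows the gap: the OTBR container has a route for the OMR prefix fd3a:60df:275a:1::/64, while the host has none. A minimal shell sketch of that check follows. `has_route_for` is a hypothetical helper name, and it only looks for an exact prefix entry or a default route; it does not do real longest-prefix matching.

```shell
# Succeeds if the `ip -6 route show` output on stdin contains an entry for
# the exact prefix in $1, or a default route. Illustrative only: a shorter
# covering prefix (for example a /48) is not detected.
has_route_for() {
    awk -v p="$1" '$1 == p || $1 == "default" { found = 1 } END { exit !found }'
}

# Example, run on the host:
#   ip -6 route show | has_route_for fd3a:60df:275a:1::/64 \
#       && echo "route present" || echo "no route to OMR prefix"
```

On a live system, `ip -6 route get <address>` asks the kernel directly which route it would use and is the more reliable check.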

Additional context

One thing that I noticed later in the OTBR logs during initialization, and I am not sure whether it has any influence:

```
+ ip6tables -C FORWARD -o wpan0 -j OTBR_FORWARD_INGRESS
ip6tables v1.6.1: Couldn't load target `OTBR_FORWARD_INGRESS':No such file or directory
Try `ip6tables -h' or 'ip6tables --help' for more information.
+ ip6tables -L OTBR_FORWARD_INGRESS
ip6tables: No chain/target/match by that name.
+ ipset_destroy_if_exist otbr-ingress-deny-src
+ ipset list otbr-ingress-deny-src
ipset v6.34: Kernel support protocol versions 6-7 while userspace supports protocol versions 6-6
The set with the given name does not exist
+ ipset_destroy_if_exist otbr-ingress-deny-src-swap
+ ipset list otbr-ingress-deny-src-swap
ipset v6.34: Kernel support protocol versions 6-7 while userspace supports protocol versions 6-6
The set with the given name does not exist
+ ipset_destroy_if_exist otbr-ingress-allow-dst
+ ipset list otbr-ingress-allow-dst
ipset v6.34: Kernel support protocol versions 6-7 while userspace supports protocol versions 6-6
The set with the given name does not exist
+ ipset_destroy_if_exist otbr-ingress-allow-dst-swap
+ ipset list otbr-ingress-allow-dst-swap
ipset v6.34: Kernel support protocol versions 6-7 while userspace supports protocol versions 6-6
The set with the given name does not exist
+ ipset create -exist otbr-ingress-deny-src hash:net family inet6
+ ipset create -exist otbr-ingress-deny-src-swap hash:net family inet6
+ ipset create -exist otbr-ingress-allow-dst hash:net family inet6
+ ipset create -exist otbr-ingress-allow-dst-swap hash:net family inet6
+ ip6tables -N OTBR_FORWARD_INGRESS
+ ip6tables -I FORWARD 1 -o wpan0 -j OTBR_FORWARD_INGRESS
+ ip6tables -A OTBR_FORWARD_INGRESS -m pkttype --pkt-type unicast -i wpan0 -j DROP
+ ip6tables -A OTBR_FORWARD_INGRESS -m set --match-set otbr-ingress-deny-src src -j DROP
+ ip6tables -A OTBR_FORWARD_INGRESS -m set --match-set otbr-ingress-allow-dst dst -j ACCEPT
+ ip6tables -A OTBR_FORWARD_INGRESS -m pkttype --pkt-type unicast -j DROP
+ ip6tables -A OTBR_FORWARD_INGRESS -j ACCEPT
```

Questions that I have for now

  1. Do I need to do some extra network configuration on the Raspberry Pi host or in the Docker container?
  2. How can we check exactly where the issue is, on the router side or on the host side? (How to check whether the RAs are arriving properly on the Docker interface? How to check whether the Raspberry Pi is receiving them, or potentially ignoring or blocking them?)
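
One host-side check relevant to question 2, assuming a Linux host: per-interface sysctls decide whether a received RA is processed at all. With forwarding enabled, `accept_ra=1` means RAs are ignored (`accept_ra=2` overrides this), and a Route Information Option for a /64 is only installed when `accept_ra_rt_info_max_plen` is at least 64. The sketch below reads those values. `ra_status` and its optional second argument are hypothetical, and nothing in this thread establishes that these settings were the cause in this setup.

```shell
# Report whether the kernel will process Router Advertisements, and the /64
# Route Information Options inside them, on interface $1.
# $2 is an optional root of the sysctl tree (defaults to /proc/sys); it exists
# only so the function can be exercised against a fake directory.
ra_status() {
    dir="${2:-/proc/sys}/net/ipv6/conf/$1"
    [ -d "$dir" ] || { echo "no such interface: $1" >&2; return 1; }
    fwd=$(cat "$dir/forwarding")
    ra=$(cat "$dir/accept_ra")
    # The file is absent on kernels built without route-information support.
    plen=$(cat "$dir/accept_ra_rt_info_max_plen" 2>/dev/null || echo 0)
    if [ "$ra" -eq 0 ] || { [ "$ra" -eq 1 ] && [ "$fwd" -ne 0 ]; }; then
        echo "RAs ignored on $1 (accept_ra=$ra, forwarding=$fwd)"
    elif [ "$plen" -lt 64 ]; then
        echo "RAs accepted on $1, but /64 route options ignored (accept_ra_rt_info_max_plen=$plen)"
    else
        echo "RAs and /64 route options accepted on $1"
    fi
}

# Example: ra_status docker0; ra_status wlan0
```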
caipiblack commented 1 year ago

Hi

Just a recap:

I reproduced the problem, and I can confirm that with your command the router advertisement packets don't make it out of the Docker container.

But I found a way to make it work: the idea is to force Docker to use host networking.

  1. Execute this on the computer running Docker:

    sudo sysctl -w net.ipv6.conf.all.disable_ipv6=0
    sudo sysctl -w net.ipv4.conf.all.forwarding=1
    sudo sysctl -w net.ipv6.conf.all.forwarding=1

  2. Run the docker command with the --net=host option, like this. The container then has access to the network interfaces of the computer:

    docker run --net=host --dns=127.0.0.1 -it --volume /dev/ttyACM0:/dev/ttyACM0 --privileged openthread/otbr --radio-url spinel+hdlc+uart:///dev/ttyACM0 -B enp3s0

    Note 1: Replace enp3s0 with the name of the interface on the computer running the OTBR container, e.g. wlan0.
    Note 2: With this command, some docker parameters are no longer required (such as -p and --sysctl).

  3. Start the Thread network from the web page of the computer running OTBR.

  4. The router advertisements are now transmitted on the network.

I don't know whether this is the best option, but this way the router advertisements reach the local network when using the OTBR Docker image.
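
The workaround above can be wrapped in a small helper so the serial device and the infrastructure interface are not hard-coded. This is only a sketch: `otbr_run_cmd` is a hypothetical name, the default values are assumptions, and the function prints the command instead of running it so it can be reviewed first.

```shell
# Print (do not run) the host-networking `docker run` command from the steps
# above. $1: RCP serial device, $2: infrastructure interface passed to -B.
otbr_run_cmd() {
    dev=${1:-/dev/ttyACM0}   # assumed default, matches the thread
    infra=${2:-eth0}         # assumed default, adjust to your host
    # Warn on stderr if the interface is not present on this machine.
    [ -e "/sys/class/net/$infra" ] || echo "warning: interface $infra not found" >&2
    printf 'docker run --net=host --dns=127.0.0.1 -it --volume %s:%s --privileged openthread/otbr --radio-url spinel+hdlc+uart://%s -B %s\n' \
        "$dev" "$dev" "$dev" "$infra"
}

# Example: otbr_run_cmd /dev/ttyACM0 wlan0
```

The three sysctl settings from step 1 still have to be applied on the host beforehand, since `--sysctl` is no longer passed to the container.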

caipiblack commented 1 year ago

Here are some RA packets from otbr docker:

Frame 5116: 86 bytes on wire (688 bits), 86 bytes captured (688 bits) on interface enp3s0, id 0
Ethernet II, Src: Giga-Byt_d5:18:27 (40:8d:5c:d5:18:27), Dst: IPv6mcast_01 (33:33:00:00:00:01)
Internet Protocol Version 6, Src: fe80::428d:5cff:fed5:1827, Dst: ff02::1
Internet Control Message Protocol v6
    Type: Router Advertisement (134)
    Code: 0
    Checksum: 0x6958 [correct]
    [Checksum Status: Good]
    Cur hop limit: 0
    Flags: 0x00, Prf (Default Router Preference): Medium
        0... .... = Managed address configuration: Not set
        .0.. .... = Other configuration: Not set
        ..0. .... = Home Agent: Not set
        ...0 0... = Prf (Default Router Preference): Medium (0)
        .... .0.. = Proxy: Not set
        .... ..0. = Reserved: 0
    Router lifetime (s): 0
    Reachable time (ms): 0
    Retrans timer (ms): 0
    ICMPv6 Option (Route Information : Medium fd11:22::/64)
        Type: Route Information (24)
        Length: 2 (16 bytes)
        Prefix Length: 64
        Flag: 0x00, Route Preference: Medium
        Route Lifetime: 1800
        Prefix: fd11:22::
Frame 14122: 118 bytes on wire (944 bits), 118 bytes captured (944 bits) on interface enp3s0, id 0
Ethernet II, Src: Giga-Byt_d5:18:27 (40:8d:5c:d5:18:27), Dst: IPv6mcast_01 (33:33:00:00:00:01)
Internet Protocol Version 6, Src: fe80::428d:5cff:fed5:1827, Dst: ff02::1
Internet Control Message Protocol v6
    Type: Router Advertisement (134)
    Code: 0
    Checksum: 0xd5fc [correct]
    [Checksum Status: Good]
    Cur hop limit: 0
    Flags: 0x00, Prf (Default Router Preference): Medium
        0... .... = Managed address configuration: Not set
        .0.. .... = Other configuration: Not set
        ..0. .... = Home Agent: Not set
        ...0 0... = Prf (Default Router Preference): Medium (0)
        .... .0.. = Proxy: Not set
        .... ..0. = Reserved: 0
    Router lifetime (s): 0
    Reachable time (ms): 0
    Retrans timer (ms): 0
    ICMPv6 Option (Prefix information : fd11:1111:1122:2222::/64)
        Type: Prefix information (3)
        Length: 4 (32 bytes)
        Prefix Length: 64
        Flag: 0xc0, On-link flag(L), Autonomous address-configuration flag(A)
        Valid Lifetime: 1800
        Preferred Lifetime: 1800
        Reserved
        Prefix: fd11:1111:1122:2222::
    ICMPv6 Option (Route Information : Medium fd11:22::/64)
        Type: Route Information (24)
        Length: 2 (16 bytes)
        Prefix Length: 64
        Flag: 0x00, Route Preference: Medium
        Route Lifetime: 1800
        Prefix: fd11:22::

My setup:

[gigabyte brix running otbr docker] --------- [ Ethernet switch ] -------[ Computer ]

The ICMPv6 RAs are correctly received by the computer.

saramonteiro commented 1 year ago

@caipiblack Thanks! It also worked for me, and I could commission the device with Matter and send commands via chip-tool 😃 I just noticed this log from the commissioning phase:

    [1678965251.244586][2293:2298] CHIP:DIS: Warning: Attempt to mDNS broadcast failed on docker0: ../../examples/chip-tool/third_party/connectedhomeip/src/inet/UDPEndPointImplSockets.cpp:409: OS Error 0x02000065: Network is unreachable

It looks like a different issue with service announcement, but it didn't prevent me from testing the network. BTW, is that output from Wireshark or from some command? Could you share the command if so? I got this output using tcpdump:

$ sudo tcpdump -n -i wlp0s20f3 icmp6 and ip6[40] == 134
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on wlp0s20f3, link-type EN10MB (Ethernet), capture size 262144 bytes
12:30:50.202243 IP6 fe80::ab75:d680:128a:966b > ff02::1: ICMP6, router advertisement, length 64
12:31:01.053321 IP6 fe80::ab75:d680:128a:966b > ff02::1: ICMP6, router advertisement, length 64
12:32:04.168062 IP6 fe80::e695:6eff:fe43:10ea > ff02::1: ICMP6, router advertisement, length 120
12:35:50.168540 IP6 fe80::e695:6eff:fe43:10ea > ff02::1: ICMP6, router advertisement, length 120
12:36:01.805651 IP6 fe80::ab75:d680:128a:966b > ff02::1: ICMP6, router advertisement, length 64
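
When several routers advertise on the same link, as in the capture above, it helps to group the RAs by sender. A small sketch, assuming tcpdump's default one-line output (no -v); `ra_senders` is a hypothetical helper:

```shell
# Count Router Advertisements per source address in tcpdump's one-line
# output read from stdin; the busiest sender is printed first.
ra_senders() {
    awk '/router advertisement/ { n[$3]++ } END { for (s in n) print n[s], s }' | sort -rn
}

# Live use. The filter is quoted so the shell does not treat the brackets
# as a glob pattern; -c 20 stops after 20 matching packets.
#   sudo tcpdump -n -c 20 -i wlan0 'icmp6 and ip6[40] == 134' | ra_senders
```

Comparing the listed link-local addresses with the one on the border router's infrastructure interface shows which RAs come from OTBR.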
Vyrastas commented 1 year ago

Hi there, thanks for bringing this up. The doc you referenced (https://openthread.io/guides/border-router/docker/run) is actually open-sourced here:

https://github.com/openthread/ot-docs/blob/main/site/en/guides/border-router/docker/run.md

If that needs to be updated, @saramonteiro would you be able to update it so that it works based on your findings? Once updated there we'll import the changes to openthread.io.

caipiblack commented 1 year ago

@saramonteiro: The warnings you showed are not very important. The Matter stack sends mDNS requests on every interface, and the warnings just report a failure to transmit the packet on some of them. (This can happen for multiple reasons: failure to transmit multicast, IPv6, etc.) In your case you don't need to care about a failure on the docker0 interface.

Yes, my logs come from Wireshark.

saramonteiro commented 1 year ago

@Vyrastas done at https://github.com/openthread/ot-docs/pull/113! @caipiblack thanks for the support! Closing issue.