openthread / ot-br-posix

OpenThread Border Router, a Thread border router for POSIX-based platforms.
https://openthread.io/
BSD 3-Clause "New" or "Revised" License

1.2 DUA routing with docker containers #1053

Closed suveshpratapa closed 2 years ago

suveshpratapa commented 3 years ago

When we set up two OTBRs from scratch (non-docker) and configure them as BBRs for two different meshes, we are able to demonstrate 1.2 DUA connectivity between the two meshes.

However, we cannot demonstrate the same with two OTBR docker containers. Can someone please review our setup and provide some advice?

On both BBRs:

sudo docker network create --ipv6 --subnet fd11:db8:1::/64 -o com.docker.network.bridge.name=otbr0 otbr-network

docker run -d --name "otbr" --network otbr-network --sysctl "net.ipv6.conf.all.disable_ipv6=0 net.ipv4.conf.all.forwarding=1 net.ipv6.conf.all.forwarding=1" -p 8080:80 --dns=127.0.0.1 -it --volume /dev/ttyACM0:/dev/ttyACM0 -v /dev/bus/usb:/dev/bus/usb --privileged otbr-test:latest --radio-url spinel+hdlc+uart:///dev/ttyACM0

This is a test setup, so we are manually configuring routes:

BBR1: sudo ip -6 route add fd00:7d03:7d03:7d03::/64 via <ll-addr-of-BBR2-eth0> dev eth0

BBR2: sudo ip -6 route add fd00:7d03:7d03:7d03::/64 via <ll-addr-of-BBR1-eth0> dev eth0

Doing the above was enough for the non-docker case, but clearly we're missing some additional configuration in the docker case. How do we inform the docker bridge network about the routing rules for the DUA prefix? Is there additional docker configuration we could be missing?

suveshpratapa commented 3 years ago

@jwhui I'm working with @JaneFromSilabs on testing 1.2 DUA with docker containers, and we're stuck, so looking for advice in this area. We suspect it's a routing rule complication we're missing with docker containers, but can't get a clear answer online.

simonlingoogle commented 3 years ago

One hint is that the DUA prefix fd00:7d03:7d03:7d03::/64 should not be added as routes.

Instead, fd00:7d03:7d03:7d03::/64 should be advertised as an on-link prefix, usually by radvd.

For example, we are using radvd in GitHub Actions to test Thread 1.2 DUA features:

# cat /etc/radvd.conf
interface eth0
{
    AdvSendAdvert on;

    MinRtrAdvInterval 3;
    MaxRtrAdvInterval 30;
    AdvDefaultPreference low;

    prefix fd00:7d03:7d03:7d03::/64
    {
        AdvOnLink on;
        AdvAutonomous off;
        AdvRouterAddr off;
    };
};

Please verify if radvd can solve the issue. Thanks. @suveshpratapa
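As a minimal sketch of the same configuration, the helper below prints an equivalent radvd.conf for a given interface and prefix. The function name render_radvd_conf is hypothetical and not part of ot-br-posix; the option values are copied from the file above.

```shell
#!/bin/sh
# Hypothetical helper: print a radvd.conf that advertises PREFIX as an
# on-link prefix (no autonomous address configuration) on IFACE.
render_radvd_conf() {
    iface=$1
    prefix=$2
    cat <<EOF
interface ${iface}
{
    AdvSendAdvert on;

    MinRtrAdvInterval 3;
    MaxRtrAdvInterval 30;
    AdvDefaultPreference low;

    prefix ${prefix}
    {
        AdvOnLink on;
        AdvAutonomous off;
        AdvRouterAddr off;
    };
};
EOF
}
```

For example, `render_radvd_conf eth0 fd00:7d03:7d03:7d03::/64 | sudo tee /etc/radvd.conf` would write the file shown above. Restarting the radvd service afterwards is assumed to be necessary for the change to take effect.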

suveshpratapa commented 3 years ago

@simonlingoogle Hi, thank you for that suggestion. While this seems fine in theory for an OTBR we install directly, it doesn't seem to work with an OTBR running inside a docker container.

Is this a configuration you have tried out before?

Some info:

simonlingoogle commented 3 years ago

Our OTBR container is configured with a custom virtual bridge otbr0. We don't pass --net=host for our containers.

We are using the same architecture for Backbone Router tests in GitHub Actions (backbone-router job in otbr.yml). So, I think the configuration works:

The configuration works well in GitHub Actions (e.g. https://github.com/openthread/openthread/runs/3965146582?check_suite_focus=true)

We pass REFERENCE_DEVICE when building our container, so it is installing radvd (https://github.com/openthread/ot-br-posix/blob/main/script/bootstrap#L89), however, we don't see it running.

REFERENCE_DEVICE=1 will install radvd; however, it is not automatically restarted. radvd is only used by Host devices in the Backbone Router tests, where it advertises the Domain Prefix.

JaneFromSilabs commented 3 years ago

@simonlingoogle Thanks for the information. I have a couple more questions about your configuration if you don't mind.

  1. When you are setting up your otbr to run inside a docker container, are you configuring radvd to operate on the eth0, backbone0 or both?
  2. Did you need to add any static routes at all?

simonlingoogle commented 3 years ago

@simonlingoogle Thanks for the information. I have a couple more questions about your configuration if you don't mind.

  1. When you are setting up your otbr to run inside a docker container, are you configuring radvd to operate on the eth0, backbone0 or both?

Note that backbone0 shows up as eth0 inside the OTBR and Host containers, because they use backbone0 as their network. So radvd on the Host should use eth0 as the advertising interface.

  2. Did you need to add any static routes at all?

No. ot-br-posix has a DUA routing manager that sets up policy routes for the DUA prefix automatically.

https://github.com/openthread/ot-br-posix/blob/a6a95abb5c187931124a087e9578b7448ca8c95f/src/backbone_router/dua_routing_manager.cpp#L70-L74

https://github.com/openthread/ot-br-posix/blob/a6a95abb5c187931124a087e9578b7448ca8c95f/src/backbone_router/dua_routing_manager.cpp#L82-L88

suveshpratapa commented 3 years ago

@simonlingoogle

Sorry, but we're still a little confused on how this should work. It works fine with our BBRs set up manually, but not in docker containers. We strongly suspect it's a networking configuration issue with docker, but we're hitting dead-ends with everything we tried. Please take a look at the following configuration:

These are the steps we follow on two different Raspberry Pis (our BBRs). The steps are the same on both, except for the Thread networks that we create at the end. Both use the same DUA prefix on the backbone interface, but they are on different PANs and use different network keys to keep the meshes isolated for testing.

  1. radvd setup and running, advertising DUA prefix fd00:7d03:7d03:7d03::/64 on eth0, which is our backbone interface for testing (radvd.conf set up as above, and service running)

  2. Created docker bridge for testing: docker network create --ipv6 --driver=bridge --subnet fd11:db8:1::/64 -o com.docker.network.bridge.name=otbr0 otbr-network

  3. Started default 1.2 BBR docker container using this bridge:

    docker run -d --name "otbr" \
        --sysctl "net.ipv6.conf.all.disable_ipv6=0 net.ipv4.conf.all.forwarding=1 net.ipv6.conf.all.forwarding=1" \
        --network otbr-network \
        -p 8080:80 --dns=127.0.0.1 -it \
        --volume /dev/ttyACM0:/dev/ttyACM0 \
        --privileged otbr:latest \
        --radio-url spinel+hdlc+uart:///dev/ttyACM0 \
        --backbone-interface eth0
  4. Inspect the IPv6 address assigned to our running container:

    docker network inspect otbr-network | grep IPv6Address
    "IPv6Address": "fd11:db8:1::2/64"
  5. Using address in step 4, set up docker container route for packets with DUA prefix (fd00:7d03:7d03:7d03::/64): sudo ip -6 route add fd00:7d03:7d03:7d03::/64 via fd11:db8:1::2 dev otbr0 proto static metric 1

  6. On OTBR container, create a network advertising a DUA prefix:

    channel 11
    panid 0xFACE
    networkkey 00112233445566778899aabbccddeeff
    prefix add fd00:7d03:7d03:7d03::/64 pasroD med
    ifconfig up
    thread start

After these steps, we cannot demonstrate connectivity between different meshes using DUA addresses.

We tried configuring radvd to run inside the container, set up rules for the docker bridge otbr0, etc, to no avail.
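Step 4 above reads the container address by grepping the raw JSON. As a small sketch, the hypothetical helper below (container_ipv6_addrs is not a Docker or OTBR command) strips the JSON syntax and the prefix length, so the bare address can be reused in later commands. It assumes the `"IPv6Address": "<addr>/<len>"` layout shown in step 4.

```shell
#!/bin/sh
# Hypothetical helper: read `docker network inspect` output on stdin and
# print each non-empty container IPv6 address without its prefix length.
container_ipv6_addrs() {
    sed -n 's|.*"IPv6Address": *"\([^"/]*\)/[0-9]*".*|\1|p'
}
```

Usage would be `docker network inspect otbr-network | container_ipv6_addrs`. For the output line shown in step 4, this prints fd11:db8:1::2.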

simonlingoogle commented 3 years ago

I suspect that the issue is due to a wrong docker image.

The otbr:latest image does not have complete Thread 1.2 features because it's basically for testing Thread 1.3 features. Please check whether the otbr:reference-device image solves your issue.

  5. Using address in step 4, set up docker container route for packets with DUA prefix (fd00:7d03:7d03:7d03::/64): sudo ip -6 route add fd00:7d03:7d03:7d03::/64 via fd11:db8:1::2 dev otbr0 proto static metric 1

Normally we don't need to add routes for DUA because routes should be configured automatically. BTW, I think the network interface within Docker should be eth0?

suveshpratapa commented 3 years ago

@simonlingoogle

The otbr:latest image does not have complete Thread 1.2 features

Just FYI, what I wrote above was a sample command. I'm actually using an OTBR docker container that I built with 1.2 DUA features (including OTBR_DUA_ROUTING and OT_DUA) turned on. This is the container that we're trying to test for 1.2 BBR communication.

BTW, I think the network interface within Docker should be eth0?

My OTBR container is using a docker bridge network otbr0 (created in step 2 above). So I have to perform step 5 (route add) to inform my host that packets with that DUA prefix should be routed to the container.

Next steps:

You mentioned you don't use --net=host for testing your docker containers, correct? If that option is not specified (as in our case), containers use the default docker bridge, which we're overriding with a custom bridge (see https://docs.docker.com/network/network-tutorial-standalone).

Unfortunately been a frustrating ordeal trying to prove that this can be made to work with docker containers. So any pointers in the right direction would help.

simonlingoogle commented 3 years ago

My OTBR container is using a docker bridge network otbr0 (created in step 2 above). So I have to perform step 5 (route add) to inform my host that packets with that DUA prefix should be routed to the container.

I don't think this is necessary. OTBR Docker should set up the routes correctly. You can check this with:

$ ip -6 route list
$ ip -6 route list table openthread
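To make that check scriptable, here is a hedged sketch. The hypothetical helper route_lists_prefix reads `ip -6 route` output on standard input and succeeds only if some route has the given prefix as its destination. This thread does not say which table holds the DUA route, so it may be worth checking both listings.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the `ip -6 route` output on stdin contains
# a route whose destination (first field) equals PREFIX exactly.
route_lists_prefix() {
    prefix=$1
    awk -v p="$prefix" '$1 == p { found = 1 } END { exit !found }'
}
```

For example, `ip -6 route list table openthread | route_lists_prefix fd00:7d03:7d03:7d03::/64 && echo "DUA route present"`. If the routes live inside the container rather than on the host, the same pipeline can be fed from `docker exec otbr ip -6 route list table openthread`, assuming the container is named otbr as in the commands earlier in this thread.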

You should run another Docker container on the same Docker network, which runs radvd and advertises the DUA prefix.
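A rough sketch of what starting such a container could look like follows. Everything specific here is an assumption rather than the setup used in otbr.yml: the container name radvd-host, the image (any image with radvd installed), the added capabilities, and the config path are all hypothetical. The helper only prints the command so it can be reviewed before being run.

```shell
#!/bin/sh
# Hypothetical helper: print (do not run) a docker command that starts a
# radvd container attached to NETWORK, using IMAGE and the host file CONF.
# Assumptions: IMAGE has radvd installed; NET_ADMIN/NET_RAW are sufficient
# for radvd to send Router Advertisements from inside the container.
radvd_container_cmd() {
    network=$1
    image=$2
    conf=$3
    printf '%s\n' "docker run -d --name radvd-host --network $network --sysctl net.ipv6.conf.all.disable_ipv6=0 --sysctl net.ipv6.conf.all.forwarding=1 --cap-add NET_ADMIN --cap-add NET_RAW -v $conf:/etc/radvd.conf:ro $image radvd -n -C /etc/radvd.conf"
}
```

For instance, `radvd_container_cmd otbr-network my-radvd:latest /etc/radvd.conf` prints a command for the otbr-network bridge created earlier in this thread, where my-radvd:latest stands in for whatever radvd-capable image is available. The mounted radvd.conf would name eth0 as the interface, since that is the interface name inside the container.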

Unfortunately been a frustrating ordeal trying to prove that this can be made to work with docker containers. So any pointers in the right direction would help.

Since we are running DUA tests in the backbone-router job of otbr.yml using OTBR Docker containers, I think it should work if you have the exact same settings.

suveshpratapa commented 3 years ago

@simonlingoogle Thank you! We're going to take a step back and revisit our container setup. As I said, we are able to test that this works on a barebones install, so we're also hoping it's a configuration misstep.

jwhui commented 2 years ago

Closing stale issue.