networkop / cx

Containerised Cumulus VX

host container does not apply ip during start #1

Closed 7oku closed 3 years ago

7oku commented 3 years ago

Hi,

I'm trying to deploy the small sample lab with sw1 and h1 from https://github.com/srl-labs/containerlab/blob/master/lab-examples/cvx02/topo.clab.yml.

The host container does not set its IP address at startup.

7oku@cumuluslab:~/cumulus$ sudo containerlab deploy --topo topo.yml
INFO[0000] Parsing & checking topology file: topo.yml
INFO[0000] Creating lab directory: /home/7oku/cumulus/clab-lab
INFO[0000] Creating docker network: Name='clab', IPv4Subnet='172.20.20.0/24', IPv6Subnet='2001:172:20:20::/64', MTU='1500'
INFO[0000] Creating container: sw1
INFO[0000] Creating container: h1
INFO[0014] Creating virtual wire: sw1:swp12 <--> h1:eth1
INFO[0015] Writing /etc/hosts file
+---+--------------+--------------+-------------------------+-------+-------+---------+----------------+----------------------+
| # |     Name     | Container ID |          Image          | Kind  | Group |  State  |  IPv4 Address  |     IPv6 Address     |
+---+--------------+--------------+-------------------------+-------+-------+---------+----------------+----------------------+
| 1 | clab-lab-h1  | 004028e93f34 | networkop/host:ifreload | linux |       | running | 172.20.20.3/24 | 2001:172:20:20::3/64 |
| 2 | clab-lab-sw1 | 24e81f33d650 | networkop/cx:4.4.0      | cvx   |       | running | 172.20.20.2/24 | 2001:172:20:20::2/64 |
+---+--------------+--------------+-------------------------+-------+-------+---------+----------------+----------------------+
7oku@cumuluslab:~/cumulus$ docker exec -ti 004 ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
36: eth0@if37: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
    link/ether 02:42:ac:14:14:03 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 172.20.20.3/24 brd 172.20.20.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 2001:172:20:20::3/64 scope global nodad
       valid_lft forever preferred_lft forever
    inet6 fe80::42:acff:fe14:1403/64 scope link
       valid_lft forever preferred_lft forever
38: eth1@if39: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9500 qdisc noqueue state UP group default
    link/ether aa:c1:ab:b8:e3:d2 brd ff:ff:ff:ff:ff:ff link-netnsid 1

After running the entrypoint again manually, the IP address is set correctly:

7oku@cumuluslab:~/cumulus$ docker exec -ti 004 /entrypoint.sh
Waiting for 1 interfaces to be connected
Connected all interfaces
warning: syslogs: [Errno 2] No such file or directory
warning: netlink: ip link show: netlink: cannot get ifname for index 37: operation failed with 'No such device' (19)
warning: netlink: ip link show: netlink: cannot get ifname for index 39: operation failed with 'No such device' (19)
7oku@cumuluslab:~/cumulus$ docker exec -ti 004 ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
36: eth0@if37: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
    link/ether 02:42:ac:14:14:03 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 172.20.20.3/24 brd 172.20.20.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 2001:172:20:20::3/64 scope global nodad
       valid_lft forever preferred_lft forever
    inet6 fe80::42:acff:fe14:1403/64 scope link
       valid_lft forever preferred_lft forever
38: eth1@if39: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
    link/ether aa:c1:ab:b8:e3:d2 brd ff:ff:ff:ff:ff:ff link-netnsid 1
    inet 12.12.12.2/24 scope global eth1
       valid_lft forever preferred_lft forever

I'm not sure exactly what the issue is, but I assume entrypoint.sh waits for a given number of interfaces (one), and that count is already satisfied by eth0, so it does not wait for eth1 to be connected before ifreload is triggered.

7oku commented 3 years ago

I can confirm omitting eth0 in the for loop works as expected.

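# Count the interfaces currently present, ignoring the eth0 management interface.
int_calc ()
{
    index=0
    # List eth*/ens*/eno* interfaces in natural order, dropping eth0.
    for i in $(ls -1v /sys/class/net/ | grep 'eth\|ens\|eno' | grep -v eth0); do
      index=$((index + 1))
    done

    # Expose the count as MYINT.
    MYINT=$index
}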

I would consider eth0 to be always available and configured as a mgmt interface (as in Cumulus Linux), reachable through the docker net. So would there be any reason not to ignore it in the startup check?

networkop commented 3 years ago

Ah, I see. I think I know why this happens. Have a look at the entrypoint. This image accepts an optional integer argument telling it how many interfaces it should expect to be connected before doing ifreload -a. See this example.

So I guess if you add an extra cmd: 2 field to the topology definition, it should work. Can you confirm?

This is why simply ignoring eth0 wouldn't work, because the default for that argument is 1 which means one extra interface.

I think I just need to update the topology definition.
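
For readers following along, here is a minimal sketch of the kind of wait loop being discussed. It is not the actual entrypoint.sh from the image: the function names `count_ifaces` and `wait_for_ifaces` are hypothetical, and the sketch assumes that eth0 is included in the count, which is what the behaviour reported above suggests.

```shell
#!/bin/sh
# Hypothetical sketch; NOT the real /entrypoint.sh shipped in the image.

# Count interfaces named eth*, ens* or eno* under a sysfs-style directory
# (default: /sys/class/net). Assumption: eth0 is included in the count.
count_ifaces() {
    dir=${1:-/sys/class/net}
    n=0
    for path in "$dir"/eth* "$dir"/ens* "$dir"/eno*; do
        # Unmatched globs stay literal, so check that the entry exists.
        if [ -e "$path" ]; then
            n=$((n + 1))
        fi
    done
    echo "$n"
}

# Block until at least $1 interfaces are present, polling once per second.
# The expected count defaults to 1 when no argument is given.
wait_for_ifaces() {
    expected=${1:-1}
    dir=${2:-/sys/class/net}
    echo "Waiting for $expected interfaces to be connected"
    while [ "$(count_ifaces "$dir")" -lt "$expected" ]; do
        sleep 1
    done
    echo "Connected all interfaces"
}
```

Under these assumptions, `wait_for_ifaces 1` returns immediately when only eth0 exists, whereas `wait_for_ifaces 2` keeps polling until eth1 has been attached.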

7oku commented 3 years ago

Awesome, that alternative approach works, as long as you keep the integer in sync with the actual configured interfaces.

Would it be a good idea to have entrypoint.sh parse /etc/network/interfaces as well, to determine the expected number of interfaces automatically?

Not a big deal, just trying to remove the manual intervention at a second place.
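
As a rough illustration of that idea, the sketch below counts the interfaces declared in an ifupdown-style file. The helper name `count_eni_ifaces` is hypothetical, and the sketch assumes every expected interface has an `iface` stanza and that lo and eth0 should be skipped; it is not taken from the image's entrypoint.

```shell
#!/bin/sh
# Hypothetical helper: count interfaces that have an "iface" stanza in an
# ifupdown-style file, skipping lo and eth0 (assumed to be the mgmt interface).
count_eni_ifaces() {
    file=${1:-/etc/network/interfaces}
    # An interface may have several stanzas (e.g. inet and inet6),
    # so collect the names in an array and count each one once.
    awk '$1 == "iface" && $2 != "lo" && $2 != "eth0" { seen[$2] = 1 }
         END { n = 0; for (name in seen) n++; print n }' "$file"
}
```

The result could then be used as the expected interface count instead of a hand-maintained argument, for example `expected=$(count_eni_ifaces)`.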

networkop commented 3 years ago

great! I think you could potentially look into e/n/i, but that would assume that all interfaces are listed there, i.e. you can't define just a subset, because you're inevitably gonna miss some number of interfaces this way.

another option is to create a new kind for host:ifreload and, since clab knows exactly how many interfaces are expected to be connected, it can pass this number in the cmd automatically. Should be a fairly small PR.

wdyt about either option?

7oku commented 3 years ago

in the first option you could turn it the other way saying each interface in e/n/i needs to be counted in the cmd argument, so if you miss one, it's gonna fail as well.

I'm new to containerlab (coming from CTD, looking for a docker approach and found your webpage), so I cannot estimate whether a new kind in containerlab is worth it. However, I just found that there are already other linux kinds that don't need the interface count specified explicitly. To me it seems the best fit would be to do it the same way and go with option 2.

networkop commented 3 years ago

ok, so the PR in containerlab is merged, and the two latest commits c8e000c and e6ab064 combine your suggestion with containerlab's metadata. So now the logic is as follows:

networkop commented 3 years ago

tested and it seems to be working. so I'll close the issue for now, feel free to re-open if you still see it.