hellt / vrnetlab

Make VM-based Network OSes run in Containerlab
https://containerlab.dev
MIT License

Usage of num_nics attribute #273

Closed edvgui closed 3 weeks ago

edvgui commented 4 weeks ago

Today I tried deploying a Cisco CSR router with 10 data plane interfaces. Everything went well, except that the 10th interface (GigabitEthernet11) wasn't present in the virtual device after it finished booting.

After some digging, I discovered that this piece of code decided that eth10 was not relevant because it was over self.num_nics.
https://github.com/hellt/vrnetlab/blob/7f99b39edbf811ca498378fee4b4c9393f779458/common/vrnetlab.py#L353-L354

For Cisco CSR, num_nics is always 9. That is the default value of the --nics option, which containerlab doesn't seem to use (and which most router kinds don't seem to have in their CLI options), so it only ever gets its default value. https://github.com/hellt/vrnetlab/blob/7f99b39edbf811ca498378fee4b4c9393f779458/csr/docker/launch.py#L256
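A minimal sketch of the behavior described above, assuming the launcher parses --nics with argparse and then skips any ethN whose index exceeds num_nics. The helper name usable_data_interfaces is hypothetical and is not taken from vrnetlab; only the --nics option and its default of 9 come from this issue.

```python
import argparse


def build_parser():
    """Parser with a --nics option defaulting to 9, as reported for CSR."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--nics", type=int, default=9)
    return parser


def usable_data_interfaces(present, num_nics):
    """Hypothetical helper: keep ethN names with 1 <= N <= num_nics.

    eth0 is treated as the management interface and is never returned.
    """
    kept = []
    for name in present:
        suffix = name[3:] if name.startswith("eth") else ""
        if not suffix.isdigit():
            continue
        if 1 <= int(suffix) <= num_nics:
            kept.append(name)
    return kept


# containerlab passes no --nics, so the default applies.
args = build_parser().parse_args([])
present = [f"eth{i}" for i in range(11)]  # eth0 (mgmt) plus eth1..eth10
kept = usable_data_interfaces(present, args.nics)
dropped = [n for n in present[1:] if n not in kept]
print("kept:", kept)
print("dropped:", dropped)
```

With the default of 9, eth10 is the only data plane interface dropped, which matches the missing GigabitEthernet11 (GigabitEthernet1 maps to eth0).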

I tried changing this default value to 10, to see if the device would accept this additional interface, and to my surprise, it did! This is great. But it got me wondering: What is the purpose of the num_nics attribute?

  1. Is it supposed to represent a vendor limitation on the maximum number of interfaces? Why is it a CLI option for some device kinds then? And why is its current value set too low for Cisco CSR?
  2. Or is it supposed to be a dynamic value, provided by containerlab (maybe something in sync with CLAB_INTFS?)?
  3. Or is it some legacy logic that doesn't need to be used anymore?

It looks like there is something to fix. I am willing to create a PR, but I am not sure what the fix should be, as I don't know the intention behind this attribute. Can I get some help here?

edvgui commented 4 weeks ago

I found a very nasty workaround to be able to use the --nics option with containerlab "out of the box":

diff --git a/files/cisco-csr-2R.clab.yml b/files/cisco-csr-2R.clab.yml
index e5d9769..abbce6d 100644
--- a/files/cisco-csr-2R.clab.yml
+++ b/files/cisco-csr-2R.clab.yml
@@ -13,6 +13,9 @@ topology:
         - "22830:830"
       mgmt-ipv4: 172.20.20.12
       startup-config: ./cisco-csr-evc-2R/east.cfg
+      env:
+        # Workaround https://github.com/hellt/vrnetlab/issues/273
+        USERNAME: admin --nics 32
     router-west:
       kind: vr-csr
       image: code.inmanta.com:4567/solutions/containers/vr-csr:17.03.04
@@ -22,6 +25,9 @@ topology:
         - "24830:830"
       mgmt-ipv4: 172.20.20.14
       startup-config: ./cisco-csr-evc-2R/west.cfg
+      env:
+        # Workaround https://github.com/hellt/vrnetlab/issues/273
+        USERNAME: admin --nics 32

I inject the --nics option inside the username, which will then be converted into the command: https://github.com/srl-labs/containerlab/blob/5e9647beee1134c318f69b5dd24f376ff3604d06/nodes/vr_csr/vr-csr.go#L75-L76

$ docker inspect -f json  clab-cisco-csr-2R-router-east | jq ".[0].Config.Cmd"
[
  "--username",
  "admin",
  "--nics",
  "32",
  "--password",
  "admin",
  "--hostname",
  "router-east",
  "--connection-mode",
  "tc",
  "--trace"
]
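Why the injection works can be sketched as follows. This is an illustration in Python, not containerlab's Go code: it assumes the command is built by substituting environment values into a template string and then splitting the result on whitespace, which is what the Cmd output above suggests. The template text and the build_cmd helper are assumptions.

```python
from string import Template

# Assumed shape of the command template; the flag names are taken from
# the docker inspect output, the template itself is a guess.
CMD_TEMPLATE = Template(
    "--username $USERNAME --password $PASSWORD "
    "--hostname $HOSTNAME --connection-mode $CONNECTION_MODE --trace"
)


def build_cmd(env):
    """Substitute env values, then split on whitespace into argv items."""
    return CMD_TEMPLATE.substitute(env).split()


cmd = build_cmd(
    {
        # The space-separated value smuggles two extra arguments in.
        "USERNAME": "admin --nics 32",
        "PASSWORD": "admin",
        "HOSTNAME": "router-east",
        "CONNECTION_MODE": "tc",
    }
)
print(cmd)
```

Because the split happens after substitution, the single USERNAME value turns into four separate arguments, so the launcher sees --nics 32 as if it had been passed explicitly.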

But a proper fix is definitely preferable ^^

hellt commented 3 weeks ago

Hi. num_nics is indeed supposed to be set to the limit that the virtual product supports.

The reason it was set to 10 is likely that it was like this for older versions, or it was a copy-paste issue.

As Cisco people don't quite govern their systems here, we rely on the community hive mind to maintain the proper settings

If anyone knows the current limits we can bump the default value

edvgui commented 3 weeks ago

I saw that the configuration seems to accept interface numbers from 1 to 32.

router-east#show interfaces GigabitEthernet ?
  <1-32>  GigabitEthernet interface number

I then tried to configure that many interfaces, but it looks like the device only sets up the first 26:

router-east#show interfaces summary

 *: interface is up
 IHQ: pkts in input hold queue     IQD: pkts dropped from input queue
 OHQ: pkts in output hold queue    OQD: pkts dropped from output queue
 RXBS: rx rate (bits/sec)          RXPS: rx rate (pkts/sec)
 TXBS: tx rate (bits/sec)          TXPS: tx rate (pkts/sec)
 TRTL: throttle count

  Interface                   IHQ       IQD       OHQ       OQD      RXBS      RXPS      TXBS      TXPS      TRTL
-----------------------------------------------------------------------------------------------------------------
* GigabitEthernet1              0         0         0         0      3000         3      3000         3         0
* GigabitEthernet2              0         0         0         0         0         0         0         0         0
* GigabitEthernet3              0         0         0         0         0         0         0         0         0
* GigabitEthernet4              0         0         0         0         0         0         0         0         0
  GigabitEthernet5              0         0         0         0         0         0         0         0         0
  GigabitEthernet6              0         0         0         0         0         0         0         0         0
  GigabitEthernet7              0         0         0         0         0         0         0         0         0
  GigabitEthernet8              0         0         0         0         0         0         0         0         0
  GigabitEthernet9              0         0         0         0         0         0         0         0         0
  GigabitEthernet10             0         0         0         0         0         0         0         0         0
* GigabitEthernet11             0         0         0         0         0         0         0         0         0
  GigabitEthernet12             0         0         0         0         0         0         0         0         0
  GigabitEthernet13             0         0         0         0         0         0         0         0         0
  GigabitEthernet14             0         0         0         0         0         0         0         0         0
  GigabitEthernet15             0         0         0         0         0         0         0         0         0
  GigabitEthernet16             0         0         0         0         0         0         0         0         0
  GigabitEthernet17             0         0         0         0         0         0         0         0         0
  GigabitEthernet18             0         0         0         0         0         0         0         0         0
  GigabitEthernet19             0         0         0         0         0         0         0         0         0
  GigabitEthernet20             0         0         0         0         0         0         0         0         0
  GigabitEthernet21             0         0         0         0         0         0         0         0         0
  GigabitEthernet22             0         0         0         0         0         0         0         0         0
  GigabitEthernet23             0         0         0         0         0         0         0         0         0
  GigabitEthernet24             0         0         0         0         0         0         0         0         0
  GigabitEthernet25             0         0         0         0         0         0         0         0         0
  GigabitEthernet26             0         0         0         0         0         0         0         0         0
* Loopback0                     0         0         0         0         0         0         0         0         0

So I guess that the default nics count can be set to 26 (or 25 as the first interface is used by eth0 here)?

At the same time, the router seems to be able to start no matter how many interfaces we ask it to set up; it just drops the ones it can't handle. So does it make sense to use this hard-coded limit in vrnetlab? Maybe a later version of the router (I am currently using csr1000v-universalk9.17.03.04a, I have no idea what else may be out there) will be able to support more, and it would be nice not to need to update vrnetlab again then. Wdyt?
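One way to avoid a hard-coded limit, sketched under assumptions: take the interface count from an explicit --nics value when given, otherwise from the CLAB_INTFS environment variable mentioned earlier, otherwise from a fallback. This assumes CLAB_INTFS holds a plain integer count; the resolve_num_nics helper is hypothetical and is not part of vrnetlab.

```python
import os


def resolve_num_nics(cli_value=None, environ=None, fallback=9):
    """Hypothetical helper: pick the NIC count without a fixed vendor limit.

    Precedence: explicit CLI value, then CLAB_INTFS, then the fallback.
    """
    environ = os.environ if environ is None else environ
    if cli_value is not None:
        return cli_value
    raw = environ.get("CLAB_INTFS", "").strip()
    # Ignore missing, non-numeric, or zero values and use the fallback.
    if raw.isdigit() and int(raw) > 0:
        return int(raw)
    return fallback


print(resolve_num_nics(environ={"CLAB_INTFS": "10"}))
print(resolve_num_nics(cli_value=32, environ={"CLAB_INTFS": "10"}))
print(resolve_num_nics(environ={}))
```

Since the router reportedly drops interfaces it cannot handle, asking for more than it supports would not prevent it from booting under this scheme.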

hellt commented 3 weeks ago

I'm good with raising it to 32 as well. Let's do it