srl-labs / containerlab

container-based networking labs
https://containerlab.dev
BSD 3-Clause "New" or "Revised" License

Declarative definition of a link delay / jitter / packet loss #1398

Open jbemmel opened 1 year ago

jbemmel commented 1 year ago

When emulating network topologies with varying paths, it could be useful to have clab configure variable delay/jitter/packet loss/etc.

    br-clab:
      kind: bridge
      latency: 5-10ms
      jitter: 100ms
      packet-loss: 1%
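A rough sketch of how the attributes above could map onto a single netem qdisc per interface (the helper function, interface name, and values are illustrative; the command is echoed rather than executed, since tc requires root and a real interface):

```shell
#!/bin/sh
# Hypothetical mapping of the declarative attributes to one tc/netem call.
# The helper only prints the command; drop the echo to run it for real.
to_tc() { # args: interface delay jitter loss
  echo tc qdisc replace dev "$1" root netem delay "$2" "$3" loss "$4"
}
to_tc br-clab 10ms 100ms 1%
# prints: tc qdisc replace dev br-clab root netem delay 10ms 100ms loss 1%
```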
hellt commented 1 year ago

Would you be willing to try https://github.com/srl-labs/containerlab/issues/442 ?

It should be more powerful and flexible in what can be done. Pack it in a bash script that calls pumba after lab deployment, and it can be what you're after
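For illustration, a post-deploy wrapper along those lines might be as small as this (the container name and values are made up; the pumba call is echoed so the sketch runs without pumba installed - drop the echo to apply it for real):

```shell
#!/bin/sh
# Hypothetical post-deploy impairment script built around pumba's netem
# subcommand: 100ms delay with 30ms jitter on eth1 of one lab container.
echo pumba netem --duration 5m --interface eth1 \
  delay --time 100 --jitter 30 \
  clab-mylab-r1
```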

jbemmel commented 1 year ago

Too complicated imho - YAT (Yet Another Tool). I was thinking more of a simple one-line tc invocation (as shown in that thread), feeding it the various parameters

Of course one could simply invoke tc manually - that's what I witnessed someone doing during a demo today. But what I'm after is making it part of the lab definition (for WAN scenarios)

hellt commented 1 year ago

For reference: a Python project, tcconfig, that interfaces with tc - https://twitter.com/danieldibswe/status/1673584203802329089

https://tcconfig.readthedocs.io/en/latest/
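For comparison, a tcset invocation roughly matching the attributes requested above might look like this (interface and values are illustrative; --delay-distro is tcset's jitter knob; the command is echoed here since tcset needs root to touch real interfaces):

```shell
#!/bin/sh
# Hypothetical tcconfig equivalent of the requested delay/jitter/loss knobs.
# Remove the echo to run the real tcset (pip install tcconfig) as root.
echo tcset eth1 --delay 10ms --delay-distro 2ms --loss 1%
```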

steiler commented 1 year ago

https://pkg.go.dev/github.com/florianl/go-tc This works well. I have a test implemented that sets delay, jitter, and loss in a network namespace on a specific interface.
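For readers without Go at hand, the behavior such a test exercises is roughly equivalent to these commands (namespace and interface names are illustrative; the commands are printed rather than executed, because ip/tc need root):

```shell
#!/bin/sh
# Sketch: netem delay/jitter/loss on one interface inside a netns, roughly
# what the go-tc test sets up. run() prints; swap in eval "$@" to execute.
run() { echo "$@"; }
run ip netns add impair-test
run ip link add veth0 type veth peer name veth1
run ip link set veth1 netns impair-test
run ip netns exec impair-test \
  tc qdisc add dev veth1 root netem delay 100ms 20ms loss 1%
```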

hellt commented 1 year ago

I am still not entirely sold on embedding tc control in the clab definition. Most of the time (I think) you don't just use tc once; you want to enable/disable impairments. This means control for tc needs to be implemented, e.g. enable/disable delay.

At that point I am hesitant to see a huge benefit in adding extra complexity, whereas a shell script with pumba/netem to control tc parameters can easily do that.

We don't add iperf to clab, because iperf exists. I have the same feeling for tc

steiler commented 1 year ago

Yes, that's my thinking as well: you need multiple different kinds of scenarios. But I think we should continue along the lines of "declarative infrastructure", so these link or endpoint attributes should also be defined declaratively.

My suggestion is the following:

links:
    - type: veth
      mtu: 1500
      endpoints:
      - node:          srl1
        interface:     ethernet-1/1
        tc-configs:
            "deploy":
                  delay: 500
                  jitter: 600
            "DelayJitterScenarioTwo":
                  delay: 500
                  jitter: 600
      - node:        srl2
        interface:    ethernet-1/1
        tc-configs:
            "DelayJitterScenarioOne":
                  delay: 500
                  jitter: 600
            "DelayJitterScenarioTwo":
                  delay: 500
                  jitter: 600

So you can define tc-configs under the endpoints of a link, which allows for multiple different declarations of tc configurations. In the deploy phase, a certain scenario, let's call it "deploy", will be applied (see endpoint srl1 - ethernet-1/1). If no "deploy" scenario is configured, fine: no specific delay/jitter/loss settings are configured.

Then what you would have is a separate containerlab command that also takes the topology file and a scenario name, e.g. "DelayJitterScenarioTwo", and every endpoint with this tc-config scenario would be configured accordingly. Every other endpoint is left alone. It would probably be good to also have an implicit "reset" scenario that purges all the tc config. The post-deploy state could also be deployed again via the scenario name "deploy".

Altogether, that would be declarative infra par excellence.

hellt commented 1 year ago

This is a very very niche use case. The declarativeness of tc configs doesn't make a big difference. What if you had a script in your repo that goes like this:

tc.sh scenario-1
tc.sh scenario-2

in this script you do whatever you need with tools that are designed to work with tc, without adding more deps and code to clab for a niche use case. Yes, it is declarative, but in this case that is by design. You would nevertheless want to interact with tc during the lab - hence introducing interactivity.
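A minimal sketch of such a tc.sh, assuming docker-based nodes and made-up container/interface names and values (commands are echoed by default; invoke with RUN= set to empty to actually execute them):

```shell
#!/bin/sh
# Hypothetical scenario dispatcher: ./tc.sh <scenario-name>
RUN="${RUN:-echo}"   # RUN= ./tc.sh scenario-1  -> actually execute
tc_scenario() {
  case "$1" in
    scenario-1) $RUN docker exec clab-lab-r1 tc qdisc replace dev eth1 root netem delay 10ms 2ms ;;
    scenario-2) $RUN docker exec clab-lab-r1 tc qdisc replace dev eth1 root netem loss 1% ;;
    reset)      $RUN docker exec clab-lab-r1 tc qdisc del dev eth1 root ;;
    *)          echo "unknown scenario: $1" >&2; return 1 ;;
  esac
}
tc_scenario "${1:-scenario-1}"
```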

jbemmel commented 1 year ago

I still think there is value in having a baseline set of tc parameters applied to links to emulate more realistic network behavior: a WAN link with 10ms delay, a wireless link with 1% packet loss - that kind of thing

I don't think it would add a lot of complexity/code - just a few lines to read attributes and call the right 'tc' commands

steiler commented 1 year ago

PR exists: https://github.com/srl-labs/containerlab/pull/1453

hellt commented 1 year ago

I know this is not exactly what was requested, but for now this is going to be the way to declaratively define impairments:

name: netem
topology:
  nodes:
    r1:
      kind: linux
      image: alpine:3
      exec:
        - ip addr add 192.168.0.1/30 dev eth1
    r2:
      kind: linux
      image: alpine:3
      exec:
        - ip addr add 192.168.0.2/30 dev eth1
    ixp-net:
      kind: bridge
    host:
      kind: host
      exec:
        - /root/srl-labs/containerlab/bin/containerlab tools netem set -n clab-netem-r1 -i eth2 --delay 100ms --jitter 2ms --loss 10

  links:
    - endpoints: ["r1:eth1", "r2:eth1"]
    - endpoints: ["r1:eth2", "ixp-net:r1-eth1"]

Using exec under the kind=host node, you call the clab tools netem set command, which sets the impairments on a link.

steiler commented 1 year ago

Does it make sense to at least have a $self reference for the containerlab binary? Otherwise, if someone does not stick to the installer or uses a private build, the path to the containerlab binary would not resolve, so it lacks portability.

hellt commented 1 year ago

It can be a magic var like __clabBin__. We have some of them defined for a topology.

Btw, I noticed that no stderr is printed for the host node's exec when the command fails (i.e. when the clab path is not resolved)

sk2 commented 1 year ago

I agree that the common case is to modify the impairments, so having them in the topology doesn't cover all the use cases. Could there be a helper tool that reads the clab runtime info to simplify passing it to the netem command (and ideally one that can run as a daemon to allow scripting via REST)?

hellt commented 1 year ago

@sk2 can you give me an example for these runtime parameters you want to extract? What is the complication you have right now?

steiler commented 11 months ago

How do we now go about this? We have link impairments (https://containerlab.dev/rn/0.44/#link-impairments) via the tools netem command, but that's not declarative. So how to proceed: close this issue, or what are the next steps?