sflow / vpp-sflow

sFlow plugin for VPP
Apache License 2.0

ratio should be per thread+interface, not per thread #3

Open pimvanpelt opened 2 months ago

pimvanpelt commented 2 months ago

Currently, the sampling ratio is a global setting for the plugin.

The main concern is that a VPP instance may have both 1G and 100G interfaces, or interfaces of the same port speed with very different traffic levels (e.g. one 10G port loaded with only 1-5Mbit and another carrying 8Gbit).

But I think this might become problematic for a more fundamental reason:

Assume Interface A wants 1:1000 and Interface B also wants 1:1000

Problem 1: If we have two interfaces, each with 1 RX queue, we may place them on the same worker:

set int rx-placement A worker 1
set int rx-placement B worker 1

Now the same sflow node on worker 1 will be sampling packets from the vector at 1:1000, and some of those packets will be in=A while others will be in=B. Result: neither A nor B will have a sampling rate of 1:1000.

Problem 2: If we have one interface with 2 RX queues, we may place these on different workers:

set int rx-placement A queue 0 worker 1
set int rx-placement A queue 1 worker 2

Now the interface will have sampling of 1:1000 on worker1 and 1:1000 on worker2, and will effectively sample 1:500 packets from interface A.

Is it a problem if the effective rate differs from the configured rate? It can be up to 10x lower in the case of many interfaces on one thread, or overshoot by 10x in the case of many threads servicing one interface.

pimvanpelt commented 2 months ago

Incidentally, having a per-interface config (which might be NULL if the interface is not enabled for sflow) will also allow us to avoid the problem described in #2 and only insert the node at most once on the feature arc.

sflow commented 2 months ago

Yes. Per-interface sampling rates are a desirable enhancement. Not too hard to add later. It just means that each worker thread will need its own per-interface sampling state. Every thread will see its own share of the packets on an interface X. The same packet will not be seen by two threads (right?). If they all sample interface X packets at the same 1:N ratio then the combined effect that is presented to hsflowd via PSAMPLE will be 1:N. hsflowd will not need to know how many worker threads were involved. The bank of worker-samplers will look like one.
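To make the "bank of worker-samplers looks like one" argument concrete, here is a rough Python simulation. It is only a sketch: simulate_workers is a made-up name, and it assumes each worker makes an independent random 1:N decision per packet, which may not match how the plugin actually picks packets.

```python
import random


def simulate_workers(packets_per_worker, n, seed=0):
    """Simulate workers that each sample their own share of interface X at 1:n.

    Hypothetical model: every packet is sampled with probability 1/n,
    independently on each worker. No packet is seen by two workers.
    """
    rng = random.Random(seed)
    total_packets = 0
    total_samples = 0
    for packets in packets_per_worker:
        total_packets += packets
        # Each worker only ever looks at its own share of the traffic.
        total_samples += sum(1 for _ in range(packets) if rng.random() < 1.0 / n)
    return total_packets, total_samples


# Interface X split unevenly across two workers, both configured for 1:1000.
packets, samples = simulate_workers([600_000, 400_000], n=1000)
print(f"{samples} samples from {packets} packets, "
      f"effective ratio 1:{packets / samples:.0f}")
```

Under that model the combined ratio stays close to 1:1000 however the packets are divided between the workers, because the expected sample count is just the total packet count divided by N.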

hsflowd allows sampling-N to be set in a number of ways. The default is actually a function of the interface speed in bps:

N = ifspeed/1e6

So that a 1Gbps port is sampled at 1:1000, and a 10Gbps port is sampled at 1:10000.
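As a quick illustration of that default, the rule can be written as a one-line function. This is a sketch in Python, not hsflowd code: the name default_sampling_n is made up, and clamping to a minimum of 1 for links slower than 1Mbps is an assumption added here.

```python
def default_sampling_n(ifspeed_bps):
    """Return N for 1:N sampling using the N = ifspeed/1e6 rule.

    The floor of 1 for very slow links is an assumption of this sketch.
    """
    return max(1, ifspeed_bps // 1_000_000)


for speed in (1_000_000_000, 10_000_000_000, 100_000_000_000):
    print(f"{speed} bps -> 1:{default_sampling_n(speed)}")
```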

You can turn the above off. You can override N for all interfaces of a given speed. You can override N for an individual interface.

The advantage of this flexibility is that you can often get away with having the exact same config for every hsflowd instance in your network, which makes configuration via Puppet/Chef/Kubernetes/DNS-SD nice and easy. But we'll need to think about the relationship between hsflowd and vpp. For example, we might have hsflowd's mod_vpp learn the interfaces (and their speeds) and set up sampling using vppctl commands. If we do it that way then all we need from the vpp CLI is a command to set the sampling-rate for a given interface.

I think that would also work for the way that VyOS configures hsflowd.

So I'm not sure if the vpp-sflow CLI needs to support elaborate per-speed settings or wildcards. Thoughts?

sflow commented 2 months ago

One concern about per-interface sampling rates is that a frame of packets is not necessarily all from the same interface. So in a simplistic implementation a worker-thread would have to look at every packet in the frame to see what interface it was on before it could apply the sampling...

... a possible solution to that is to quantize the allowable sampling rates so that they have to be multiples of, say, 1000. Then the workers can apply an initial random 1:1000 sampling followed by a per-interface sub-sampling factor to determine if they really are going to sample this packet and write it to the FIFO. This adds branching, machine-instructions and a per-interface array of sub-sampling counters to the worker thread, but I guess if it is divided by 1000 it might be OK?
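Roughly what that two-stage idea could look like, sketched in Python rather than in the plugin's C. Everything here is hypothetical (TwoStageSampler, BASE_RATE, the ifindex keys); it assumes a random 1:1000 first stage and a simple per-interface countdown for the sub-sampling.

```python
import random

BASE_RATE = 1000  # assumed quantum: every configured rate is a multiple of this


class TwoStageSampler:
    """Hypothetical per-worker sampler: random 1:BASE_RATE, then sub-sample."""

    def __init__(self, rates, seed=0):
        for ifindex, n in rates.items():
            if n <= 0 or n % BASE_RATE != 0:
                raise ValueError(f"rate {n} for interface {ifindex} "
                                 f"is not a multiple of {BASE_RATE}")
        # Sub-sampling factor per interface, e.g. 1:5000 -> keep every 5th hit.
        self.factor = {ifindex: n // BASE_RATE for ifindex, n in rates.items()}
        self.countdown = dict(self.factor)
        self.rng = random.Random(seed)

    def take(self, ifindex):
        # Stage 1: cheap test that does not depend on the interface.
        if self.rng.random() >= 1.0 / BASE_RATE:
            return False
        # Stage 2: only now consult the per-interface state.
        if ifindex not in self.factor:
            return False  # interface not enabled for sampling
        self.countdown[ifindex] -= 1
        if self.countdown[ifindex] > 0:
            return False
        self.countdown[ifindex] = self.factor[ifindex]
        return True


# Interface 1 at 1:1000 and interface 2 at 1:5000, interleaved in one stream.
sampler = TwoStageSampler({1: 1000, 2: 5000})
counts = {1: 0, 2: 0}
for i in range(1_000_000):
    ifindex = 1 if i % 2 == 0 else 2
    if sampler.take(ifindex):
        counts[ifindex] += 1
print(counts)
```

The per-interface lookup and counter update only run for about one packet in a thousand, which is the point of dividing by 1000 first.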

sflow commented 2 months ago

It may be worth the trouble to write code to adjust the sub-sampling dynamically. So if we compute the highest-common-factor (HCF) of the current sampling rates then that can be the random sampling rate that the worker-threads apply first, and the others can be sub-sampled from that feed. That way we can still be flexible and allow any setting. Someone testing in the lab can use 1-in-10 for interface A and 1-in-20 for interface B and we will compute HCF=10, with sub-sampling A=1, B=2. Then in a more real-world scenario someone using 1-in-100K for interface A and 1-in-500K for interface B would result in HCF = 100K and sub-sampling A=1, B=5. That would maximize efficiency.
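The HCF step is small enough to sketch directly. This Python version is only an illustration with made-up names (subsampling_plan, and the "A"/"B" keys); it uses math.gcd to find the shared first-stage rate and integer division for the per-interface sub-sampling factors.

```python
from functools import reduce
from math import gcd


def subsampling_plan(rates):
    """Return (base_rate, factors) for a dict of interface -> N.

    base_rate is the highest common factor of all configured rates;
    factors[interface] is how many base-rate hits make one real sample.
    """
    if not rates:
        raise ValueError("no interfaces configured")
    base = reduce(gcd, rates.values())
    return base, {name: n // base for name, n in rates.items()}


print(subsampling_plan({"A": 10, "B": 20}))            # lab example
print(subsampling_plan({"A": 100_000, "B": 500_000}))  # real-world example
```

The plan would need to be recomputed whenever an interface's rate is added, changed or removed.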

Of course if someone tries to sample at 1-in-1000 for interface A and 1-in-1001 for interface B then HCF=1 and it falls apart, so some constraints might be appropriate to protect users (e.g. "only two significant figures allowed"). Perhaps it will be obvious what to do when we get around to coding it.
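One possible shape for the "two significant figures" constraint, sketched in Python. The helper name round_to_sig_figs is made up, and rounding the requested value (rather than rejecting it) is an assumption of this sketch; either behaviour would fit the idea above.

```python
from math import gcd


def round_to_sig_figs(n, figs=2):
    """Round a positive integer rate to at most `figs` significant figures."""
    if n <= 0:
        raise ValueError("sampling rate must be positive")
    digits = len(str(n))
    if digits <= figs:
        return n
    scale = 10 ** (digits - figs)
    # Integer round-half-up, so no floating point is involved.
    return ((n + scale // 2) // scale) * scale


a, b = 1000, 1001
print("raw HCF:", gcd(a, b))
print("constrained HCF:", gcd(round_to_sig_figs(a), round_to_sig_figs(b)))
```

This limits the damage rather than removing it: rates such as 1100 and 1300 both pass the constraint but still only share a factor of 100.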