Open cathay4t opened 10 months ago
@cathay4t About the first bullet: could we generalize this with counters per nested interface stack? For example vlan(linux(bond)) -> 3, linux(vlan) -> 4, ovs(vlan) -> 5, something like this?
Initially, I would like to collect data for:
Topologies: vlan, linux-bridge, vlan-over-bridge-over-bond, ovs-bridge, etc.
Features: mac-based-identifier, ovn-mapping. With this we could know the adoption rate of implemented features.
Use cases: move ip from eth to bridge, change dns nameservers, switch from dynamic ip to static. We can use nmstate tier1 test case names for naming these use cases, so we have a clear definition.
Each cluster counts only once per topology/feature/use case, regardless of how many interfaces or NNCPs it has. That way our data is not influenced by a big cluster with 1000+ VLANs or NNCPs.
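The "each cluster counts only once" rule can be sketched as below. This is a minimal illustration, not the project's implementation: the `reports` shape (cluster id mapped to a list of names, one entry per NNCP or interface) and the function name `count_clusters` are assumptions made for the example.

```python
from collections import Counter


def count_clusters(reports):
    """Count how many clusters use each topology/feature/use case.

    `reports` is a hypothetical mapping: cluster id -> list of names,
    possibly with many repeats (one per interface or NNCP).
    """
    totals = Counter()
    for names in reports.values():
        # set() collapses repeats, so a cluster adds at most 1 per name.
        totals.update(set(names))
    return totals


reports = {
    "cluster-a": ["vlan"] * 1000 + ["linux-bridge"],  # big cluster, 1000 VLANs
    "cluster-b": ["vlan", "ovs-bridge"],
}
totals = count_clusters(reports)
print(totals["vlan"])  # 2: the 1000 VLANs in cluster-a still count once
```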
I will create a demo like nmstatectl gen-statistics <desire_state_file> [-c <current_state_file>]:
topologies:
- vlan
- linux-bridge-over-bond-over-sriov-vf
features:
- nm-global-dns
- mac-identifier
- sriov-vf-reference
- ovn-mapping
use-cases:
- edit_static_ipv4_address_and_prefix
- disable_static_ipv6
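One way the topology names in the demo output could be derived from a desired state is sketched below. It is only an illustration under stated assumptions: the function `topology_names` is hypothetical, it reads the nmstate keys `vlan.base-iface`, `link-aggregation.port` and `bridge.port[].name`, it drops `ethernet` from the name, it only names interfaces that nothing else is stacked on, and it ignores special cases such as OVS internal ports and SR-IOV VFs.

```python
def topology_names(desired_state):
    """Return sorted names like 'linux-bridge-over-bond' (hypothetical helper)."""
    ifaces = {i["name"]: i for i in desired_state.get("interfaces", [])}

    def lower_names(iface):
        # Names of the interfaces this interface is stacked on.
        kind = iface.get("type")
        if kind == "vlan":
            base = iface.get("vlan", {}).get("base-iface")
            return [base] if base else []
        if kind == "bond":
            return list(iface.get("link-aggregation", {}).get("port", []))
        if kind in ("linux-bridge", "ovs-bridge"):
            return [p["name"] for p in iface.get("bridge", {}).get("port", [])]
        return []

    def chains(name, seen):
        iface = ifaces.get(name)
        if iface is None or name in seen:  # unknown interface or a loop
            return [[]]
        kind = iface.get("type")
        head = [kind] if kind and kind != "ethernet" else []
        below = [c for low in lower_names(iface) for c in chains(low, seen | {name})]
        return [head + c for c in (below or [[]])]

    referenced = {low for i in ifaces.values() for low in lower_names(i)}
    names = set()
    for name in ifaces:
        if name in referenced:  # only name the top of each stack
            continue
        for chain in chains(name, frozenset()):
            if chain:
                names.add("-over-".join(chain))
    return sorted(names)


state = {
    "interfaces": [
        {"name": "br0", "type": "linux-bridge", "bridge": {"port": [{"name": "bond0"}]}},
        {"name": "bond0", "type": "bond", "link-aggregation": {"port": ["eth1", "eth2"]}},
        {"name": "eth1", "type": "ethernet"},
        {"name": "eth3.100", "type": "vlan", "vlan": {"base-iface": "eth3", "id": 100}},
    ]
}
names = topology_names(state)
```

Using a set for the result keeps the output independent of how many interfaces share the same stack shape.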
Just a note that when I've had conversations about telemetry in the past, I was told the amount of data we're allowed to send for it is extremely limited. Like instead of JSON booleans, we have to use bitmasks where each bit is mapped to a given key. I haven't actually confirmed that myself, but before we come up with a complex set of values that we want to return we may want to confirm that we can actually represent that in the amount of data we're allowed to send.
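If the size limit turns out to be real, the bitmask idea could look roughly like this. It is a sketch only: the `KEYS` list and its order are invented for the example (taken from names used earlier in this thread), and the actual telemetry constraints are unconfirmed.

```python
# Hypothetical key table. Bit positions must stay stable once shipped,
# so new keys may only be appended, never inserted or reordered.
KEYS = [
    "vlan",
    "linux-bridge",
    "ovs-bridge",
    "nm-global-dns",
    "mac-identifier",
    "sriov-vf-reference",
    "ovn-mapping",
]


def encode(present):
    """Pack the set of present keys into one integer, one bit per key."""
    mask = 0
    for bit, key in enumerate(KEYS):
        if key in present:
            mask |= 1 << bit
    return mask


def decode(mask):
    """Recover the list of keys from a bitmask produced by encode()."""
    return [key for bit, key in enumerate(KEYS) if mask & (1 << bit)]


mask = encode({"vlan", "ovn-mapping"})
print(mask)          # 65 (bit 0 + bit 6)
print(decode(mask))  # ['vlan', 'ovn-mapping']
```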
@cybertron @cathay4t this is a PoC to integrate some statistics; from there we can reduce or optimize what we need:
https://github.com/nmstate/kubernetes-nmstate/pull/1210
Looks like the top 10 can be filtered in Prometheus using topk(10, sum(nmstate_apply_topology_total) by (name))
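For readers less familiar with PromQL, the query sums the counter across all series that share the same name label and then keeps the k largest sums. A rough Python equivalent, using made-up sample data and a hypothetical helper `topk_by_name`, might be:

```python
from collections import defaultdict


def topk_by_name(samples, k=10):
    """Mimic topk(k, sum(metric) by (name)).

    `samples` is an iterable of (labels, value) pairs, one per time series.
    """
    sums = defaultdict(float)
    for labels, value in samples:
        sums[labels["name"]] += value  # sum(...) by (name)
    # topk: keep the k largest sums, highest first
    return sorted(sums.items(), key=lambda kv: kv[1], reverse=True)[:k]


# Made-up samples for illustration only.
samples = [
    ({"name": "vlan", "instance": "a"}, 3),
    ({"name": "vlan", "instance": "b"}, 2),
    ({"name": "linux-bridge", "instance": "a"}, 4),
]
top = topk_by_name(samples, k=1)
print(top)  # [('vlan', 5.0)]
```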
We are hoping to get data on
This could help us plan CI coverage and decide which patches to backport.