netenglabs / suzieq

Using network observability to operate and design healthier networks
https://www.stardustsystems.net/
Apache License 2.0
775 stars 105 forks

[Feature]: Kubernetes overlay tracking with CNIs like Cilium, Flannel, Calico #923

Open jmessenger51 opened 4 months ago

jmessenger51 commented 4 months ago

Suzieq version

0.22.0

Install Type

hand deployed python

Feature type

Extend sq-poller

Use case

Containerized environments are increasingly prevalent. It would be nice if suzieq could gather container information from Kubernetes to provide end-to-end visibility down the entire stack. What I see is a container attached to a Container Network Interface (CNI) plugin (Cilium, Flannel, etc.) that builds a Geneve or VXLAN tunnel between the compute nodes and may or may not use host routing. Some Kubernetes CNIs are L2 from the switching fabric into the host, with the host holding a VTEP for the containers; others use BGP for ECMP out of the host to the switching fabric.

Proposed functionality/solution

Suzieq should poll Kubernetes either via the Kubernetes REST API or via kubectl commands.

Specific kubectl command examples: kubectl get pods -n kube-system will show the pods in the kube-system namespace.

kubectl describe pods --namespace <namespace> will show details about each pod in that namespace: container ID, state, etc.

kubectl get pod -o wide shows each pod's name, state, status, age, IP address information, and the node (host) it resides on. A similar command, kubectl get pods --all-namespaces -o wide, shows the same for every namespace.

kubectl get service --all-namespaces shows all services with their namespace, cluster IP, external IP (if there is one), exposed ports, etc.
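As a rough sketch of how a poller could consume these commands, the snippet below asks kubectl for JSON (`-o json`) instead of the table output and flattens it into one record per pod. The function names `parse_pods` and `fetch_pods` are hypothetical and not part of suzieq; the snippet assumes kubectl is on the PATH with a working kubeconfig.

```python
import json
import subprocess


def parse_pods(kubectl_json: str) -> list[dict]:
    """Flatten `kubectl get pods -o json` output into one record per pod."""
    records = []
    for item in json.loads(kubectl_json).get("items", []):
        meta = item.get("metadata", {})
        spec = item.get("spec", {})
        status = item.get("status", {})
        records.append({
            "namespace": meta.get("namespace"),
            "pod": meta.get("name"),
            "node": spec.get("nodeName"),     # host the pod is scheduled on
            "podIP": status.get("podIP"),
            "hostIP": status.get("hostIP"),
            "phase": status.get("phase"),     # e.g. Running, Pending
        })
    return records


def fetch_pods() -> list[dict]:
    """Run kubectl (assumed to be installed and configured) and parse it."""
    out = subprocess.run(
        ["kubectl", "get", "pods", "--all-namespaces", "-o", "json"],
        check=True, capture_output=True, text=True,
    ).stdout
    return parse_pods(out)
```

Keeping the parsing separate from the command execution means the same parser could be fed output collected over SSH or from a file.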

If suzieq has SSH access to the Linux host, then Linux commands can be used to help complete the Kubernetes picture.
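One possible host-side command is `ip -d -j link show`, which prints interface details as JSON. The sketch below picks the VXLAN and Geneve interfaces out of that output. The function name `parse_tunnels` is hypothetical, and the JSON key names (`linkinfo`, `info_kind`, `info_data`, `id`, `local`, `port`) are my understanding of what iproute2 emits; they should be checked against the iproute2 version on the target hosts.

```python
import json


def parse_tunnels(ip_link_json: str) -> list[dict]:
    """Pick VXLAN/Geneve interfaces out of `ip -d -j link show` output."""
    tunnels = []
    for link in json.loads(ip_link_json):
        info = link.get("linkinfo", {})
        kind = info.get("info_kind")
        if kind not in ("vxlan", "geneve"):
            continue  # physical NICs, bridges, veths, etc.
        data = info.get("info_data", {})
        tunnels.append({
            "ifname": link.get("ifname"),
            "kind": kind,
            "vni": data.get("id"),
            "local": data.get("local"),  # may be absent on some tunnels
            "port": data.get("port"),
        })
    return tunnels
```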

This will allow containers within Kubernetes to be mapped through the host to the host-to-host overlays.
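To illustrate the kind of mapping meant here, the sketch below joins pod records with a per-node tunnel endpoint table to describe the path between two pods. Everything in it is an assumption for illustration: the record shape (`namespace`, `pod`, `node` keys), the `node_vteps` mapping from node name to VTEP address, and the function name `overlay_path`. It also assumes a single overlay tunnel between any two nodes, which will not hold for every CNI.

```python
def overlay_path(src: str, dst: str, pods: list[dict],
                 node_vteps: dict[str, str]) -> list[str]:
    """Describe the hops between two pods given as 'namespace/pod'."""
    by_key = {f"{p['namespace']}/{p['pod']}": p for p in pods}
    src_node = by_key[src]["node"]
    dst_node = by_key[dst]["node"]
    if src_node == dst_node:
        # Same host: traffic stays on the node, no overlay involved.
        return [src, src_node, dst]
    tunnel = f"tunnel {node_vteps[src_node]} -> {node_vteps[dst_node]}"
    return [src, src_node, tunnel, dst_node, dst]
```

For example, with `web` on `node1` and `db` on `node2`, the result lists the source pod, its node, the tunnel between the two node VTEPs, the destination node, and the destination pod.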

External dependencies

kubernetes-client for python - https://github.com/kubernetes-client/python
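A minimal sketch of what the REST API route might look like with this client, here for services. The helper names `service_row` and `fetch_services` are hypothetical; `config.load_kube_config()` assumes a local kubeconfig file (a poller running inside the cluster would use `config.load_incluster_config()` instead).

```python
def service_row(svc) -> dict:
    """Flatten one V1Service-like object into a plain dict."""
    ports = svc.spec.ports or []  # ports can be None on some services
    return {
        "namespace": svc.metadata.namespace,
        "service": svc.metadata.name,
        "type": svc.spec.type,
        "clusterIP": svc.spec.cluster_ip,
        "ports": [f"{p.port}/{p.protocol}" for p in ports],
    }


def fetch_services() -> list[dict]:
    """List services in all namespaces via the Kubernetes API."""
    # Imported lazily so service_row() is usable without the package.
    from kubernetes import client, config

    config.load_kube_config()  # assumption: kubeconfig in the default location
    v1 = client.CoreV1Api()
    services = v1.list_service_for_all_namespaces()
    return [service_row(s) for s in services.items]
```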

Additional Context

ddutt commented 4 months ago

Hi @jmessenger51, thanks for the issue. I've been waiting for someone to ask for this. I can't promise a timeline, but I will certainly look into it.