vmware-archive / octant

Highly extensible platform for developers to better understand the complexity of Kubernetes clusters.
https://octant.dev
Apache License 2.0
6.28k stars 483 forks

Visualize network policies or network topology graphs #430

Open bryanl opened 4 years ago

bryanl commented 4 years ago

Todo: is this even possible?

wwitzel3 commented 4 years ago

We could use the graphviz component to do this?

moderation commented 4 years ago

I'd settle for just listing the network policies in a table, with the ability to drill into the next level. A bonus would be doing the same for Calico policy with networkpolicy.projectcalico.org, but I expect that will be a plugin.

kubectl -o wide get netpol
NAME                     POD-SELECTOR         AGE
envoy                    app=envoy            4h26m
findhello-internal       app=findhello        18h
findhello-prom           app=findhello        18h
grafana                  app=grafana          18h
helloworld               app=helloworld       18h
helloworld-prom          app=helloworld       18h
jaeger-ingest            app=jaeger           18h
jaeger-ui                app=jaeger           18h
prometheus               app=prometheus       18h
statsdexporter-collect   app=statsdexporter   18h
statsdexporter-prom      app=statsdexporter   18h

jayunit100 commented 4 years ago

Yay !

Network policies are really hard to reason about, especially because they are additive.

OK, so I was tabulating a lot of these today and thinking how much I'd love to be able to see this in a live cluster. These relationships could theoretically be generated without even needing a probe of any sort.

But given that CNI providers might be broken in one area or another, a probe that validates these types of relationships (i.e. at the namespace level) would be another extension of this.

Roughly, here's what I look for in policies:

Example

pod-port   connect   policy      destination
cc80       y         allowCC80   mystrictpod
cc81       y         allowcc81   mystrictpod
cc81       n         deny-all    mystrictpod

Note that network policies are additive, so the naive version of this plugin would just tabulate something like this.
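As a rough illustration of that naive tabulation, here is a minimal Python sketch. The `Policy` class is a hypothetical, heavily simplified stand-in for a NetworkPolicy (ingress ports only, one destination pod), not the Kubernetes API type, and `tabulate` is a made-up helper name. It lists every policy against every port, so it is more verbose than the hand-written example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    """Hypothetical, simplified NetworkPolicy: ingress ports for one pod."""
    name: str
    destination: str           # pod the policy selects
    allowed_ports: frozenset   # ports the policy allows; empty = allows nothing


def tabulate(policies, ports):
    """Return one (port, connect, policy, destination) row per port/policy pair."""
    rows = []
    for port in ports:
        for p in policies:
            connect = "y" if port in p.allowed_ports else "n"
            rows.append((port, connect, p.name, p.destination))
    return rows


policies = [
    Policy("allowCC80", "mystrictpod", frozenset({"cc80"})),
    Policy("allowcc81", "mystrictpod", frozenset({"cc81"})),
    Policy("deny-all", "mystrictpod", frozenset()),
]

for row in tabulate(policies, ["cc80", "cc81"]):
    print("{:<8} {:<7} {:<10} {}".format(*row))
```

This only reports what each policy says on its own; it makes no attempt to combine them into a final verdict.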

Doing the math

A more advanced version of this could actually do the algebra and determine which policies are in effect, which would be RAD! This is similar to how CRD relationships are reverse engineered in Octant.
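A sketch of what "doing the algebra" might look like for ingress, in Python. It relies on the additive rule: a pod that no policy selects is open, and a selected pod accepts a connection if at least one selecting policy allows it. The `Policy` class and `effective_ingress` function are hypothetical names for illustration; source peers, namespaces, egress, and policyTypes are all ignored here.

```python
from dataclasses import dataclass


@dataclass
class Policy:
    """Hypothetical, simplified NetworkPolicy (ingress ports only)."""
    name: str
    pod_selector: dict         # labels a pod must carry; {} selects every pod
    allowed_ports: frozenset   # ingress ports allowed; empty = allows nothing


def selects(policy, pod_labels):
    """True if every label in the policy's selector is present on the pod."""
    return all(pod_labels.get(k) == v for k, v in policy.pod_selector.items())


def effective_ingress(policies, pod_labels, port):
    """Return (allowed, names of the policies that allow the port)."""
    selecting = [p for p in policies if selects(p, pod_labels)]
    if not selecting:
        return True, []        # no policy selects the pod, so it is not isolated
    allowing = [p.name for p in selecting if port in p.allowed_ports]
    return bool(allowing), allowing


policies = [
    Policy("allow80", {"app": "mystrictpod"}, frozenset({80})),
    Policy("deny-all", {}, frozenset()),
]

print(effective_ingress(policies, {"app": "mystrictpod"}, 80))
print(effective_ingress(policies, {"app": "mystrictpod"}, 81))
```

Returning the names of the allowing policies alongside the verdict is a design choice: it keeps the "which policy is responsible" answer that makes the table useful for debugging.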

One more note: Policies change over time. Having a watch of some sort so that you could see what policies were added and removed over time would be even cooler.

Sorry for the rambling, would like to chat about this possibly.

Musing on timeseries

I suppose if Octant makes a set of CRDs, I might start by adding a "NetworkPolicyView" object that watches all policies by namespace and all pods, and recalculates this stuff in the background periodically. Of course, this might miss the interesting time series issues (e.g. at time 0 these policies were in effect, at time 1 another set of policies was in effect).

The reason I like time series here is that when investigating how things change while running networkPolicy compliance tests, I'd like to be able to see a list of all the policies that were ever created, even if they were eventually deleted. That is obviously not a common use case.
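A minimal Python sketch of the time series idea, assuming something (for example a watch on NetworkPolicy objects) feeds add and delete events with timestamps. `PolicyHistory` and its method names are hypothetical; nothing here talks to a cluster.

```python
class PolicyHistory:
    """Records when policies were added and deleted (hypothetical helper)."""

    def __init__(self):
        # policy name -> list of [created_at, deleted_at or None]
        self._spans = {}

    def added(self, name, t):
        self._spans.setdefault(name, []).append([t, None])

    def deleted(self, name, t):
        spans = self._spans.get(name, [])
        if spans and spans[-1][1] is None:
            spans[-1][1] = t   # close the currently open lifetime

    def in_effect(self, t):
        """Names of policies that existed at time t."""
        return sorted(
            name
            for name, spans in self._spans.items()
            if any(start <= t and (end is None or t < end) for start, end in spans)
        )

    def ever_created(self):
        """Names of all policies ever seen, including deleted ones."""
        return sorted(self._spans)


history = PolicyHistory()
history.added("deny-all", 0)
history.added("allowCC80", 1)
history.deleted("deny-all", 2)

print(history.in_effect(1))
print(history.ever_created())
```

Because events are recorded as they arrive rather than recalculated periodically, a policy that is created and deleted between two recalculations still shows up in `ever_created`.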

jayunit100 commented 4 years ago

I've been working on this as part of looking at network_policy.go in general. Here's how I'm visualizing them.

[Screenshot: Screen Shot 2020-01-30 at 3 37 35 PM]

bryanl commented 4 years ago

@jayunit100 It would be helpful if there was an API that could generate this data, so the Octant developers would not have to dig too deeply into this tech.

jayunit100 commented 4 years ago

cc @sedefsavas

sedefsavas commented 4 years ago

We can show two things: the network policies themselves, and a visualization of traffic that highlights what is blocked by network policies. The first is relatively straightforward. The second either needs traffic data collected in a CNI-specific way (e.g., flow info can be collected from OVS for Antrea), or needs probes that sample incoming and outgoing traffic, match it with Pods/Services, and detect any traffic that is missing.

YanzhaoLi commented 4 years ago

@sedefsavas Hi, could you please help me understand the probes that sample the traffic: coming into what, and going out from what?

sedefsavas commented 4 years ago

Having daemons listen on the interfaces and take samples of incoming/outgoing traffic may help to show which pods are talking with each other. On second thought, I am not sure of the value of showing this in terms of network policies, because having no traffic between 2 pods does not mean they cannot communicate.

One good addition to https://github.com/vmware-tanzu/octant/pull/813 would be a reachability matrix that shows which pods/services are allowed to talk with each other given all the network policies in the network. That way, for debugging purposes, users can manually check the matrix and see if any unexpected pod-to-pod communication is allowed.
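A rough Python sketch of such a reachability matrix, computed from policy definitions alone with no traffic sampling. The `Policy` class and the `reachability` function are hypothetical and ingress-only. The pod names come from the listing earlier in this thread, but the rule contents (who may talk to helloworld) are invented for the example.

```python
from dataclasses import dataclass


@dataclass
class Policy:
    """Hypothetical, simplified NetworkPolicy (ingress sources only)."""
    name: str
    pod_selector: dict     # labels selecting the destination pods
    from_selectors: list   # label dicts for allowed sources; [] = allows none


def matches(selector, labels):
    """True if every label in the selector is present on the pod."""
    return all(labels.get(k) == v for k, v in selector.items())


def reachability(pods, policies):
    """pods: {name: labels}. Returns {src: {dst: allowed}} for ingress."""
    matrix = {}
    for src, src_labels in pods.items():
        matrix[src] = {}
        for dst, dst_labels in pods.items():
            if src == dst:
                continue
            selecting = [p for p in policies if matches(p.pod_selector, dst_labels)]
            # Additive rule: an unselected pod is open; a selected pod accepts
            # the source if any selecting policy lists a matching peer.
            matrix[src][dst] = not selecting or any(
                matches(sel, src_labels)
                for p in selecting
                for sel in p.from_selectors
            )
    return matrix


pods = {
    "envoy": {"app": "envoy"},
    "helloworld": {"app": "helloworld"},
    "grafana": {"app": "grafana"},
}
# Assumed rule for illustration: only envoy may reach helloworld.
policies = [Policy("helloworld", {"app": "helloworld"}, [{"app": "envoy"}])]

for src, row in reachability(pods, policies).items():
    print(src, row)
```

A real version would also need namespaces, ports, egress rules, and services, so treat this as the shape of the computation rather than a complete answer.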