Note: most control plane components talk to the apiserver via the kubernetes service in the default namespace. Keep this in mind when creating any network policy.
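As a concrete illustration, an egress rule toward the apiserver would target the default namespace rather than the apiserver pods. A minimal sketch, reusing the lokomotive.kinvolk.io/name: default namespace label from the policy further below (the policy name and the empty podSelector are assumptions):

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-egress-to-kubernetes-service
  namespace: kube-system
spec:
  # Empty podSelector: applies to all pods in kube-system; narrow per component.
  podSelector: {}
  policyTypes:
  - Egress
  egress:
  - to:
    - namespaceSelector:
        matchLabels:
          lokomotive.kinvolk.io/name: default
    ports:
    # Port of the kubernetes service in the default namespace.
    - port: 443
      protocol: TCP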
Do we need NetworkPolicies for both Ingress and Egress for the control plane workloads? The hosted kubelet and etcd are not workloads but part of the control plane; do we need NetworkPolicies around them as well, and how?
Strategy for creating network policies.
Example: kube-scheduler listens on secure port 10259. We would create two policies: (a) a deny-all policy and (b) a policy that allows any ingress traffic on port 10259, as sketched below.
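A minimal sketch of those two policies, assuming kube-scheduler runs in kube-system with a k8s-app: kube-scheduler label (both the namespace and the label are assumptions):

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: deny-all-ingress
  namespace: kube-system
spec:
  # Empty podSelector: applies to every pod in the namespace.
  podSelector: {}
  policyTypes:
  # Ingress listed with no rules below: all inbound traffic is denied.
  - Ingress
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-scheduler-metrics
  namespace: kube-system
spec:
  podSelector:
    matchLabels:
      k8s-app: kube-scheduler
  policyTypes:
  - Ingress
  ingress:
  # No 'from' clause: any source may connect, but only on port 10259.
  - ports:
    - port: 10259
      protocol: TCP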
Example: work out who needs to talk to which ports and then create allow policies only for those specific applications, e.g. the Prometheus operator needs to reach port 10259 to scrape metrics.
In other words, NetworkPolicies would be application-centric, and each workload that needs to talk to the control plane would create its own NetworkPolicy (perhaps we can define a way for lokoctl to create the NetworkPolicy as part of the deployment).
In continuation of the above, below is one example.
Conundrum: a network policy that allows scraping of metrics for, let's say, the coredns control plane component.
There could be two ways to go about implementing such a network policy.
(1) Allow anyone to scrape metrics on port 9153, regardless of the source pod's namespace. If we add a podSelector to the rule, it will only consider pods in the namespace in which the policy is deployed, i.e. the kube-system namespace, which may or may not be where the Prometheus operator runs. Pros: a simple policy that builds on the deny-all-ingress baseline. Cons: doesn't restrict the source namespace. A sketch follows.
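A minimal sketch of option (1), reusing the k8s-app: coredns label from the policy further below (the policy name is an assumption):

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: coredns-allow-metrics
  namespace: kube-system
spec:
  podSelector:
    matchLabels:
      k8s-app: coredns
  policyTypes:
  - Ingress
  ingress:
  # No 'from' clause: pods from any namespace may scrape port 9153.
  - ports:
    - port: 9153
      protocol: TCP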
(2) Use a podSelector together with a namespaceSelector to allow cross-namespace communication, as sketched below. However, in order to allow cross-namespace communication we need to know in advance the namespace in which the Prometheus operator is going to be deployed (the default is monitoring), and the user can change the namespace in the Prometheus operator configuration.
Since this namespace name won't be known in advance when applying network policies (as part of the Helm chart of each control plane component), we cannot implement such a network policy without a workaround.
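For reference, a minimal sketch of option (2), assuming the Prometheus operator is deployed in the monitoring namespace; the namespace label lokomotive.kinvolk.io/name: monitoring and the pod label app: prometheus-operator are assumptions:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: coredns-allow-metrics-from-monitoring
  namespace: kube-system
spec:
  podSelector:
    matchLabels:
      k8s-app: coredns
  policyTypes:
  - Ingress
  ingress:
  - from:
    # namespaceSelector and podSelector in one entry: the source pod must
    # match both, i.e. a prometheus-operator pod in the monitoring namespace.
    - namespaceSelector:
        matchLabels:
          lokomotive.kinvolk.io/name: monitoring
      podSelector:
        matchLabels:
          app: prometheus-operator
    ports:
    - port: 9153
      protocol: TCP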
I can think of two workarounds for this:
(a) Remove the flexibility of letting the user choose the namespace in which the Prometheus operator component gets deployed; the NetworkPolicy can then hard-code the value monitoring in the namespaceSelector. However, this creates a problem: if the user doesn't want to deploy the Prometheus operator component to scrape metrics but has another solution instead, they will be forced to install it in the monitoring namespace and add/provide labels in accordance with the network policy.
(b) By default, scraping of metrics on port 9153 is not allowed, and we implement a BYO network policy to secure the communication.
In that case, new manifests would be added to our Prometheus operator component providing the Ingress and Egress rules for the cross-namespace network policy.
The slight annoyance here is that the Prometheus operator component would carry a NetworkPolicy manifest whose Ingress rule is really the property of CoreDNS (i.e. the ingress rule would be defined in the kube-system namespace, since at this point we can use a namespaceSelector to express in which namespace the scraping component is deployed), while the Egress rule lives in the namespace the user chose for the Prometheus operator.
I believe that a component's/workload's network policy should only contain Egress and Ingress rules for itself, but since we govern/control the manifests for the Prometheus operator component, it's not that big of a deal.
Also, in the case where the user doesn't want to scrape metrics using the supported component and brings their own, the onus is on the user to secure the communication following the same BYO network policy principle.
@ipochi we could also have labels dedicated to bypassing the firewall, like coredns-scraper, for the pods and the namespace, and then add them to the Prometheus pod. This way, if users want to scrape the metrics on their own, the only thing they need to do is label their deployments and namespaces. A sketch follows.
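A minimal sketch of that idea; the coredns-scraper label key comes from the suggestion above, while the "true" value and the policy name are assumptions:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: coredns-allow-labeled-scrapers
  namespace: kube-system
spec:
  podSelector:
    matchLabels:
      k8s-app: coredns
  policyTypes:
  - Ingress
  ingress:
  - from:
    # Any pod carrying the bypass label, in any namespace that also
    # carries it, may scrape the metrics port.
    - namespaceSelector:
        matchLabels:
          coredns-scraper: "true"
      podSelector:
        matchLabels:
          coredns-scraper: "true"
    ports:
    - port: 9153
      protocol: TCP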
Kubernetes NetworkPolicy and Calico NetworkPolicy/GlobalNetworkPolicy cannot be applied to pods running on the host network.
In Lokomotive, several control plane components run on the host network (hostNetwork: true).
This creates an issue when creating NetworkPolicies to restrict communication to and within the control plane, as traffic from pods running on the host network is indistinguishable from host traffic.
Example: to restrict the calico-kube-controllers pods to only talk to the apiserver, one must create a network policy like the one below:
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: calico-kube-controller-allow-egress
  namespace: kube-system
spec:
  podSelector:
    matchLabels:
      k8s-app: calico-kube-controllers
  policyTypes:
  - Egress
  egress:
  - to:
    - podSelector:
        matchLabels:
          k8s-app: kube-apiserver
          tier: control-plane
    ports:
    - port: 6443
      protocol: TCP
    - port: 7443
      protocol: TCP
  - to:
    - podSelector:
        matchLabels:
          k8s-app: coredns
          tier: control-plane
    ports:
    - port: 53
      protocol: TCP
    - port: 53
      protocol: UDP
  - to:
    - namespaceSelector:
        matchLabels:
          lokomotive.kinvolk.io/name: default
      podSelector: {}
    ports:
    - port: 443
      protocol: TCP
Using the podSelector to select the apiserver pods creates a problem, as we cannot use a podSelector for the destination when those pods run on the host network.
Vice versa, creating an Egress policy for host-network pods hits the same limitation. The limitation applies to Calico NetworkPolicy and GlobalNetworkPolicy as well.
Whilst we can remove the podSelector and use only the ports for the destination in the above sample policy, as sketched below, we lose the granularity of selecting the pods.
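For illustration, a ports-only variant of the first egress rule would look roughly like this (the policy name is an assumption); any destination on these ports becomes reachable:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: calico-kube-controller-allow-egress-ports-only
  namespace: kube-system
spec:
  podSelector:
    matchLabels:
      k8s-app: calico-kube-controllers
  policyTypes:
  - Egress
  egress:
  # No 'to' clause: traffic to these ports is allowed regardless of
  # destination, losing pod-level granularity.
  - ports:
    - port: 6443
      protocol: TCP
    - port: 7443
      protocol: TCP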
A workaround could be to create HostEndpoint objects on AWS (they are already present on Packet) and create GlobalNetworkPolicies to secure the control plane communication. However, such a workaround makes sense for securing traffic coming from outside the cluster, not traffic from pods within the cluster.
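A minimal sketch of that workaround, with the node name, interface, IP, labels, and policy names as placeholder assumptions:

apiVersion: projectcalico.org/v3
kind: HostEndpoint
metadata:
  name: controller-node-eth0
  labels:
    role: controller
spec:
  # Must match the Calico node name of the controller machine.
  node: controller-node
  interfaceName: eth0
  expectedIPs:
  - 10.0.0.10
---
apiVersion: projectcalico.org/v3
kind: GlobalNetworkPolicy
metadata:
  name: controller-allow-apiserver
spec:
  # Applies to all host endpoints labeled role == 'controller'.
  selector: role == 'controller'
  ingress:
  - action: Allow
    protocol: TCP
    destination:
      ports:
      - 6443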
Closing as this is not feasible.
We should ship NetworkPolicies limiting what control plane components can connect to by default.