linkerd / linkerd2

Ultralight, security-first service mesh for Kubernetes. Main repo for Linkerd 2.x.
https://linkerd.io
Apache License 2.0

Healthchecks are not passed when `all-authenticated` policy mode is on #7050

Closed aatarasoff closed 2 years ago

aatarasoff commented 3 years ago

Bug Report

What is the issue?

After turning on the all-authenticated policy mode, affected pods cannot start because of a 403 error.

How can it be reproduced?

Just add the `config.linkerd.io/default-inbound-policy: all-authenticated` annotation to a namespace.
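
For example, setting it at the namespace level looks roughly like this (the namespace name is illustrative):

apiVersion: v1
kind: Namespace
metadata:
  name: coolns
  annotations:
    config.linkerd.io/default-inbound-policy: all-authenticated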

Logs, error output, etc

{"timestamp":"[    48.376757s]","level":"INFO","fields":{"message":"Request failed","error":"unauthorized connection on server default:all-authenticated"},"target":"linkerd_app_core::errors::respond","spans":[{"name":"inbound"},{"client.addr":"<kubelet_ip>","name":"accept"},{"port":80,"name":"server"},{"v":"1.x","name":"http"},{"client.addr":"<kubelet_ip>","name":"rescue"}],"threadId":"ThreadId(1)"}

{"timestamp":"[    48.376773s]","level":"DEBUG","fields":{"message":"Handling error on HTTP connection","status":"403 Forbidden","version":"HTTP/1.1","close":"false"},"target":"linkerd_app_core::errors::respond","spans":[{"name":"inbound"},{"client.addr":"<kubelet_ip>","name":"accept"},{"port":80,"name":"server"},{"v":"1.x","name":"http"}],"threadId":"ThreadId(1)"}

linkerd check output

===================

kubernetes-api
--------------
√ can initialize the client
√ can query the Kubernetes API

kubernetes-version
------------------
√ is running the minimum Kubernetes API version
√ is running the minimum kubectl version

linkerd-existence
-----------------
√ 'linkerd-config' config map exists
√ heartbeat ServiceAccount exist
√ control plane replica sets are ready
√ no unschedulable pods
√ control plane pods are ready
√ cluster networks contains all node podCIDRs

linkerd-config
--------------
√ control plane Namespace exists
√ control plane ClusterRoles exist
√ control plane ClusterRoleBindings exist
√ control plane ServiceAccounts exist
√ control plane CustomResourceDefinitions exist
√ control plane MutatingWebhookConfigurations exist
√ control plane ValidatingWebhookConfigurations exist

linkerd-identity
----------------
√ certificate config is valid
√ trust anchors are using supported crypto algorithm
√ trust anchors are within their validity period
√ trust anchors are valid for at least 60 days
√ issuer cert is using supported crypto algorithm
√ issuer cert is within its validity period
√ issuer cert is valid for at least 60 days
√ issuer cert is issued by the trust anchor

linkerd-webhooks-and-apisvc-tls
-------------------------------
√ proxy-injector webhook has valid cert
√ proxy-injector cert is valid for at least 60 days
√ sp-validator webhook has valid cert
√ sp-validator cert is valid for at least 60 days
√ policy-validator webhook has valid cert
√ policy-validator cert is valid for at least 60 days

linkerd-version
---------------
√ can determine the latest version
√ cli is up-to-date

control-plane-version
---------------------
√ can retrieve the control plane version
√ control plane is up-to-date
√ control plane and cli versions match

linkerd-control-plane-proxy
---------------------------
√ control plane proxies are healthy
√ control plane proxies are up-to-date
√ control plane proxies and cli versions match

linkerd-ha-checks
-----------------
√ pod injection disabled on kube-system

Status check results are √

Environment

Possible solution

Something like this (https://istio.io/latest/docs/ops/configuration/mesh/app-health-check/) would be nice.

Workaround:

---
apiVersion: policy.linkerd.io/v1beta1
kind: Server
metadata:
  namespace: coolns
  name: serving-port
  labels:
    anna.money/server-type: common
spec:
  podSelector:
    matchLabels:
      linkerd.io/control-plane-ns: linkerd
  port: serving-port
---
apiVersion: policy.linkerd.io/v1beta1
kind: Server
metadata:
  namespace: coolns
  name: linkerd-admin-port
  labels:
    anna.money/server-type: common
spec:
  podSelector:
    matchLabels:
      linkerd.io/control-plane-ns: linkerd
  port: linkerd-admin
---
apiVersion: policy.linkerd.io/v1beta1
kind: ServerAuthorization
metadata:
  namespace: coolns
  name: allow-healthchecks
spec:
  server:
    selector:
      matchLabels:
        anna.money/server-type: common
  client:
    unauthenticated: true
    networks:
# kubelet CIDRs here, for example with templating
{% for n in range(x, y) %}
{% for m in range(0, 256) %}
      - cidr: 10.{{ n }}.{{ m }}.1/32
{% endfor %}
{% endfor %}
---
apiVersion: policy.linkerd.io/v1beta1
kind: ServerAuthorization
metadata:
  namespace: coolns
  name: allow-authenticated
spec:
  server:
    selector:
      matchLabels:
        anna.money/server-type: common
  client:
    meshTLS:
      identities:
        - "*"

Additional context

Original thread from official slack: https://linkerd.slack.com/archives/C89RTCWJF/p1633592434126000

olix0r commented 3 years ago

You'll need to create authorizations that permit unauthenticated connections on the ports that require connections from kubelet. Unfortunately, we don't have any mechanism to discover the source IPs for kubelet probes.

aatarasoff commented 3 years ago

@olix0r it was the first thing that I did.

I've added this ServerAuthorization to allow kubelet requests to pods:

---
apiVersion: policy.linkerd.io/v1beta1
kind: ServerAuthorization
metadata:
  namespace: coolns
  name: allow-healthchecks
spec:
  server:
    selector:
      matchLabels:
        anna.money/server-type: common
  client:
    unauthenticated: true
    networks:
# kubelet CIDRs here, for example with templating
{% for n in range(x, y) %}
{% for m in range(0, 256) %}
      - cidr: 10.{{ n }}.{{ m }}.1/32
{% endfor %}
{% endfor %}

It solved the problem with healthchecks but added a new one: pods started to reject any request from other pods:

{"timestamp":"[   378.100787s]","level":"INFO","fields":{"message":"Request denied","server":"serving-port","tls":"Some(Established { client_id: Some(ClientId(\"nginx-ingress.serviceaccount.identity.linkerd.cluster.local\")), negotiated_protocol: None })","client":"<other_pod_address>"},"target":"linkerd_app_inbound::policy::authorize::http","spans":[{"name":"inbound"},{"port":80,"name":"server"}],"threadId":"ThreadId(1)"}

{"timestamp":"[   378.100852s]","level":"INFO","fields":{"message":"Request failed","error":"unauthorized connection on server serving-port"},"target":"linkerd_app_core::errors::respond","spans":[{"name":"inbound"},{"port":80,"name":"server"},{"client.addr":"<other_pod_address>","name":"rescue"}],"threadId":"ThreadId(1)"}

Then I added another authorization to explicitly allow meshed connections:

apiVersion: policy.linkerd.io/v1beta1
kind: ServerAuthorization
metadata:
  namespace: coolns
  name: allow-authenticated
spec:
  server:
    selector:
      matchLabels:
        anna.money/server-type: common
  client:
    meshTLS:
      identities:
        - "*"

It works. But it means that (at least in my case) all-authenticated mode is useless, because it is clearer to use deny mode and set up policies manually.

Moreover, this workaround relies on the internal implementation of GKE and some magic knowledge. It creates risk if GKE changes in ways that we cannot control.

I see that Istio tries to fix the healthcheck problem another way: https://istio.io/latest/docs/ops/configuration/mesh/app-health-check/. What do you think about this approach? To me it is more transparent for an end user.

kleimkuhler commented 3 years ago

@aatarasoff Looking at the Servers and ServerAuthorizations that you created, I think there is an easier solution here.

srv/serving-port is selected by saz/allow-healthchecks because its selector anna.money/server-type: common selects both of the Servers that you created. This means that saz/allow-healthchecks only allows traffic from the kubelet CIDRs to both of these Servers.

To fix this, you should be okay omitting srv/serving-port and creating only srv/linkerd-admin-port and saz/allow-healthchecks. This means that ports besides linkerd-admin will not have a selecting Server, and they should fall back to the cluster default of all-authenticated.
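
A sketch of that, reusing the resources from the workaround above (the kubelet CIDR below is a placeholder):

---
apiVersion: policy.linkerd.io/v1beta1
kind: Server
metadata:
  namespace: coolns
  name: linkerd-admin-port
  labels:
    anna.money/server-type: common
spec:
  podSelector:
    matchLabels:
      linkerd.io/control-plane-ns: linkerd
  port: linkerd-admin
---
apiVersion: policy.linkerd.io/v1beta1
kind: ServerAuthorization
metadata:
  namespace: coolns
  name: allow-healthchecks
spec:
  server:
    selector:
      matchLabels:
        anna.money/server-type: common
  client:
    unauthenticated: true
    networks:
      # placeholder; replace with the actual kubelet CIDRs
      - cidr: 10.0.0.1/32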

Let me know if this solves your issue.

aatarasoff commented 3 years ago

@kleimkuhler Hi!

No, it does not work. If we omit serving-port, then healthchecks do not pass for the main container; they pass only for the Linkerd proxy.

As you can see in the screenshots, the api container restarts because it cannot pass its readiness and liveness probes:

[Screenshots: Screenshot 2021-10-14 at 15 01 21, Screenshot 2021-10-14 at 14 58 56]

Ziphone commented 3 years ago

This issue hit me too. AKS (with Azure CNI) unfortunately forces me to host nodes and pods within the same IP range - so I am not able to make an exception for some IP range which has been reserved for nodes.

The possible/proposed solution of Istio seems nice. Have you considered implementing something similar?

adleong commented 3 years ago

@Ziphone it seems like a reasonable approach, but would take some design and work to implement something like that. Such a feature isn't currently scoped or planned.

m1o1 commented 2 years ago

This feature would be great - we're also on AKS, so I'm going to try the workaround of listing all the possible .1 addresses in the CIDR block, but this means we need to make assumptions about / be aware of the CIDR ranges and the IP address of the kubelet. If they ever change (however unlikely that might be), that could cause health checks to fail again, or possibly give other workloads access that they shouldn't have.

Edit: well, I haven't gotten the workaround to work yet. It's still rejecting the health checks even if I list every .1 address in the pod CIDR range (I also tried the node CIDR range).

aatarasoff commented 2 years ago

@adleong @kleimkuhler Hi again. I've checked how Istio does it and think that we can do something similar. I do not know the Linkerd implementation specifics well, but I have created some high-level diagrams with comments.

First, let's look at the as-is diagram: [diagram: Managed Services - Frame 17]

We have two containers in the pod: the main container with healthchecks, and the linkerd-proxy alongside it. At the moment, we have to add an implementation-specific authorization policy for both of them.

To fix this, we can do a trick and rewrite the main container's spec during injection to reroute healthchecks from it to the linkerd-proxy, which should process them accordingly. For example, the proxy should call the real healthcheck on the main container. To keep information about the real healthchecks, we can use an environment variable. [diagram: Managed Services - Frame 18]
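
A hypothetical sketch of such a rewrite (the /app-probe path is purely illustrative and not an existing Linkerd feature; 4191 is the proxy's admin port):

# Probe as authored in the pod spec:
readinessProbe:
  httpGet:
    path: /ready
    port: 80

# Hypothetically rewritten by the injector so that the kubelet talks to the
# proxy, which would forward the check to the real endpoint kept in an
# environment variable on the proxy container:
readinessProbe:
  httpGet:
    path: /app-probe/ready   # illustrative path, not an existing Linkerd route
    port: 4191               # linkerd-proxy admin port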

After that, the problem of requiring a wide-scope unauthenticated policy will still exist (but only for the linkerd-proxy admin port). To eliminate it, we should allow non-mTLS connections to the proxy admin port (maybe for specific routes only) or introduce a new port for this particular purpose. [diagram: Managed Services - Frame 19]

So, what do you think about this view from 30 000 feet?

m1o1 commented 2 years ago

I like the idea of a separate port for probes. I don't know what the admin port in linkerd gives access to, but giving unauthenticated access to it sounds scary (and will probably not sit well with our security team).

aw185176 commented 2 years ago

Am I understanding correctly that, as things stand today, any ports used for healthchecking must be completely open to unauthenticated requests from outside the mesh? For most off-the-shelf applications, the K8s probes hit the same ports that the app serves for its actual business logic, which means that ServerAuthorizations are basically useless. Unless dedicated ports are used for probes, you cannot guarantee that traffic to that service is coming from inside the mesh. This seems like a nonstarter for any real-world application where you don't control the source code of every single component deployed on your clusters (does that situation even exist in K8s...?).

I noticed that this was added to stable-2.12.0, but I don't see any other linked issues that outline how this problem will be solved in stable-2.12.0

mikebell90 commented 2 years ago

To my mind, probes should be a first-class citizen and auto-proxied, like in Istio and Consul. I think the attempt to fit them into the generic framework of "oh, this is just someone hitting this port, this path" is understandable, but it leads to a huge mess of replicating the same exceptions. I'd much rather, @mateiidavid, have an annotation that "magically" reads the existing port/path definition and rewrites it to go through the proxy.

olix0r commented 2 years ago

@mikebell90 I'm sympathetic to the pains around authorizing probes.

I think it's dangerous, however, to have implicit/"magical" authorization policies. For authorizations to be auditable & debuggable, we really want to tie all authorized requests to a resource in the cluster. That said, I think there's a path to making this automated.

In 2.12, we are introducing route policies: that is, we'll have the ability to say that the /ready endpoint does not require authentication. Once this is in place, it should be possible to write a controller that automates the creation of authorizations for probe routes.
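
A rough sketch of what such a route policy could look like with the 2.12 resources (resource names are illustrative and the exact apiVersion may differ; the matching authorization resources are shown further down in this thread):

apiVersion: policy.linkerd.io/v1alpha1
kind: HTTPRoute
metadata:
  namespace: coolns
  name: ready-route
spec:
  parentRefs:
    - group: policy.linkerd.io
      kind: Server
      name: serving-port
  rules:
    - matches:
        - path:
            value: /ready
          method: GET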

mikebell90 commented 2 years ago

I get what you are saying, and as an architect myself I sympathize. However, purity here leads to either repeated boilerplate or people writing a controller (or using something like Kyverno, I guess). This is a lot of extra machinery for something that nearly every deployment needs; that would be my counter-argument, @olix0r. Probes ain't exactly uncommon, this is at the heart of deployment, and it will likely lead to product engineer frustration whenever this is not set up properly.

olix0r commented 2 years ago

@mikebell90 Yeah, those are good points. In any case, the fully automated solution is probably out of scope for 2.12.0, but we'll look more closely at a better solution as this stuff lands on main.

mikebell90 commented 2 years ago

Yeah, all I'm suggesting is keeping an open mind and an eye on user feedback.

In particular, composable policies and ways to trace them in the CLI and viz - I know Buoyant has some of these.

m1o1 commented 2 years ago

Is there anything upstream in K8s planned that would enable the kubelet to have a service identity, so we can authorize the health probes only for the kubelet?

jarihaikonen commented 2 years ago

I've been fiddling with this back and forth, and two options COULD help in specific scenarios:

  1. A setting like opaque ports but for CIDRs - one that does not work like Server & ServerAuthorization and force everything else to need another authorization as well.
  2. all/* for the Server port specification, or serving-ports (if I understood correctly that serving-port is just meant to mean "whatever port you are using").

With the first setting you could exclude the node subnet range and allow kubelet health checks to work, provided you have different subnets for the pod/service CIDRs. The only issue is that pods using the host network would be bypassed as well.

With the second setting you could achieve the same without having to specify every port in the namespace that serves something. It's annoying if you have a) the same port for the actual service and the health check, or b) multiple services with different ports for the service and the health check, and you don't mind the node network being able to connect to those unauthenticated.

Of course, the best option would be scraping and mutating the liveness and readiness probes to go through the proxy's linkerd-admin port and allowing that for the node network, or something similar.

olix0r commented 2 years ago

Last week's edge release (or the prior week's?) included the ability to explicitly set per-route authorization policies on inbound traffic. The upcoming edge release will include automatic authorizations for probe endpoints (when no routes are explicitly configured on a server). We're still working through documentation and deeper testing in preparation for the stable-2.12 release; we'll share the documentation on this issue as it becomes available.

chriskim06 commented 2 years ago

> The upcoming edge release will include automatic authorizations for probe endpoints (when no routes are explicitly configured on a server)

Did this make it into the stable 2.12 release? I'm currently using a ServerAuthorization with our node CIDRs to allow kubelet probes in an all-authenticated cluster. Wondering if I'm able to delete that.

aatarasoff commented 2 years ago

@chriskim06 Yep. We have already deleted them and everything works fine 👍

chriskim06 commented 2 years ago

@aatarasoff that's awesome 👍, glad I can get rid of that then

> when no routes are explicitly configured on a server

If I were to configure an HTTPRoute for a Server, would that mean there wouldn't be automatic authorization for these probes? If so, is there an option besides configuring a NetworkAuthentication with the node CIDRs like we did pre-2.12?

kleimkuhler commented 2 years ago

> If I were to configure an HTTPRoute for a Server, would that mean there wouldn't be automatic authorization for these probes?

That's correct. You would now need to create a NetworkAuthentication for the Pods selected by that Server, but unlike before, you can at least limit the unauthenticated access to specific paths such as /live and /ready. If you take a look at creating per-route policy resources, you'll see how to create those resources for authorizing specific paths.
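
A rough sketch of those resources (names are illustrative, the exact apiVersion may vary between releases, and the per-route policy documentation is authoritative):

---
apiVersion: policy.linkerd.io/v1alpha1
kind: HTTPRoute
metadata:
  namespace: coolns
  name: probe-route
spec:
  parentRefs:
    - group: policy.linkerd.io
      kind: Server
      name: serving-port
  rules:
    - matches:
        - path:
            value: /live
        - path:
            value: /ready
---
apiVersion: policy.linkerd.io/v1alpha1
kind: NetworkAuthentication
metadata:
  namespace: coolns
  name: probe-network-authn
spec:
  networks:
    # allow probes from anywhere on these paths; restrict to node/kubelet CIDRs if preferred
    - cidr: 0.0.0.0/0
---
apiVersion: policy.linkerd.io/v1alpha1
kind: AuthorizationPolicy
metadata:
  namespace: coolns
  name: probe-authz
spec:
  targetRef:
    group: policy.linkerd.io
    kind: HTTPRoute
    name: probe-route
  requiredAuthenticationRefs:
    - group: policy.linkerd.io
      kind: NetworkAuthentication
      name: probe-network-authn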

kleimkuhler commented 2 years ago

The Default Authorizations section of the docs covers how healthchecks are now authorized by default.