deusxanima opened this issue 1 year ago (status: Open)
Hi @deusxanima.
The Linkerd gateway is designed to only accept connections from meshed sources, so this is the behavior I would expect if unmeshed workloads in kube-system are trying to connect to it. It would be good to understand what these kube-system resources are doing and why they are attempting to initiate connections to the Linkerd gateway. I think understanding that is important before we decide if it's reasonable to special case our logging in any way.
Weighing in here. The environment is EKS and AKS, but where are we seeing these logs? On the EKS side, the AKS side, or both? One of our users reported seeing this in AKS: for LoadBalancer-type Service objects, by default some TCP probes are apparently configured that target port 4143 (inbound). If the probes correctly targeted the admin endpoint instead, I suspect this wouldn't be an issue. It seems to be a very similar issue. The full conversation is in https://github.com/linkerd/linkerd2/discussions/8148, but for convenience I've quoted the original statement:
After experiencing this on AKS too, we tracked it down to the load balancer's health probe.
In short, this is not a problem; however, the logs are noisy and can hide other issues.
The issue seems to be with the proxy's data plane port (4143 by default), which expects mTLS traffic when addressed directly.
From the load balancer's TCP probe perspective, a TCP connection that is accepted and then closed remotely is considered a success. From the linkerd-proxy's perspective, however, a non-mTLS connection is invalid, so it closes the connection and produces the log line.
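For example, the same log line can be triggered manually by opening a plain (non-mTLS) connection to the gateway's data plane port, much like the probe does; something like this (untested, and assuming the default linkerd-gateway service name in the linkerd-multicluster namespace):

```sh
# From an unmeshed pod, open a plain TCP connection to port 4143 and close it.
# The gateway proxy should then log "direct connections must be mutually
# authenticated" for this client address.
kubectl run plain-probe --rm -it --restart=Never --image=busybox -- \
  sh -c 'echo probe | nc -w 2 linkerd-gateway.linkerd-multicluster.svc.cluster.local 4143'
```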
We had success using some annotations on the linkerd-gateway service to indicate that the probe for the data plane port should use the admin port's /ready endpoint.
In the helm values for the multicluster chart this would look something like (untested):
gateway:
  serviceAnnotations:
    "service.beta.kubernetes.io/port_4191_health-probe_protocol": "Http"
    "service.beta.kubernetes.io/port_4191_health-probe_request-path": "/ready"
    "service.beta.kubernetes.io/port_4143_health-probe_protocol": "Http"
    "service.beta.kubernetes.io/port_4143_health-probe_port": "4191"
    "service.beta.kubernetes.io/port_4143_health-probe_request-path": "/ready"
See https://cloud-provider-azure.sigs.k8s.io/topics/loadbalancer/#custom-load-balancer-health-probe-for-port for docs on what these annotations do.
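If it helps anyone else, applying these through Helm could look roughly like this (untested; assumes the values above are saved to a local file and that the Linkerd Helm repo is already added under the alias linkerd):

```sh
# azure-probe-values.yaml is a hypothetical file name containing the
# gateway.serviceAnnotations values shown above.
helm upgrade --install linkerd-multicluster linkerd/linkerd-multicluster \
  --namespace linkerd-multicluster \
  -f azure-probe-values.yaml
```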
@mateiidavid We sunk a few hours into this, and I suppose other AKS folks have too. Would it make sense to make these annotations a default, or maybe add this to the docs somewhere? Maybe it's time to promote this from a discussion to an issue?
@mateiidavid, I got clarification that the user seeing this in EKS was hitting a separate issue, so I've edited the original description to cover only AKS. I've made note of the recommended annotations to quiet down the logs and will work with affected users to apply them in the interim. I do wonder whether it would make sense to make these annotations the default for multicluster gateway deployments in order to avoid the log noise, but I'll leave that decision to the maintainer team.
Encountered this issue again while troubleshooting a few multicluster gateway liveness problems. It made it very difficult to separate the noise from the actual errors, and it required back-and-forth with the user to gather info and match the logs against kube-system IPs. It would be great to have the discussed annotations become the default in the charts going forward, to help new users set up the multicluster feature without overwhelming them with potentially unnecessary logs.
These log lines are useful for debugging, but are generally not actionable. The preferred solution is to emit them at the DEBUG log level rather than INFO.
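As an interim workaround, rather than changing the global level, the filter can probably be narrowed to just the module that emits this message (linkerd_app_core::serve, per the reported log line) via the usual proxy log-level annotation on the gateway's pod template; a sketch, untested against the gateway specifically:

```yaml
# Keeps the default filter but raises the threshold for the module that emits
# the "Connection closed" message. Tradeoff: this also hides other messages
# from that module.
config.linkerd.io/proxy-log-level: "warn,linkerd=info,linkerd_app_core::serve=error"
```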
For reference, here are the annotations that were working on our EKS setup:
service.beta.kubernetes.io/aws-load-balancer-healthcheck-protocol: "HTTP"
service.beta.kubernetes.io/aws-load-balancer-healthcheck-path: "/ready"
service.beta.kubernetes.io/aws-load-balancer-healthcheck-port: "4191"
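If set through the multicluster chart, these would presumably sit under the same gateway.serviceAnnotations key as in the Azure example above (untested):

```yaml
gateway:
  serviceAnnotations:
    service.beta.kubernetes.io/aws-load-balancer-healthcheck-protocol: "HTTP"
    service.beta.kubernetes.io/aws-load-balancer-healthcheck-path: "/ready"
    service.beta.kubernetes.io/aws-load-balancer-healthcheck-port: "4191"
```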
Can I work on this issue? Thanks!
What is the issue?
After setting up a multicluster gateway on EKS/AKS/public cloud, the linkerd-gateway pod logs begin to fill with
[11289622.527652s] INFO ThreadId(01) inbound: linkerd_app_core::serve: Connection closed error=direct connections must be mutually authenticated error.sources=[direct connections must be mutually authenticated] client.addr=xx.xx.xx.xx:yyyy
messages, which seem to come in pairs and at regular intervals. Cross-referencing the IPs with those in the cluster shows them belonging to kube-system resources.
How can it be reproduced?
Logs, error output, etc
Linkerd Proxy Logs:
Search for 10.240.0.4:
Output of linkerd check -o short
n/a
Environment
AKS
Possible solution
Either exempt the kube-system probes from logs to avoid spam, or wrap up kube-system probe logs into a clean one-line summary
Additional context
No response
Would you like to work on fixing this bug?
None