Closed chris-vest closed 4 years ago
Does linkerd tap -ojson show the l5d-dst-override header? I assume your ingress controller is configured to terminate external TLS? Otherwise, Linkerd will just treat your encrypted traffic as TCP and will not mTLS it.
Also, I believe we confirmed on Slack that you are using ingress-nginx, not nginx-ingress.
Does linkerd tap -ojson show the l5d-dst-override header? I assume your ingress controller is configured to terminate external TLS? Otherwise, Linkerd will just treat your encrypted traffic as TCP and will not mTLS it.
Yes, the ingress controller is terminating TLS. You can see the Ingress configuration above, and it complies with the configuration from the docs here.
It doesn't look like linkerd tap -n httpbin deploy/httpbin -ojson shows the l5d-dst-override header; this is true both for traffic originating from ingress and for a call from another pod inside the cluster to the httpbin service.
This is how we have configured l5d-dst-override in the Helm values file for ingress-nginx:
controller:
  config:
    use-proxy-protocol: "true"
    use-gzip: "true"
    use-geoip: "true"
    skip-access-log-urls: "/healthz"
    server-snippet: |
      proxy_set_header l5d-dst-override $service_name.$namespace.svc.cluster.local:$service_port;
      grpc_set_header l5d-dst-override $service_name.$namespace.svc.cluster.local:$service_port;
For this clean reproduction of the issue, I also added that configuration to the Ingress object, as described in the Linkerd documentation.
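To make the header value concrete, here is a minimal Python sketch of the string that the snippet asks nginx to build from $service_name, $namespace, and $service_port. The function name is hypothetical, and the port 80 in the example is an assumption; the thread does not state which port the httpbin service uses.

```python
def dst_override(service_name: str, namespace: str, service_port: int) -> str:
    """Build the value the snippet puts in l5d-dst-override.

    Mirrors the template from the Helm values:
    $service_name.$namespace.svc.cluster.local:$service_port
    """
    return f"{service_name}.{namespace}.svc.cluster.local:{service_port}"


# Example with assumed values: service "httpbin" in namespace "httpbin", port 80.
example = dst_override("httpbin", "httpbin", 80)
print(example)
```

If the header reaches the proxy, this is the kind of value one would expect to see for it in the tap output.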
Also, I believe we confirmed on Slack that you are using ingress-nginx, not nginx-ingress.
Yes, exactly, we are using ingress-nginx. Usually we use version 1.41.2 of the chart, but I have now tried with version 2.16.0 (the latest as of writing this post) and I still cannot see the l5d-dst-override header using linkerd tap -ojson.
@chris-vest I am just wondering what happens when you run curl https://httpbin.dev.org/status/200 a second time? On my end, the tap output always shows mTLS isn't working on the first request, but works on subsequent calls. See https://github.com/linkerd/linkerd2/issues/4992. One other way to confirm if mTLS is working is to use linkerd -n httpbin edges po.
@ihcsim So the edges command always shows mTLS as being fine:
➜ linkerd -n httpbin edges po
SRC                                   DST                        SRC_NS    DST_NS    SECURED
linkerd-prometheus-7bbfd6c474-jphtc   httpbin-779c54bf49-drv7r   linkerd   httpbin   √
However, the tap output would indicate otherwise because it's logging no_tls_from_remote.
Regarding consecutive curl requests, this still produces the same output from tap: consistently, for traffic originating from ingress, tap will show no_tls_from_remote. However, pod-to-pod traffic, i.e. manually curling from inside the ingress-nginx-controller pod to the httpbin service, will yield tls=true.
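The two checks discussed so far (whether l5d-dst-override appears in the JSON tap output, and which tls value each event carries) can be sketched in Python as below. This is only a sketch under stated assumptions: the exact JSON schema of linkerd tap -ojson is not shown in this thread, so the code walks each event generically instead of relying on field paths, and the SAMPLE events are hypothetical, made up for illustration.

```python
import json


def iter_json_objects(text):
    """Yield each top-level JSON value from a string of concatenated documents."""
    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(text):
        # Skip whitespace between documents.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos >= len(text):
            break
        obj, pos = decoder.raw_decode(text, pos)
        yield obj


def walk(node):
    """Yield (key, value) for every dict entry at any depth."""
    if isinstance(node, dict):
        for key, value in node.items():
            yield key, value
            yield from walk(value)
    elif isinstance(node, list):
        for item in node:
            yield from walk(item)


def summarize(event, header="l5d-dst-override"):
    """Return (header_seen, tls_values) for one tap event."""
    header_seen = False
    tls_values = []
    for key, value in walk(event):
        if key == "tls":
            tls_values.append(value)
        # The header may show up as a key or as a string value (assumption).
        if key.lower() == header:
            header_seen = True
        if isinstance(value, str) and value.lower() == header:
            header_seen = True
    return header_seen, tls_values


# Hypothetical events; the real tap schema may differ.
SAMPLE = """
{"source": {"metadata": {"tls": "no_tls_from_remote"}},
 "requestInitEvent": {"headers": [{"name": "l5d-dst-override",
                                   "value": "httpbin.httpbin.svc.cluster.local:80"}]}}
{"source": {"metadata": {"tls": "true"}},
 "requestInitEvent": {"headers": []}}
"""

results = [summarize(event) for event in iter_json_objects(SAMPLE)]
for header_seen, tls_values in results:
    print(f"header_seen={header_seen} tls={tls_values}")
```

To try it on real data, one could capture some output of linkerd tap -n httpbin deploy/httpbin -ojson to a file and pass the file contents to iter_json_objects in place of SAMPLE.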
I haven't noticed what you've described in #4992, but to be honest I've just been focused on the traffic originating from ingress.
Thanks for looking into this!
Can you tell if each no_tls_from_remote output corresponds to a DNS refinement timeout error in the outbound (nginx) proxy log? If yes, then it's the same issue as #4992. That stuff will be removed before the next stable release.
No, I'm not getting any DNS refinement timeout errors. However, I have just noticed this in the proxy logs on the Nginx pods:
WARN ThreadId(02) rustls::session: Sending fatal alert AccessDenied
Having said that, after restarting the pods, that error message is no longer present. It seems that was just a fleeting error.
I'm about to try 20.10.2-edge, since the 20.10.1-edge release notes say it "Changed the type of the injector and tap API secrets to kubernetes.io/tls", which may help... Maybe.
So with 20.10.2-edge, I can see the proxy log on the nginx pods returns:
WARN ThreadId(01) linkerd2_proxy_discover::buffer: Discovery stream ended!
This is logged once by each of the 3 proxies, I guess the first time a request passes through the pod the proxy is attached to.
This should be fixed in the edge-20.10.3 release. This release removes DNS resolution from the outbound path and overhauls discovery to avoid doing per-request work.
Please give this a try and let us know how it works for you.
@olix0r @ihcsim Thank you both very much! This works perfectly. :rocket:
Bug Report
What is the issue?
Traffic originating from ingress is not mTLS'd, however pod to pod traffic is.
How can it be reproduced?
Create a new AWS EKS cluster, v1.16.
Install kube2iam and cert-manager. Follow the guide on automatically rotating control plane TLS credentials here - https://linkerd.io/2/tasks/automatically-rotating-control-plane-tls-credentials/
Install Linkerd Edge 20.9.2 via Helm Chart:
Deploy an nginx-ingress controller (with the configuration described here https://linkerd.io/2/tasks/using-ingress/) and set up a Route 53 entry (a simple A record) pointing to the load balancer which is created by the controller.
Deploy HTTPBIN (https://httpbin.org) with a service and ingress:
Now, tap the nginx-ingress namespace and:
Watch the tap output:
no_tls_from_remote indicates there's no TLS on that hop. Okay, let's see what is happening on the other side of the call on the httpbin deployment:
No TLS there either.
What about if we call the httpbin service directly from inside the nginx-controller, just to see if that makes any difference?
Now the tap output:
Logs, error output, etc
Tap logs:
linkerd check output
Environment
Additional context
I would expect that, no matter where traffic originates from, all pod to pod traffic inside the cluster is mTLS'd. However, from my tests, this does not seem to be the case.
This behaviour was also observed on edge 20.9.1 and 20.9.0.
Any help is greatly appreciated. :pray: