istio / istio.io

Source for the istio.io site
https://istio.io/
Apache License 2.0

Egress Gateways with TLS Origination Example #7063

Open ronamosa opened 4 years ago

ronamosa commented 4 years ago

I have had a lot of trouble getting the example at https://istio.io/docs/tasks/traffic-management/egress/egress-gateway-tls-origination/ to work.

I modified it for a real external mTLS server: a VM running nginx with the same cert setup as the documentation. I tested that it works if you use curl with the correct certificates.

I may be misunderstanding some things in the config, but have not found much help in discuss.istio.io or the slack channel.

When I follow the example (substituting for my own values) I get different errors in different areas and I'll point them out below.

Here is my setup as per documentation:

Installation

installed version

client version: 1.5.0
control plane version: 1.5.0
data plane version: none

How I installed Istio:

istioctl manifest apply --set values.global.istioNamespace=istio-system \
    --set values.gateways.istio-ingressgateway.enabled=false \
    --set values.gateways.istio-egressgateway.enabled=true \
    --set values.global.proxy.accessLogFile="/dev/stdout" \
    --set values.sidecarInjectorWebhook.rewriteAppHTTPProbe=true
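
For reference, the same settings can be sketched as an IstioOperator file instead of a chain of --set flags. This is only a restatement of the flags above under spec.values, assuming the file is passed with istioctl manifest apply -f; the file name is up to you.

```yaml
# Sketch: IstioOperator equivalent of the --set flags above.
# Each key mirrors one "--set values.<path>=<value>" flag.
apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
spec:
  values:
    global:
      istioNamespace: istio-system
      proxy:
        accessLogFile: /dev/stdout
    gateways:
      istio-ingressgateway:
        enabled: false
      istio-egressgateway:
        enabled: true
    sidecarInjectorWebhook:
      rewriteAppHTTPProbe: true
```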

Certs & Patch Egressgateway

I create the client & ca-cert secrets in the istio-system namespace:

$ kubectl -n istio-system create secret tls nginx-client-certs --key certs/4_client/private/mtls.site.key.pem --cert certs/4_client/certs/mtls.site.cert.pem
$ kubectl -n istio-system create secret generic nginx-ca-certs --from-file=certs/2_intermediate/certs/ca-chain.cert.pem

I patch the istio-egressgateway deployment to add the secrets/certs volumes and mounts and can see them when I check the pod:

example nginx-client-certs

$ kubectl -n istio-system exec -ti istio-egressgateway-8544965cd5-2hdnc -- ls -al /etc/istio/nginx-client-certs
total 8
drwxrwxrwt 3 root root  120 Apr  8 13:56 .
drwxr-xr-x 1 root root 4096 Apr  8 13:56 ..
drwxr-xr-x 2 root root   80 Apr  8 13:56 ..2020_04_08_13_56_42.418467475
lrwxrwxrwx 1 root root   31 Apr  8 13:56 ..data -> ..2020_04_08_13_56_42.418467475
lrwxrwxrwx 1 root root   14 Apr  8 13:56 tls.crt -> ..data/tls.crt
lrwxrwxrwx 1 root root   14 Apr  8 13:56 tls.key -> ..data/tls.key
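
The patch itself is not shown in this report. As a rough sketch only, a strategic merge patch that mounts both secrets into the gateway could look like the following. It assumes the gateway container is named istio-proxy and reuses the secret names and mount paths from above; the patch file name is hypothetical.

```yaml
# Sketch: strategic merge patch for the istio-egressgateway Deployment.
# Assumption: the gateway container is named "istio-proxy".
# Hypothetical usage:
#   kubectl -n istio-system patch deployment istio-egressgateway --patch "$(cat egress-certs-patch.yaml)"
spec:
  template:
    spec:
      containers:
      - name: istio-proxy
        volumeMounts:
        # Client certificate and key presented to the external server
        - name: nginx-client-certs
          mountPath: /etc/istio/nginx-client-certs
          readOnly: true
        # CA chain used to verify the external server
        - name: nginx-ca-certs
          mountPath: /etc/istio/nginx-ca-certs
          readOnly: true
      volumes:
      - name: nginx-client-certs
        secret:
          secretName: nginx-client-certs
      - name: nginx-ca-certs
        secret:
          secretName: nginx-ca-certs
```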

TLS Origination Configs

If I start following the example from Perform mutual TLS origination with an egress gateway

I end up with the following configuration:

---
apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
  name: istio-egressgateway
spec:
  selector:
    istio: egressgateway
  servers:
  - port:
      number: 443
      name: https
      protocol: HTTPS
    hosts:
    - mtls.site
    tls:
      mode: MUTUAL
      serverCertificate: /etc/certs/cert-chain.pem
      privateKey: /etc/certs/key.pem
      caCertificates: /etc/certs/root-cert.pem

---
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: egressgateway-for-nginx
spec:
  host: istio-egressgateway.istio-system.svc.cluster.local
  subsets:
  - name: nginx
    trafficPolicy:
      loadBalancer:
        simple: ROUND_ROBIN
      portLevelSettings:
      - port:
          number: 443
        tls:
          mode: ISTIO_MUTUAL
          sni: mtls.site

---
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: direct-nginx-through-egress-gateway
spec:
  hosts:
  - mtls.site
  gateways:
  - istio-egressgateway
  - mesh
  http:
  - match:
    - gateways:
      - mesh
      port: 80
    route:
    - destination:
        host: istio-egressgateway.istio-system.svc.cluster.local
        subset: nginx
        port:
          number: 443
      weight: 100
  - match:
    - gateways:
      - istio-egressgateway
      port: 443
    route:
    - destination:
        host: mtls.site
        port:
          number: 443
      weight: 100
---
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: originate-mtls-for-nginx
spec:
  host: mtls.site
  trafficPolicy:
    loadBalancer:
      simple: ROUND_ROBIN
    portLevelSettings:
    - port:
        number: 443
      tls:
        mode: MUTUAL
        clientCertificate: /etc/istio/nginx-client-certs/tls.crt
        privateKey: /etc/istio/nginx-client-certs/tls.key
        caCertificates: /etc/istio/nginx-ca-certs/ca-chain.cert.pem
        sni: mtls.site

This creates 4 objects.

Errors

invalid path nginx certs

checking the istio-proxy container for a sleep pod inside my mesh-internal namespace:

2020-04-12T02:39:07.916502Z info    Envoy proxy is ready
[Envoy (Epoch 0)] [2020-04-12 02:40:53.418][16][warning][config] [external/envoy/source/common/config/grpc_subscription_impl.cc:87] gRPC config for type.googleapis.com/envoy.api.v2.Cluster rejected: Error adding/updating cluster(s) outbound|443||mtls.site: Invalid path: /etc/istio/nginx-ca-certs/ca-chain.cert.pem

invalid path /etc/certs/root-cert.pem

checking the istio-egressgateway pod I can see the following errors as well:

[Envoy (Epoch 0)] [2020-04-12 02:54:07.787][15][warning][config] [external/envoy/source/common/config/grpc_subscription_impl.cc:87] gRPC config for type.googleapis.com/envoy.api.v2.Listener rejected: Error adding/updating listener(s) 0.0.0.0_443: Invalid path: /etc/certs/root-cert.pem

Questions

Workaround / Fixes (?)

After a lot of reading through Istio GitHub issues and the discuss.istio.io forum, I pieced together the following changes that eventually led to a successful TLS client-verified session with my external mTLS server.

/etc/certs/root-cert.pem fix

I changed the port protocol from HTTPS

apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
  name: istio-egressgateway
spec:
  selector:
    istio: egressgateway
  servers:
  - port:
      number: 443
      name: https
      protocol: HTTPS
    hosts:
    - mtls.site
    tls:
      mode: MUTUAL
      serverCertificate: /etc/certs/cert-chain.pem
      privateKey: /etc/certs/key.pem
      caCertificates: /etc/certs/root-cert.pem

to TLS

apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
  name: istio-egressgateway
spec:
  selector:
    istio: egressgateway
  servers:
  - port:
      number: 443
      name: https
      protocol: TLS
    hosts:
    - mtls.site
    tls:
      mode: MUTUAL
      serverCertificate: /etc/certs/cert-chain.pem
      privateKey: /etc/certs/key.pem
      caCertificates: /etc/certs/root-cert.pem

And the error goes away. I'm assuming it's because there's a tls section there and cert lookups get treated differently?

Invalid path: /etc/istio/nginx-ca-certs/ca-chain.cert.pem fix

For this one I came across an open issue where someone advised the sidecar of the pod calling the MTLS backend server needs to have the certs mounted to it - which sort of defeats the purpose of this "egressgateway will handle verifying calls to the backend using istio" example right?

Anyway, I did the following:

  1. created the nginx-client-certs and nginx-ca-certs secrets inside my namespace mesh-internal (where my sleep pod is deployed)
  2. added the following annotations (sidecar.istio.io/userVolumeMount and sidecar.istio.io/userVolume) to my sleep pod's deployment manifest:

    ---
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: sleep
      namespace: mesh-internal
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: sleep
      template:
        metadata:
          annotations:
            sidecar.istio.io/userVolumeMount: '[{"name":"nginx-client-certs", "mountPath":"/etc/istio/nginx-client-certs", "readonly":true},{"name":"nginx-ca-certs", "mountPath":"/etc/istio/nginx-ca-certs", "readonly":true}]'
            sidecar.istio.io/userVolume: '[{"name":"nginx-client-certs", "secret":{"secretName":"nginx-client-certs"}},{"name":"nginx-ca-certs", "secret":{"secretName":"nginx-ca-certs"}}]'
          labels:
            app: sleep
        spec:
          serviceAccountName: sleep
          containers:
          - name: sleep
            image: governmentpaas/curl-ssl
            command: ["/bin/sleep", "3650d"]
            imagePullPolicy: IfNotPresent
            volumeMounts:
            - mountPath: /etc/sleep/tls
              name: secret-volume
          volumes:
          - name: secret-volume
            secret:
              secretName: sleep-secret
              optional: true

Now my sleep pod doesn't complain about the nginx certs anymore. I see other pods like prometheus and an httpbin pod in my mesh-internal namespace complaining about not finding the certs, but I understand (currently) it's because I haven't "sidecar mounted" these certs directly to them.
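
One possible way to keep the other sidecars from picking up the file-based TLS settings, sketched here and not verified against this setup, is to create the originating DestinationRule in the egress gateway's namespace and restrict its visibility with exportTo. The namespace and the exportTo value are assumptions; everything else is copied from the rule shown earlier.

```yaml
# Sketch: scope the mTLS origination rule to the gateway's namespace.
# Assumption: the egress gateway runs in istio-system and already has the
# certificate files mounted at the paths below.
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: originate-mtls-for-nginx
  namespace: istio-system
spec:
  host: mtls.site
  exportTo:
  - "."   # visible only to proxies in this namespace
  trafficPolicy:
    loadBalancer:
      simple: ROUND_ROBIN
    portLevelSettings:
    - port:
        number: 443
      tls:
        mode: MUTUAL
        clientCertificate: /etc/istio/nginx-client-certs/tls.crt
        privateKey: /etc/istio/nginx-client-certs/tls.key
        caCertificates: /etc/istio/nginx-ca-certs/ca-chain.cert.pem
        sni: mtls.site
```

If that works as intended, sidecars in mesh-internal would never receive a cluster config that references the certificate paths, so the userVolume annotations would not be needed.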

Add ServiceEntry and VirtualService

I added a ServiceEntry and VirtualService combination. It wasn't clear in the example that I needed one: the previous section of the documentation deletes the ServiceEntry, so the following section seems to go ahead without one and doesn't say to create a new one.

apiVersion: networking.istio.io/v1alpha3
kind: ServiceEntry
metadata:
  name: external-mtls-nginx-server
  namespace: mesh-internal
spec:
  hosts:
  - mtls.site
  ports:
  - number: 80
    name: http
    protocol: HTTP
  - number: 443
    name: https
    protocol: TLS
  resolution: DNS

---
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: nginx
  namespace: mesh-internal
spec:
  hosts:
  - mtls.site
  tls:
  - match:
    - port: 443
      sni_hosts:
      - mtls.site
    route:
    - destination:
        host: mtls.site
        port:
          number: 443
      weight: 100

Changed Gateway to HTTP

Changed this from a Gateway with a tls section

---
apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
  name: istio-egressgateway
spec:
  selector:
    istio: egressgateway
  servers:
  - port:
      number: 443
      name: https
      protocol: HTTPS
    hosts:
    - mtls.site
    tls:
      mode: MUTUAL
      serverCertificate: /etc/certs/cert-chain.pem
      privateKey: /etc/certs/key.pem
      caCertificates: /etc/certs/root-cert.pem

to HTTP

---
apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
  name: istio-egressgateway
  namespace: mesh-internal
spec:
  selector:
    istio: egressgateway
  servers:
  - port:
      number: 80
      name: http
      protocol: HTTP
    hosts:
    - mtls.site

Changed DestinationRule

from

apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: egressgateway-for-nginx
spec:
  host: istio-egressgateway.istio-system.svc.cluster.local
  subsets:
  - name: nginx
    trafficPolicy:
      loadBalancer:
        simple: ROUND_ROBIN
      portLevelSettings:
      - port:
          number: 443
        tls:
          mode: ISTIO_MUTUAL
          sni: mtls.site

to

apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: egressgateway-for-nginx
  namespace: mesh-internal
spec:
  host: istio-egressgateway.istio-system.svc.cluster.local
  subsets:
  - name: nginx

Changed VirtualService to port 80

Now that my Gateway is port 80, I update the following route from istio-egressgateway.istio-system.svc.cluster.local:443

---
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: direct-nginx-through-egress-gateway
spec:
  hosts:
  - mtls.site
  gateways:
  - istio-egressgateway
  - mesh
  http:
  - match:
    - gateways:
      - mesh
      port: 80
    route:
    - destination:
        host: istio-egressgateway.istio-system.svc.cluster.local
        subset: nginx
        port:
          number: 443
      weight: 100
  - match:
    - gateways:
      - istio-egressgateway
      port: 443
    route:
    - destination:
        host: mtls.site
        port:
          number: 443
      weight: 100

to istio-egressgateway.istio-system.svc.cluster.local:80

---
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: direct-nginx-through-egress-gateway
  namespace: mesh-internal
spec:
  hosts:
  - mtls.site
  gateways:
  - istio-egressgateway
  - mesh
  http:
  - match:
    - gateways:
      - mesh
      port: 80
    route:
    - destination:
        host: istio-egressgateway.istio-system.svc.cluster.local
        subset: nginx
        port:
          number: 80
      weight: 100
  - match:
    - gateways:
      - istio-egressgateway
      port: 80
    route:
    - destination:
        host: mtls.site
        port:
          number: 443
      weight: 100

And then it all works.

Working Output

So now when I curl from the sleep pod inside the mesh-internal namespace, I get the expected output:

$ kubectl -n mesh-internal exec sleep-74997ffb46-cxs77 -c sleep -- curl http://mtls.site
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
    body {
        width: 35em;
        margin: 0 auto;
        font-family: Tahoma, Verdana, Arial, sans-serif;
    }
</style>
</head>
<body>
<h1>Welcome to the Mutual TLS Server!</h1>

<p>If you see this page, you have successfully used the correct client-side certificates that match the ones
  deployed on this server.
</p>

<p>For more information please visit my website:<a href="https://iamronamo.io/">iamronamo.io</a>.

<p><em>Thank you and goodnight.</em></p>
</body>
</html>

From the sleep pod's istio-proxy container I can see it hitting my port 80 outbound endpoint:

[2020-04-11T15:03:16.493Z] "GET / HTTP/1.1" 200 - "-" "-" 0 557 4 4 "-" "curl/7.64.0" "738ceb49-93c9-4462-a53b-ab690bef4b93" "mtls.site" "16.0.1.90:80" outbound|80|nginx|istio-egressgateway.istio-system.svc.cluster.local 16.0.1.103:39040 52.189.232.175:80 16.0.1.103:45092 - -

and from the istio-egressgateway pod I can see it going outbound on 443:

[2020-04-11T15:04:37.866Z] "GET / HTTP/2" 200 - "-" "-" 0 557 4 4 "16.0.1.103" "curl/7.64.0" "7d158baa-3ac4-4da7-9e91-a4ae6115c090" "mtls.site" "52.189.232.175:443" outbound|443||mtls.site 16.0.1.90:43866 16.0.1.90:80 16.0.1.103:39040 - -

Conclusion

Sorry this is really long, but I don't understand how the original/current documentation was meant to work. My workaround achieves the objective, but functionally it's limited to specific deployments that have the right annotations.

Any help understanding where I might've gone wrong would be greatly appreciated.

mkretz commented 4 years ago

@ronamosa, thanks for putting this together! We are currently struggling with exactly the same behaviour (i.e. got an mTLS origination sample to work but seeing other sidecars trying to find the certificates which are only relevant for istio-egressgateway).

cedricroijakkers commented 4 years ago

@ronamosa Could you please post the final yaml files of your working setup here? Because I'm stuck in the exact same issue, I've tried to follow along with your modifications, but I still cannot get it to work.

ronamosa commented 4 years ago

Hey @cedricroijakkers, I documented the whole experience here: https://iamronamo.io/documentation/2020-04-08-Istio-MTLS-with-External-Endpoint/. I can't find where I saved the YAMLs from this PoC, but I'll see if I can dig them out. Hope that helps.

pablolibo commented 4 years ago

@ronamosa thanks for your report, you just saved my job :)

cedricroijakkers commented 4 years ago

I tried following your steps, but I still keep getting 503 errors. I've now installed an stunnel instance to do the SSL offloading, but it would still be great to finally have this working in Istio. Does nobody have a full working YAML configuration set that I can try?

petermollerud commented 4 years ago

What is the official fix from Istio on this? There must be a lot of people wanting to get outbound mTLS from Istio to external sites working.

emike922 commented 4 years ago

Also waiting for some official statement on this. Meanwhile, has anyone tested the adapted configuration in an IPv6 environment? Last time I tried it was working on IPv4 but not working on IPv6. I need to recreate the exact issue for logs and to really make sure before I can submit a separate ticket though.

merusso commented 2 years ago

I recently set up mTLS egress with Istio and worked through a number of issues before settling on the pattern described here: https://istio.io/latest/docs/tasks/traffic-management/egress/egress-tls-origination/#mutual-tls-origination-for-egress-traffic

This config does not use an egress gateway and requires the new v1.14 DestinationRule.spec.workloadSelector, but the config is far simpler than using an egress gateway and allows us to be more selective about the mTLS client cert being used (we have different pods with different client certs connecting to the same external service).
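
As a rough illustration of that pattern, modeled on the linked task and adapted to the host from this issue, a DestinationRule with a workloadSelector might look like the sketch below. The secret name client-credential is hypothetical, and the app: sleep label and mesh-internal namespace are reused from the setup above. The linked task also creates a Role and RoleBinding so the workload's service account can read the secret, and a ServiceEntry for the external host is still needed.

```yaml
# Sketch: sidecar-level mTLS origination (Istio 1.14+), no egress gateway.
# Assumption: a Secret named "client-credential" holding the client cert, key,
# and CA exists in the same namespace as the selected workload.
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: originate-mtls-for-nginx
  namespace: mesh-internal
spec:
  workloadSelector:
    matchLabels:
      app: sleep          # only these pods originate mTLS with this cert
  host: mtls.site
  trafficPolicy:
    loadBalancer:
      simple: ROUND_ROBIN
    portLevelSettings:
    - port:
        number: 443
      tls:
        mode: MUTUAL
        credentialName: client-credential   # hypothetical secret name
        sni: mtls.site
```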