kumahq / kuma

🐻 The multi-zone service mesh for containers, Kubernetes and VMs. Built with Envoy. CNCF Sandbox Project.
https://kuma.io/install
Apache License 2.0

ExternalServices from different zones doesn't work on ZoneEgress #6391

Closed lobkovilya closed 1 week ago

lobkovilya commented 1 year ago

There are 2 reasons why it doesn't work:

1. Mixing up hostnames and IPs in the Envoy cluster

If we have 2 external services:

type: ExternalService
name: es-1
networking:
  address: zone-1.external.service:80
tags:
  kuma.io/service: es
  kuma.io/zone: zone-1
---
type: ExternalService
name: es-2
networking:
  address: zone-2.external.service:80
tags:
  kuma.io/service: es
  kuma.io/zone: zone-2

For the ZoneEgress in zone-1, Kuma creates an EDS cluster es that contains 2 endpoints: zone-1.external.service:80 and <zone-ingress-in-zone-2-IP>. Envoy can't resolve hostnames in EDS clusters, so the configuration can't be applied to Envoy.
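
To make the problem concrete, here is a rough sketch of what the endpoint assignment for that cluster would look like. It is illustrative only and not copied from a real config dump: the ZoneIngress IP and port are hypothetical placeholders. Envoy requires IP addresses for endpoints of EDS (and STATIC) clusters; only STRICT_DNS and LOGICAL_DNS clusters resolve hostnames.

```yaml
# Sketch of the ClusterLoadAssignment for cluster "es" on the zone-1 ZoneEgress.
# Values marked "hypothetical" are assumptions, not taken from this issue.
cluster_name: es
endpoints:
- lb_endpoints:
  - endpoint:
      address:
        socket_address:
          address: zone-1.external.service  # hostname: not valid in an EDS cluster
          port_value: 80
  - endpoint:
      address:
        socket_address:
          address: 10.0.0.2                 # hypothetical ZoneIngress IP in zone-2
          port_value: 10001                 # hypothetical ZoneIngress port
```

Because a single cluster has one discovery type, the hostname endpoint and the IP endpoint can't both be served from the same EDS cluster.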

2. Missing TLS termination before sending a request to ExternalService

Even if we replace the hostnames in the previous example with IP addresses, it still doesn't work:

type: ExternalService
name: es-1
networking:
  address: 192.168.0.1:80
tags:
  kuma.io/service: es
  kuma.io/zone: zone-1
---
type: ExternalService
name: es-2
networking:
  address: 192.168.0.2:80
tags:
  kuma.io/service: es
  kuma.io/zone: zone-2

For the ZoneEgress in zone-1, Kuma creates a FilterChain that doesn't terminate TLS:

filter_chains:
- filter_chain_match:
    transport_protocol: tls
    server_names:
    - es{mesh=mesh-1}
  filters:
  - name: envoy.filters.network.tcp_proxy
    typed_config:
      "@type": type.googleapis.com/envoy.extensions.filters.network.tcp_proxy.v3.TcpProxy
      stat_prefix: mesh-1_es
      cluster: mesh-1:es
      metadata_match:
        filter_metadata:
          envoy.lb:
            mesh: mesh-1

But if we remove es-2, the FilterChain for es-1 is correct:

filter_chains:
- filter_chain_match:
    transport_protocol: tls
    server_names:
    - es{mesh=mesh-1}
  filters:
  - name: envoy.filters.network.rbac
    typed_config: "..."
  - name: envoy.filters.network.http_connection_manager
    typed_config:
      "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
      stat_prefix: es
      route_config:
        name: outbound:es
        virtual_hosts:
        - name: es
          domains:
          - "*"
          routes:
          - match:
              prefix: "/"
            route:
              cluster: mesh-1:es
              timeout: 0s
        validate_clusters: false
      http_filters:
      - name: envoy.filters.http.router
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router
  transport_socket:
    name: envoy.transport_sockets.tls
    typed_config:
      "@type": type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.DownstreamTlsContext
      common_tls_context:
        tls_certificate_sds_secret_configs:
        - name: identity_cert:secret:mesh-1
          sds_config:
            ads: {}
            resource_api_version: V3
        combined_validation_context:
          default_validation_context:
            match_typed_subject_alt_names:
            - san_type: URI
              matcher:
                prefix: spiffe://mesh-1/
          validation_context_sds_secret_config:
            name: mesh_ca:secret:mesh-1
            sds_config:
              ads: {}
              resource_api_version: V3
      require_client_certificate: true
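
For quick comparison, the two filter chains above can be condensed to the fields that differ. This is only a summary of the dumps in this comment, with everything else omitted.

```yaml
# es-1 and es-2 both defined (broken): the TLS stream is proxied as-is
filters:
- name: envoy.filters.network.tcp_proxy
# no transport_socket, so mesh mTLS is never terminated on the ZoneEgress
---
# only es-1 defined (correct): mTLS is terminated, then HTTP is routed
filters:
- name: envoy.filters.network.rbac
- name: envoy.filters.network.http_connection_manager
transport_socket:
  name: envoy.transport_sockets.tls  # DownstreamTlsContext with require_client_certificate: true
```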

github-actions[bot] commented 1 year ago

This issue was inactive for 90 days. It will be reviewed in the next triage meeting and might be closed. If you think this issue is still relevant, please comment on it or attend the next triage meeting.

lahabana commented 1 year ago

@lobkovilya and @jakubdyszkiewicz is this still a thing with the recent changes to zoneEgress/zoneIngress?

lobkovilya commented 1 year ago

As far as I know, it's still an issue. I don't think the EDS resolver or TLS termination part was changed recently.

lahabana commented 1 week ago

Not to be fixed; we'll focus on MeshExternalService instead.