istio / istio

Connect, secure, control, and observe services.
https://istio.io
Apache License 2.0

Istio wild card egress rule for Azure service bus common host: "www.servicebus.windows.net" not working #50027

Open sambitr opened 3 months ago

sambitr commented 3 months ago

Bug Description

I am trying to set an egress rule for Azure Service Bus using the wildcard egress configuration below:

apiVersion: networking.istio.io/v1alpha3
kind: ServiceEntry
metadata:
  name: sb
  namespace: demo
spec:
  hosts:
  - "www.servicebus.windows.net"
  exportTo: ["."]
  ports:
  - number: 5672
    name: ampq1
    protocol: TLS
  - number: 5671
    name: ampq2
    protocol: TLS
  - number: 443
    name: tls
    protocol: TLS
  resolution: DNS
  location: MESH_EXTERNAL
---
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: egressgateway-for-sb
  namespace: demo
spec:
  host: istio-egressgateway.istio-system.svc.cluster.local
  exportTo: ["."]
  subsets:
    - name: sb
---
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: direct-sb-through-egress-gateway
  namespace: demo
spec:
  hosts:
    - "*.servicebus.windows.net"
  exportTo: ["."]
  gateways:
  - mesh
  - istio-system/istio-egressgateway
  tls:
  - match:
    - gateways:
      - mesh
      port: 443
      sniHosts:
        - "*.servicebus.windows.net"
    route:
    - destination:
        host: istio-egressgateway.istio-system.svc.cluster.local
        subset: sb
        port:
          number: 443
      weight: 100
  - match:
    - gateways:
      - istio-system/istio-egressgateway
      port: 443
      sniHosts:
      - "*.servicebus.windows.net"
    route:
    - destination:
        host: "www.servicebus.windows.net"
        port:
          number: 443
      weight: 100

A similar setup is in place in our application namespace.

Istio-proxy sidecar logs show proper connection generation and redirection to the istio-egressgateway:

[2024-03-21T12:16:02.079Z] "- - -" 0 - - - "-" 331 0 3 - "-" "-" "-" "-" "172.0.1.11:8443" outbound|443|sb|istio-egressgateway.istio-system.svc.cluster.local 172.0.3.31:58826 94.245.88.192:443 172.0.3.31:35220 demo-dev-bus.servicebus.windows.net -

I am getting the UH response flag (no healthy upstream) in the istio-egressgateway pod logs:

[2024-03-21T12:17:15.359Z] "- - -" 0 UH - - "-" 0 0 0 - "-" "-" "-" "-" "-" outbound|443||www.servicebus.windows.net - 172.0.1.11:8443 172.0.3.31:42448 demo-dev-bus.servicebus.windows.net -

I followed the same approach for Azure Key Vault and Storage accounts. Both worked fine with hosts "www.vault.azure.net" and "www.blob.core.windows.net" respectively.

Can someone help me work out what is missing in the config that blocks only Service Bus?

Version

$ istioctl version
client version: 1.11.8
control plane version: 1.19.6
data plane version: 1.19.6 (7 proxies)

$ kubectl version
WARNING: This version information is deprecated and will be replaced with the output from kubectl version --short.  Use --output=yaml|json to get the full version.
Client Version: version.Info{Major:"1", Minor:"24", GitVersion:"v1.24.0", GitCommit:"4ce5a8954017644c5420bae81d72b09b735c21f0", GitTreeState:"clean", BuildDate:"2022-05-03T13:46:05Z", GoVersion:"go1.18.1", Compiler:"gc", Platform:"windows/amd64"}
Kustomize Version: v4.5.4
Server Version: version.Info{Major:"1", Minor:"27", GitVersion:"v1.27.7", GitCommit:"55a7e688f9220adca1c99b7903953911dd38b771", GitTreeState:"clean", BuildDate:"2023-11-03T12:18:23Z", GoVersion:"go1.20.10", Compiler:"gc", Platform:"linux/amd64"}
WARNING: version difference between client (1.24) and server (1.27) exceeds the supported minor version skew of +/-1

Additional Information

No response

keithmattix commented 3 months ago

Maybe try an HTTPS protocol for port 443?

sambitr commented 3 months ago

It makes no difference:

[2024-03-21T13:38:53.273Z] "- - -" 0 UH - - "-" 0 0 0 - "-" "-" "-" "-" "-" outbound|443||www.servicebus.windows.net - 172.0.0.39:8443 172.0.0.14:41004 demo-dev-bus.servicebus.windows.net -

apiVersion: networking.istio.io/v1alpha3
kind: ServiceEntry
metadata:
  name: sb
  namespace: demo
spec:
  hosts:
  - "www.servicebus.windows.net"
  exportTo: ["."]
  ports:
  - number: 5672
    name: ampq1
    protocol: TLS
  - number: 5671
    name: ampq2
    protocol: TLS
  - number: 443
    name: https
    protocol: HTTPS
  resolution: DNS

keithmattix commented 3 months ago

Can you turn on debug/trace logs for the client sidecar to see if you get more hints at the particular error?

sambitr commented 3 months ago

Umm, I am not sure how to do that. Do you have a link for that, so I can try it?

keithmattix commented 3 months ago

First, upgrade your istioctl so that it is compatible with your control plane (your client is 1.11.8, the control plane is 1.19.6). Then run: istioctl pc log POD_NAME --level debug

sambitr commented 3 months ago

One thing to note: I have added the Service Bus IPs to the excludeIPRange parameter, and I can see them reflected in the config map under istio-system. That means this traffic should not go through the istio-proxy sidecar at all.

But I can see the entries both in the istio-proxy sidecar container log and in the istio-egressgateway log in istio-system.
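
For reference, outbound IP exclusion can also be set per workload with the `traffic.sidecar.istio.io/excludeOutboundIPRanges` pod annotation. The fragment below is a minimal sketch, not the configuration used in this report: the Deployment name, labels, and image are hypothetical, and the CIDR is only the destination address that appears in the sidecar log above. Traffic redirection rules are set up when the pod starts, so a change like this only applies to pods created after it, which may be worth checking when excluded traffic still shows up in the sidecar log.

```yaml
# Hypothetical workload; only the annotation is the point of this sketch.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: demo-app            # hypothetical name
  namespace: demo
spec:
  replicas: 1
  selector:
    matchLabels:
      app: demo-app
  template:
    metadata:
      labels:
        app: demo-app
      annotations:
        # Outbound traffic to these CIDRs bypasses the sidecar's redirection.
        # 94.245.88.192 is the destination seen in the sidecar log above;
        # a real Service Bus namespace may resolve to other addresses.
        traffic.sidecar.istio.io/excludeOutboundIPRanges: "94.245.88.192/32"
    spec:
      containers:
      - name: app
        image: example.invalid/demo-app:latest   # hypothetical image
```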

Also, I just observed one more thing. The common host name for Service Bus, www.servicebus.windows.net, does not resolve, whereas the equivalent host names for other resources resolve to proper addresses.

skrout@Sambit MINGW64 ~
$ nslookup www.servicebus.windows.net
*** UnKnown can't find www.servicebus.windows.net: Non-existent domain
Server:  UnKnown
Address:  10.1.0.4

skrout@Sambit MINGW64 ~
$ nslookup www.vault.azure.net
Non-authoritative answer:
Server:  UnKnown
Address:  10.1.0.4

Name:    azkms-prod-eus-a.eastus.cloudapp.azure.com
Address:  20.62.134.229
Aliases:  www.vault.azure.net
          data-prod-eus.vaultcore.azure.net
          data-prod-eus-region.vaultcore.azure.net

I was just checking the istio-egressgateway pod logs and see the same difference there too:

In the Key Vault case the common host name resolves to an upstream address, but in the Service Bus case it does not:

[2024-03-22T08:26:55.347Z] "- - -" 0 UH - - "-" 0 0 0 - "-" "-" "-" "-" "-" outbound|443||www.servicebus.windows.net - 172.0.0.41:8443 172.0.2.39:54428 dts-demo-dev-bus.servicebus.windows.net -

[2024-03-22T08:35:01.017Z] "- - -" 0 - - - "-" 1254 8139 353 - "-" "-" "-" "-" "20.62.134.229:443" outbound|443||www.vault.azure.net 172.0.0.41:58384 172.0.0.41:8443 172.0.2.39:33964 dts-demo-dev-kv.vault.azure.net -
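
The NXDOMAIN result above is consistent with the UH flag: with resolution: DNS, the gateway's upstream cluster gets its endpoints by resolving the ServiceEntry host, so a host that does not resolve leaves the cluster with no endpoints. One way to test that reading, as a sketch rather than a confirmed fix, is to point the ServiceEntry at a host name that does resolve, such as the namespace-specific name from the logs. The resource name sb-concrete is hypothetical, the other fields mirror the original ServiceEntry, and the egress gateway route in the VirtualService would need to target the same host.

```yaml
apiVersion: networking.istio.io/v1alpha3
kind: ServiceEntry
metadata:
  name: sb-concrete          # hypothetical name
  namespace: demo
spec:
  hosts:
  # Host taken from the access logs above; assumed to resolve in DNS,
  # unlike www.servicebus.windows.net.
  - "demo-dev-bus.servicebus.windows.net"
  exportTo: ["."]
  ports:
  - number: 443
    name: tls
    protocol: TLS
  resolution: DNS
  location: MESH_EXTERNAL
```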

MengjiaLiang commented 1 month ago

I think you should try to use TLS origination - https://istio.io/latest/docs/tasks/traffic-management/egress/egress-tls-origination/

Based on my limited understanding, egress is not like ingress.

For ingress, you define the gateway on 443 and provide the TLS cert and private key because you own the domain, so the ingress gateway can do TLS termination.

For egress, the domain is owned by someone else. The egress gateway cannot do TLS termination because you cannot provide the TLS cert and key.

When your workload talks to the outside over HTTPS normally, the sidecar basically passes the traffic through and forwards it to the mesh-external (public) network, since it is not in gateway mode. When the request is redirected to the egress gateway, the gateway cannot intercept it, for the reason mentioned above.

Ideally, the workflow would be: Workload -----http-----> Egress Gateway -----https-----> www.vault.azure.net
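
The TLS origination step in that workflow is usually expressed as a DestinationRule for the external host. The fragment below is a minimal sketch of only that piece, assuming the namespace-specific host from the logs and a hypothetical resource name. It is not a complete setup: the workload would have to send plain HTTP, the VirtualService would need http routes instead of tls routes, and whether a Service Bus client can work that way is not established in this thread.

```yaml
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: originate-tls-for-sb   # hypothetical name
  namespace: demo
spec:
  # Host taken from the access logs above (assumption for this sketch).
  host: demo-dev-bus.servicebus.windows.net
  trafficPolicy:
    portLevelSettings:
    - port:
        number: 443
      tls:
        mode: SIMPLE   # the proxy originates TLS toward the external host
        sni: demo-dev-bus.servicebus.windows.net
```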

istio-policy-bot commented 1 week ago

🧭 This issue or pull request has been automatically marked as stale because it has not had activity from an Istio team member since 2024-03-21. It will be closed on 2024-07-04 unless an Istio team member takes action. Please see this wiki page for more information. Thank you for your contributions.

Created by the issue and PR lifecycle manager.