Closed kupmanyu closed 8 months ago
Seems a bit similar to a recent fix https://github.com/istio/istio/commit/298b831de609af33a13762158d3e5e3c805600f6 -- this was about missing endpoints in the cluster though, not the entire cluster missing. In any event, I suspect it's a merging issue around the ServiceEntries; need to look into this. cc @hzxuzhonghu
BTW, why do you use a ServiceEntry to point to a Service?
Can you check all your SEs with `keycloak-http.idp.svc.cluster.local`, and do you also have a k8s service with `keycloak-http`?
I suspect this has something to do with service merge.
> BTW, why do you use a ServiceEntry to point to a Service?

We have a `ServiceEntry` pointing to a Service as we run Keycloak outside the mesh and have the outbound traffic policy set to `REGISTRY_ONLY`. Without creating a `ServiceEntry`, we are routed to `BlackHoleCluster` when we try to access Keycloak from any pod in the mesh.
> can you check all your SEs with `keycloak-http.idp.svc.cluster.local`, and do you also have a k8s service with `keycloak-http`

Yeah, we have a k8s service with the name `keycloak-http`.
The only SEs we have for `keycloak-http.idp.svc.cluster.local` are the ones in the `istio-ingress` ns:
$ k get se -A | rg "keycloak-http"
istio-ingress keycloak-ext ["keycloak-http.idp.svc.cluster.local"] MESH_EXTERNAL DNS 6d19h
istio-ingress keycloak-http-ext ["keycloak-http.idp.svc.cluster.local"] MESH_EXTERNAL DNS 6d19h
Here is the yaml for it:
apiVersion: networking.istio.io/v1beta1
kind: ServiceEntry
metadata:
  name: keycloak-ext
  namespace: istio-ingress
spec:
  exportTo:
  - '*'
  hosts:
  - keycloak-http.idp.svc.cluster.local
  location: MESH_EXTERNAL
  ports:
  - name: tcp-keycloak-port
    number: 8443
    protocol: TCP
  resolution: DNS
---
apiVersion: networking.istio.io/v1beta1
kind: ServiceEntry
metadata:
  name: keycloak-http-ext
  namespace: istio-ingress
spec:
  exportTo:
  - '*'
  hosts:
  - keycloak-http.idp.svc.cluster.local
  location: MESH_EXTERNAL
  ports:
  - name: http
    number: 80
    protocol: HTTP
  resolution: DNS
Here is the `keycloak-http` k8s service:
apiVersion: v1
kind: Service
metadata:
  name: keycloak-http
  namespace: idp
spec:
  type: ClusterIP
  ports:
  - name: http
    port: 80
    targetPort: http
    protocol: TCP
  - name: https
    port: 8443
    targetPort: https
    protocol: TCP
  selector:
    app.kubernetes.io/instance: keycloak
Let me know if you need any more details regarding this.
> We have a `ServiceEntry` pointing to a Service as we run Keycloak outside the mesh and have the outbound traffic policy set to `REGISTRY_ONLY`. Without creating a `ServiceEntry`, we are routed to `BlackHoleCluster` when we try to access Keycloak from any pod in the mesh

This doesn't sound right... maybe you are just filtering it with a Sidecar object. The service doesn't need to be "in the mesh" (I assume this means sidecar injected?) to be reachable.
If it's desired then ok, but it seems unexpected and will give undesirable behavior.
> This doesn't sound right... maybe you are just filtering it with a Sidecar object. The service doesn't need to be "in the mesh" (I assume this means sidecar injected?) to be reachable.
> If it's desired then ok, but it seems unexpected and will give undesirable behavior.

Yeah, by "in the mesh" I meant sidecar injected. Sorry if that wasn't clear.
Strangely, that is not the behavior we observed. We have added the namespace that has Keycloak running in it to the `Sidecar` object present in the `shared` namespace (where the request originates from), but we are still routed to `BlackHoleCluster` until we add the `ServiceEntry`.
Here is the `Sidecar` object we have present in the namespace:
apiVersion: networking.istio.io/v1beta1
kind: Sidecar
metadata:
  name: shared-sidecar-ns-wide
  namespace: shared
spec:
  egress:
  - hosts:
    - istio-ingress/*
    - istio-system/*
    - idp/*
    - shared/*
The sidecar namespace is "shared" not "client" which may be why?
Whether the destination is sidecar injected is not relevant to whether it can be reached
Ah sorry, I meant the `shared` namespace and accidentally wrote `client`. I'll update the comment.
I see. Checking `istioctl pc c` from those pods may help to see what is visible.
There are actually a few possible controls beyond even Sidecar: `discoverySelectors` and `exportTo`. Either could be the cause... or something more obscure.
Ah, I think I got it, the cause does seem to be `discoverySelectors`. We have the below selector in our Istiod config:
discoverySelectors:
- matchLabels:
    istio-discovery: enabled
The label isn't added to the `idp` namespace.
After adding it, I am able to access the service without needing an SE.
Thanks for the help :)
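For anyone hitting the same thing, here is a rough sketch of the `idp` namespace carrying the label that the `discoverySelectors` entry above matches on. It assumes the namespace is managed declaratively; any other labels or annotations it already has are omitted here and would need to be kept:

```yaml
# Sketch only: the idp namespace with the label istiod's discoverySelectors match on.
# The label key and value are taken from the Istiod config quoted in this thread.
apiVersion: v1
kind: Namespace
metadata:
  name: idp
  labels:
    istio-discovery: enabled
```

With the namespace selected for discovery, istiod sees the `keycloak-http` Service directly, which matches the observation above that the SE is no longer needed.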
NP. I think the original issue is still valid so we should keep this open.
Fwiw it could also probably be fixed by merging the 2 SEs into a single one.
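A minimal sketch of what merging the two SEs might look like, assuming both ports can be declared on one resource. The name `keycloak-merged-ext` is hypothetical, and whether this actually avoids the missing cluster is only the suggestion above, not something verified in this thread:

```yaml
# Sketch only: the two ServiceEntries from this issue combined into one.
# "keycloak-merged-ext" is a made-up name; all other values are copied from the originals.
apiVersion: networking.istio.io/v1beta1
kind: ServiceEntry
metadata:
  name: keycloak-merged-ext
  namespace: istio-ingress
spec:
  exportTo:
  - '*'
  hosts:
  - keycloak-http.idp.svc.cluster.local
  location: MESH_EXTERNAL
  ports:
  - name: http
    number: 80
    protocol: HTTP
  - name: tcp-keycloak-port
    number: 8443
    protocol: TCP
  resolution: DNS
```

If this is applied, the two original entries (`keycloak-ext` and `keycloak-http-ext`) would presumably need to be deleted so that only one SE declares the host.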
@kupmanyu please check if there is a log similar to `STRICT_DNS cluster without endpoints outbound|443||www.google.co.uk found while pushing CDS` in istiod. But if this is the cause, you should always get a `503 NC cluster_not_found` error.
I can see there is a log similar to that present in istiod logs:
$ k logs istiod-1-20-3-564c468b4-txsmr discovery | rg "STRICT_DNS cluster"
"message": "STRICT_DNS cluster without endpoints outbound|80||keycloak-http.ipd.svc.cluster.local found while pushing CDS"
"message": "STRICT_DNS cluster without endpoints outbound|80||keycloak-http.ipd.svc.cluster.local found while pushing CDS"
"message": "STRICT_DNS cluster without endpoints outbound|80||keycloak-http.ipd.svc.cluster.local found while pushing CDS"
FYI https://github.com/istio/istio/issues/49489, this is fixed
> FYI #49489, this is fixed

Thanks for the update.
When is the expected patch release with this fix in `1.20`?
Not sure when the next release is.
+1
@hzxuzhonghu Is there any timeline for the next release that includes this fix?
@hanxiaop When is the next 1.20 patch release?
> @hanxiaop When is the next 1.20 patch release?

Should be next Tuesday if the e2e tests, which are currently running, pass.
Thanks for the update @hanxiaop and @hzxuzhonghu
@hanxiaop I see that version 1.21 got released. Is there a target date for the 1.20.x patch release?
@lkalaivanan Will be today or tomorrow. The patch versions were scheduled to be released after 1.21's release.
Is this the right place to submit this?
Bug Description
With no changes made to `istio` config, we observed different behaviour on different days. A request that was working on `2024-03-01` now returns a `503 NC cluster_not_found` error on `2024-03-04`.
Before
After
The only action taken was that the cluster was scaled down to 0 nodes during the weekend and then scaled up again afterwards.
Version
Additional Information
Configuration
Istiod Config
```yaml
accessLogFile: /dev/stdout
defaultConfig:
  discoveryAddress: istiod-1-20-3.istio-system.svc:15012
  holdApplicationUntilProxyStarts: true
  image:
    imageType: distroless
  proxyMetadata:
    ISTIO_META_DNS_AUTO_ALLOCATE: "true"
    ISTIO_META_DNS_CAPTURE: "true"
defaultProviders:
  metrics:
  - prometheus
discoverySelectors:
- matchLabels:
    istio-discovery: enabled
enablePrometheusMerge: true
outboundTrafficPolicy:
  mode: REGISTRY_ONLY
rootNamespace: istio-system
trustDomain: cluster.local
```
Service Entries
```yaml
apiVersion: networking.istio.io/v1beta1
kind: ServiceEntry
metadata:
  name: keycloak-ext
  namespace: istio-ingress
spec:
  exportTo:
  - '*'
  hosts:
  - keycloak-http.idp.svc.cluster.local
  location: MESH_EXTERNAL
  ports:
  - name: tcp-keycloak-port
    number: 8443
    protocol: TCP
  resolution: DNS
---
apiVersion: networking.istio.io/v1beta1
kind: ServiceEntry
metadata:
  name: keycloak-http-ext
  namespace: istio-ingress
spec:
  exportTo:
  - '*'
  hosts:
  - keycloak-http.idp.svc.cluster.local
  location: MESH_EXTERNAL
  ports:
  - name: http
    number: 80
    protocol: HTTP
  resolution: DNS
```
Port 80 Listener for Pod
```json
[ { "name": "0.0.0.0_80", "address": { "socketAddress": { "address": "0.0.0.0", "portValue": 80 } }, "filterChains": [ { "filterChainMatch": { "transportProtocol": "raw_buffer", "applicationProtocols": [ "http/1.1", "h2c" ] }, "filters": [ { "name": "envoy.filters.network.http_connection_manager", "typedConfig": { "@type": "type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager", "statPrefix": "outbound_0.0.0.0_80", "rds": { "configSource": { "ads": {}, "initialFetchTimeout": "0s", "resourceApiVersion": "V3" }, "routeConfigName": "80" }, "httpFilters": [ { "name": "istio.metadata_exchange", "typedConfig": { "@type": "type.googleapis.com/udpa.type.v1.TypedStruct", "typeUrl": "type.googleapis.com/io.istio.http.peer_metadata.Config", "value": { "upstream_discovery": [ { "istio_headers": {} }, { "workload_discovery": {} } ],
"upstream_propagation": [ { "istio_headers": {} } ] } } }, { "name": "envoy.filters.http.grpc_stats", "typedConfig": { "@type": "type.googleapis.com/envoy.extensions.filters.http.grpc_stats.v3.FilterConfig", "emitFilterState": true, "statsForAllMethods": false } }, { "name": "istio.alpn", "typedConfig": { "@type": "type.googleapis.com/istio.envoy.config.filter.http.alpn.v2alpha1.FilterConfig", "alpnOverride": [ { "alpnOverride": [ "istio-http/1.0", "istio", "http/1.0" ] }, { "upstreamProtocol": "HTTP11", "alpnOverride": [ "istio-http/1.1", "istio", "http/1.1" ] }, { "upstreamProtocol": "HTTP2", "alpnOverride": [ "istio-h2", "istio", "h2" ] } ] } }, { "name": "envoy.filters.http.fault", "typedConfig": { "@type": "type.googleapis.com/envoy.extensions.filters.http.fault.v3.HTTPFault" } }, { "name": "envoy.filters.http.cors", "typedConfig": { "@type": "type.googleapis.com/envoy.extensions.filters.http.cors.v3.Cors" } }, { "name": "istio.stats", "typedConfig": { "@type": "type.googleapis.com/stats.PluginConfig" } }, { "name": "envoy.filters.http.router", "typedConfig": { "@type": "type.googleapis.com/envoy.extensions.filters.http.router.v3.Router" } } ], "tracing": { "clientSampling": { "value": 100 }, "randomSampling": { "value": 10 }, "overallSampling": { "value": 100 }, "customTags": [ { "tag": "istio.authorization.dry_run.allow_policy.name", "metadata": { "kind": { "request": {} }, "metadataKey": { "key": "envoy.filters.http.rbac", "path": [ { "key": "istio_dry_run_allow_shadow_effective_policy_id" } ] } } }, { "tag": "istio.authorization.dry_run.allow_policy.result", "metadata": { "kind": { "request": {} }, "metadataKey": { "key": "envoy.filters.http.rbac", "path": [ { "key": "istio_dry_run_allow_shadow_engine_result" } ] } } }, { "tag": "istio.authorization.dry_run.deny_policy.name", "metadata": { "kind": { "request": {} }, "metadataKey": { "key": "envoy.filters.http.rbac", "path": [ { "key": "istio_dry_run_deny_shadow_effective_policy_id" } ] } } }, { "tag":
"istio.authorization.dry_run.deny_policy.result", "metadata": { "kind": { "request": {} }, "metadataKey": { "key": "envoy.filters.http.rbac", "path": [ { "key": "istio_dry_run_deny_shadow_engine_result" } ] } } }, { "tag": "istio.canonical_revision", "literal": { "value": "latest" } }, { "tag": "istio.canonical_service", "literal": { "value": "busybox" } }, { "tag": "istio.mesh_id", "literal": { "value": "cluster.local" } }, { "tag": "istio.namespace", "literal": { "value": "shared" } } ] }, "streamIdleTimeout": "0s", "accessLog": [ { "name": "envoy.access_loggers.file", "typedConfig": { "@type": "type.googleapis.com/envoy.extensions.access_loggers.file.v3.FileAccessLog", "path": "/dev/stdout", "logFormat": { "textFormatSource": { "inlineString": "[%START_TIME%] \"%REQ(:METHOD)% %REQ(X-ENVOY-ORIGINAL-PATH?:PATH)% %PROTOCOL%\" %RESPONSE_CODE% %RESPONSE_FLAGS% %RESPONSE_CODE_DETAILS% %CONNECTION_TERMINATION_DETAILS% \"%UPSTREAM_TRANSPORT_FAILURE_REASON%\" %BYTES_RECEIVED% %BYTES_SENT% %DURATION% %RESP(X-ENVOY-UPSTREAM-SERVICE-TIME)% \"%REQ(X-FORWARDED-FOR)%\" \"%REQ(USER-AGENT)%\" \"%REQ(X-REQUEST-ID)%\" \"%REQ(:AUTHORITY)%\" \"%UPSTREAM_HOST%\" %UPSTREAM_CLUSTER% %UPSTREAM_LOCAL_ADDRESS% %DOWNSTREAM_LOCAL_ADDRESS% %DOWNSTREAM_REMOTE_ADDRESS% %REQUESTED_SERVER_NAME% %ROUTE_NAME%\n" } } } } ], "useRemoteAddress": false, "upgradeConfigs": [ { "upgradeType": "websocket" } ], "normalizePath": true, "pathWithEscapedSlashesAction": "KEEP_UNCHANGED", "requestIdExtension": { "typedConfig": { "@type": "type.googleapis.com/envoy.extensions.request_id.uuid.v3.UuidRequestIdConfig", "useRequestIdForTraceSampling": true } } } } ] } ], "defaultFilterChain": { "filterChainMatch": {}, "filters": [ { "name": "istio.stats", "typedConfig": { "@type": "type.googleapis.com/stats.PluginConfig" } }, { "name": "envoy.filters.network.tcp_proxy", "typedConfig": { "@type": "type.googleapis.com/envoy.extensions.filters.network.tcp_proxy.v3.TcpProxy", "statPrefix": "BlackHoleCluster", "cluster":
"BlackHoleCluster", "accessLog": [ { "name": "envoy.access_loggers.file", "typedConfig": { "@type": "type.googleapis.com/envoy.extensions.access_loggers.file.v3.FileAccessLog", "path": "/dev/stdout", "logFormat": { "textFormatSource": { "inlineString": "[%START_TIME%] \"%REQ(:METHOD)% %REQ(X-ENVOY-ORIGINAL-PATH?:PATH)% %PROTOCOL%\" %RESPONSE_CODE% %RESPONSE_FLAGS% %RESPONSE_CODE_DETAILS% %CONNECTION_TERMINATION_DETAILS% \"%UPSTREAM_TRANSPORT_FAILURE_REASON%\" %BYTES_RECEIVED% %BYTES_SENT% %DURATION% %RESP(X-ENVOY-UPSTREAM-SERVICE-TIME)% \"%REQ(X-FORWARDED-FOR)%\" \"%REQ(USER-AGENT)%\" \"%REQ(X-REQUEST-ID)%\" \"%REQ(:AUTHORITY)%\" \"%UPSTREAM_HOST%\" %UPSTREAM_CLUSTER% %UPSTREAM_LOCAL_ADDRESS% %DOWNSTREAM_LOCAL_ADDRESS% %DOWNSTREAM_REMOTE_ADDRESS% %REQUESTED_SERVER_NAME% %ROUTE_NAME%\n" } } } } ] } } ], "name": "PassthroughFilterChain" }, "listenerFilters": [ { "name": "envoy.filters.listener.tls_inspector", "typedConfig": { "@type": "type.googleapis.com/envoy.extensions.filters.listener.tls_inspector.v3.TlsInspector" } }, { "name": "envoy.filters.listener.http_inspector", "typedConfig": { "@type": "type.googleapis.com/envoy.extensions.filters.listener.http_inspector.v3.HttpInspector" } } ], "listenerFiltersTimeout": "0s", "continueOnListenerFiltersTimeout": true, "trafficDirection": "OUTBOUND", "bindToPort": false } ]
```
Port 8443 Listener
```json
[ { "name": "240.240.52.207_8443", "address": { "socketAddress": { "address": "240.240.52.207", "portValue": 8443 } }, "filterChains": [ { "filters": [ { "name": "istio.stats", "typedConfig": { "@type": "type.googleapis.com/stats.PluginConfig" } }, { "name": "envoy.filters.network.tcp_proxy", "typedConfig": { "@type": "type.googleapis.com/envoy.extensions.filters.network.tcp_proxy.v3.TcpProxy", "statPrefix": "outbound|8443||keycloak-http.idp.svc.cluster.local", "cluster": "outbound|8443||keycloak-http.idp.svc.cluster.local", "accessLog": [ { "name": "envoy.access_loggers.file", "typedConfig": { "@type": "type.googleapis.com/envoy.extensions.access_loggers.file.v3.FileAccessLog", "path": "/dev/stdout", "logFormat": { "textFormatSource": { "inlineString": "[%START_TIME%] \"%REQ(:METHOD)% %REQ(X-ENVOY-ORIGINAL-PATH?:PATH)% %PROTOCOL%\" %RESPONSE_CODE% %RESPONSE_FLAGS% %RESPONSE_CODE_DETAILS% %CONNECTION_TERMINATION_DETAILS% \"%UPSTREAM_TRANSPORT_FAILURE_REASON%\" %BYTES_RECEIVED% %BYTES_SENT% %DURATION% %RESP(X-ENVOY-UPSTREAM-SERVICE-TIME)% \"%REQ(X-FORWARDED-FOR)%\" \"%REQ(USER-AGENT)%\" \"%REQ(X-REQUEST-ID)%\" \"%REQ(:AUTHORITY)%\" \"%UPSTREAM_HOST%\" %UPSTREAM_CLUSTER% %UPSTREAM_LOCAL_ADDRESS% %DOWNSTREAM_LOCAL_ADDRESS% %DOWNSTREAM_REMOTE_ADDRESS% %REQUESTED_SERVER_NAME% %ROUTE_NAME%\n" } } } } ] } } ] } ], "listenerFiltersTimeout": "0s", "continueOnListenerFiltersTimeout": true, "trafficDirection": "OUTBOUND", "bindToPort": false } ]
```
Route to Service on Port 80
```json
{ "name": "keycloak-http.idp.svc.cluster.local:80", "domains": [ "keycloak-http.idp.svc.cluster.local", "keycloak-http.idp", "keycloak-http.idp.svc", "240.240.52.207" ], "routes": [ { "name": "default", "match": { "prefix": "/" }, "route": { "cluster": "outbound|80||keycloak-http.idp.svc.cluster.local", "timeout": "0s", "retryPolicy": { "retryOn": "connect-failure,refused-stream,unavailable,cancelled,retriable-status-codes", "numRetries": 2, "retryHostPredicate": [ { "name": "envoy.retry_host_predicates.previous_hosts", "typedConfig": { "@type": "type.googleapis.com/envoy.extensions.retry.host.previous_hosts.v3.PreviousHostsPredicate" } } ], "hostSelectionRetryMaxAttempts": "5", "retriableStatusCodes": [ 503 ] }, "maxGrpcTimeout": "0s" }, "decorator": { "operation": "keycloak-http.idp.svc.cluster.local:80/*" } } ], "includeRequestAttemptCount": true }
```
No Cluster for Service on Port 80
```shell
$ istioctl proxy-config cluster busybox --direction outbound --port 80 --fqdn "keycloak-http.idp.svc.cluster.local" -o json
[]
```
Cluster for Service on Port 8443
```shell
$ istioctl proxy-config cluster busybox --direction outbound --port 8443 --fqdn "keycloak-http.idp.svc.cluster.local" -o json
[ { "name": "outbound|8443||keycloak-http.idp.svc.cluster.local", "type": "STRICT_DNS", "connectTimeout": "10s", "lbPolicy": "LEAST_REQUEST", "loadAssignment": { "clusterName": "outbound|8443||keycloak-http.idp.svc.cluster.local", "endpoints": [ { "locality": {}, "lbEndpoints": [ { "endpoint": { "address": { "socketAddress": { "address": "keycloak-http.idp.svc.cluster.local", "portValue": 8443 } } }, "metadata": { "filterMetadata": { "istio": { "workload": ";;;;" } } }, "loadBalancingWeight": 1 } ], "loadBalancingWeight": 1 } ] }, "circuitBreakers": { "thresholds": [ { "maxConnections": 4294967295, "maxPendingRequests": 4294967295, "maxRequests": 4294967295, "maxRetries": 4294967295, "trackRemaining": true } ] }, "dnsRefreshRate": "60s", "respectDnsTtl": true, "dnsLookupFamily": "V4_ONLY", "commonLbConfig": { "localityWeightedLbConfig": {} }, "metadata": { "filterMetadata": { "istio": { "external": true, "services": [ { "host": "keycloak-http.idp.svc.cluster.local", "name": "keycloak-http.idp.svc.cluster.local", "namespace": "istio-ingress" } ] } } }, "filters": [ { "name": "istio.metadata_exchange", "typedConfig": { "@type": "type.googleapis.com/udpa.type.v1.TypedStruct", "typeUrl": "type.googleapis.com/envoy.tcp.metadataexchange.config.MetadataExchange", "value": { "enable_discovery": true, "protocol": "istio-peer-exchange" } } } ] } ]
```