Closed chandan9778 closed 2 years ago
This looks like it could be related to this nginx ingress controller bug that was resolved in this nginx ingress controller PR. This fix is included in nginx ingress controller v1.1.1.
Can you look at your nginx ingress controller logs? Do you see SSL certificate expired errors?
@trstringer Thanks for your response, but I am currently using nginx ingress controller v1.1.1.
Can you provide nginx ingress controller error log entries pertaining to these failures?
@chandan9778 Have you looked at https://release-v1-0.docs.openservicemesh.io/docs/demos/ingress_k8s_nginx/? If not, please take a look to confirm you have applied the ingress configurations correctly.
@trstringer 796 peer closed connection in SSL handshake (104: Connection reset by peer) while SSL handshaking to upstream, client: xx.xx.xx.xx(ip), server: xyz.dns.com, request: "GET /podname HTTP/2.0", upstream: "https://xx.xx.xx.xx:8080/", host: "xyz.dns.com"
@shashankram Yes, I applied the configuration correctly as per the docs. I get the error only when enabling OSM for any ingress-based application.
@chandan9778 Please share the following YAML configurations (redact info where necessary):
@shashankram please find the above-mentioned YAMLs for your reference.
##INGRESSBACKEND.YAML###
apiVersion: policy.openservicemesh.io/v1alpha1
kind: IngressBackend
metadata:
  name: xyz
  namespace: namespace-name
spec:
  backends:
  - name: xyz-service
    port:
      number: 8080 # targetPort of httpbin service
      protocol: https
    tls:
      skipClientCertValidation: false
  sources:
  - kind: Service
    name: ingress-nginx-controller
    namespace: ingress_namespace
  - kind: AuthenticatedPrincipal
    name: ingress-nginx.ingress_namespace.cluster.local
###SERVICE.YAML####
apiVersion: v1
kind: Service
metadata:
  name: xyz
  namespace: namespace_name
spec:
  type: ClusterIP
  ports:
  - port: 80
    targetPort: 8080
  selector:
    k8s-app: pod_selector_name
###INGRESS.YAML###
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: xyz-ingress
  namespace: nmaesppace_name
  annotations:
    kubernetes.io/ingress.class: nginx
    nginx.ingress.kubernetes.io/force-ssl-redirect: "false"
    nginx.ingress.kubernetes.io/rewrite-target: /$2
    nginx.ingress.kubernetes.io/use-regex: "true"
    ingress.kubernetes.io/tls-minimum-version: "1.2"
    kubernetes.io/ingress.allow-http: "false"
    nginx.ingress.kubernetes.io/backend-protocol: HTTPS
    nginx.ingress.kubernetes.io/configuration-snippet: |
      proxy_ssl_name "default.application_specific_namesapce_name.cluster.local";
    nginx.ingress.kubernetes.io/proxy-ssl-secret: kube-system/osm-nginx-client-cert
    nginx.ingress.kubernetes.io/proxy-ssl-verify: on
spec:
  tls:
  - hosts:
    - dummyingresshostname
    secretName: dummyingresssecretname
  rules:
  - host: dummyingresshostname
    http:
      paths:
      - path: /podname(/|$)(.*)
        backend:
          serviceName: service_name
          servicePort: 80
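For readers following along, here is a rough sketch of how the `path` regex and the `rewrite-target: /$2` annotation in the Ingress above interact. It only imitates the capture-group substitution in Python; the function name `rewrite` is hypothetical, and the case-insensitive match is an assumption about how the controller renders the location block, not something confirmed in this thread.

```python
import re
from typing import Optional

# Path pattern copied from the Ingress rule above. Case-insensitive matching
# is an assumption about the generated nginx configuration.
PATH_PATTERN = re.compile(r"^/podname(/|$)(.*)", re.IGNORECASE)


def rewrite(path: str) -> Optional[str]:
    """Return the path forwarded upstream, or None if the rule does not match."""
    match = PATH_PATTERN.match(path)
    if match is None:
        return None
    # rewrite-target "/$2": a slash followed by the second capture group.
    return "/" + match.group(2)


for request_path in ("/podname", "/podname/eventhandler", "/other"):
    print(request_path, "->", rewrite(request_path))
```

Under these assumptions, a request for `/podname` reaches the backend as `/`, which is consistent with the `"path":"/"` entries in the Envoy access log shared later in the thread.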
@chandan9778 thanks, could you kindly format the snippet, see https://docs.github.com/en/get-started/writing-on-github/getting-started-with-writing-and-formatting-on-github/basic-writing-and-formatting-syntax#quoting-code
Also would you mind sharing the MeshConfig YAML: `kubectl get meshconfig osm-mesh-config -n <osm namespace> -o yaml`
@shashankram I have reformatted the above code snippet as requested. Please also find the osm-mesh-config YAML below.
apiVersion: config.openservicemesh.io/v1alpha1
kind: MeshConfig
metadata:
  creationTimestamp: "2022-03-08T11:32:10Z"
  generation: 5
  name: osm-mesh-config
  namespace: kube-system
  resourceVersion: "xxx"
  uid: xxx
spec:
  certificate:
    certKeyBitSize: 2048
    ingressGateway:
      secret:
        name: dummy-secret
        namespace: kube-system
      subjectAltNames:
      - ingress-nginx.ingress_namespace_name-ns.cluster.local
      validityDuration: 24h
    serviceCertValidityDuration: 24h
  featureFlags:
    enableAsyncProxyServiceMapping: false
    enableEgressPolicy: true
    enableEnvoyActiveHealthChecks: false
    enableIngressBackendPolicy: true
    enableMulticlusterMode: false
    enableRetryPolicy: false
    enableSnapshotCacheMode: false
    enableWASMStats: true
  observability:
    enableDebugServer: true
    osmLogLevel: info
    tracing:
      enable: false
  sidecar:
    configResyncInterval: 0s
    enablePrivilegedInitContainer: false
    logLevel: error
    resources: {}
  traffic:
    enableEgress: true
    enablePermissiveTrafficPolicyMode: false
    inboundExternalAuthorization:
      enable: false
      failureModeAllow: false
      statPrefix: inboundExtAuthz
      timeout: 1s
    inboundPortExclusionList: []
    outboundIPRangeExclusionList: []
    outboundPortExclusionList: []
In MeshConfig, `subjectAltNames` has `ingress-nginx.ingress_namespace_name-ns.cluster.local`, whereas the IngressBackend has the `AuthenticatedPrincipal` specified as `ingress-nginx.ingress_namespace.cluster.local`. This could be a typo, but please ensure the configurations shared are accurate. In this case, they need to match.
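A minimal sketch of that consistency check, assuming the two resources have already been loaded into plain Python dictionaries (for example from `kubectl get ... -o json`). The dictionaries below only mirror the fields posted in this thread, and the helper name `unmatched_principals` is hypothetical.

```python
# Fields copied from the MeshConfig and IngressBackend shared above.
mesh_config = {
    "spec": {
        "certificate": {
            "ingressGateway": {
                "subjectAltNames": [
                    "ingress-nginx.ingress_namespace_name-ns.cluster.local"
                ]
            }
        }
    }
}

ingress_backend = {
    "spec": {
        "sources": [
            {"kind": "Service", "name": "ingress-nginx-controller",
             "namespace": "ingress_namespace"},
            {"kind": "AuthenticatedPrincipal",
             "name": "ingress-nginx.ingress_namespace.cluster.local"},
        ]
    }
}


def unmatched_principals(mesh_config: dict, ingress_backend: dict) -> list:
    """List AuthenticatedPrincipal names that are not among the gateway SANs."""
    sans = set(
        mesh_config["spec"]["certificate"]["ingressGateway"]["subjectAltNames"]
    )
    principals = {
        source["name"]
        for source in ingress_backend["spec"]["sources"]
        if source["kind"] == "AuthenticatedPrincipal"
    }
    return sorted(principals - sans)


# Any name printed here is an AuthenticatedPrincipal with no matching SAN.
print(unmatched_principals(mesh_config, ingress_backend))
```

With the values as posted, the single `AuthenticatedPrincipal` is reported as unmatched; an empty list would indicate the two configurations agree.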
Also, kindly walk through the HTTPS ingress demo https://release-v1-0.docs.openservicemesh.io/docs/demos/ingress_k8s_nginx/#https-ingress-mtls-and-tls verbatim and confirm that it works. If that works, there's likely an issue with the Nginx version you are using or a misconfiguration on your end.
@shashankram Yes, you are correct, that is a typo; both are the same in my case. I have also followed the exact steps in https://release-v1-0.docs.openservicemesh.io/docs/demos/ingress_k8s_nginx/#https-ingress-mtls-and-tls and I am still getting `upstream connect error or disconnect/reset before headers. reset reason: connection failure`.
@chandan9778 Just to confirm, are you suggesting following the steps exactly as provided in https://release-v1-0.docs.openservicemesh.io/docs/demos/ingress_k8s_nginx/#https-ingress-mtls-and-tls do not work for you? Could you share how you installed Nginx?
Yes, the above steps did not work for me. I followed this YAML to install nginx into my cluster: https://github.com/kubernetes/ingress-nginx/blob/main/deploy/static/provider/cloud/deploy.yaml
@chandan9778, do you also mind trying the HTTP based ingress workflow and verifying if that works for you? If HTTP works, we can upgrade the configuration to HTTPS and debug what's going on. If HTTP ingress doesn't work, it means there's a misconfiguration or something basic that isn't working.
Also please share the Envoy log from the backend pod taken while the request to the backend fails.
@shashankram I tried the HTTP-based ingress workflow and am still getting the same error [upstream connect error or disconnect/reset before headers. reset reason: connection failure]. Please also find the Envoy logs below for your reference.
{"time_to_first_byte":null,"authority":"xyz.dns.com","response_code":503,"upstream_service_time":null,"upstream_cluster":"namespace_name/servicename|8080|local","bytes_sent":91,"upstream_host":"x.x.x.x:8080","protocol":"HTTP/1.1","response_code_details":"upstream_reset_before_response_started{connection_failure}","requested_server_name":null,"user_agent":"Safari/xx.xx","path":"/","duration":0,"bytes_received":0,"response_flags":"UF","request_id":"xxxxxxxxx","method":"GET","start_time":"2022-03-11T10:55:12.990Z","x_forwarded_for":"xx.x.x.x"}
{"start_time":"2022-03-11T10:55:13.360Z","request_id":"xxxxxxxxxxxxxx","duration":0,"x_forwarded_for":"xx.x.x.x","authority":"xyz.dns.com","upstream_service_time":null,"user_agent":"Safari/xxx.36","response_flags":"UF","response_code_details":"upstream_reset_before_response_started{connection_failure}","time_to_first_byte":null,"upstream_host":"x.x.x.x:8080","requested_server_name":null,"method":"GET","protocol":"HTTP/1.1","bytes_received":0,"response_code":503,"upstream_cluster":"nnamespce_name/service_name|8080|local","bytes_sent":91,"path":"/"}
{"path":"/","bytes_sent":91,"response_code":503,"bytes_received":0,"response_flags":"UF","response_code_details":"upstream_reset_before_response_started{connection_failure}","protocol":"HTTP/1.1","upstream_cluster":"namespace_name/service_name|8080|local","upstream_service_time":null,"request_id":"xxxxxxxxxxxxxxxxxxxxxx","x_forwarded_for":"xx.x.x.x","start_time":"2022-03-11T10:55:15.677Z","duration":0,"method":"GET","requested_server_name":null,"time_to_first_byte":null,"upstream_host":"x.x.x.x:8080","authority":"xyz.dns.com","user_agent":"
{"x_forwarded_for":"xx.x.x.x","upstream_cluster":"namespace_name/service_name|8080|local","protocol":"HTTP/1.1","time_to_first_byte":null,"bytes_sent":91,"request_id":"xxxxxxxxxxxx","user_agent":null,"method":"POST","upstream_host":"x.x.x.x:8080","response_code":503,"start_time":"2022-03-11T10:55:54.458Z","requested_server_name":null,"response_code_details":"upstream_reset_before_response_started{connection_failure}","duration":3,"response_flags":"UF","path":"/eventhandler","upstream_service_time":null,"bytes_received":104,"authority":"xyz.dns.com"}
@shashankram Do you have any update on this issue? Please let me know if you need any more details from my side. Thanks for your time!
upstream_reset_before_response_started
The log indicates the connection to the upstream (destination) service was reset. @chandan9778 I think the next step would be to provide a standalone repro so that we can take a look. Please let me know if that's possible.
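To make that reading of the log easier to repeat, here is a small sketch that tallies Envoy JSON access-log lines by response code, response flag, and upstream host. The sample lines are abbreviated from the entries posted above, and the function name `summarize` is hypothetical. In Envoy's access-log format the `UF` flag denotes an upstream connection failure.

```python
import json
from collections import Counter

# Abbreviated copies of the access-log entries posted in this thread, plus
# one malformed line to show that unparsable input is skipped.
SAMPLE_LOG = """\
{"response_code":503,"response_flags":"UF","response_code_details":"upstream_reset_before_response_started{connection_failure}","upstream_host":"x.x.x.x:8080","path":"/","method":"GET"}
{"response_code":503,"response_flags":"UF","response_code_details":"upstream_reset_before_response_started{connection_failure}","upstream_host":"x.x.x.x:8080","path":"/eventhandler","method":"POST"}
{"path":"/","truncated
"""


def summarize(log_text: str) -> Counter:
    """Count entries per (response_code, response_flags, upstream_host)."""
    counts = Counter()
    for line in log_text.splitlines():
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip truncated or non-JSON lines
        if not isinstance(entry, dict):
            continue
        key = (
            entry.get("response_code"),
            entry.get("response_flags"),
            entry.get("upstream_host"),
        )
        counts[key] += 1
    return counts


for (code, flags, host), count in summarize(SAMPLE_LOG).items():
    print(f"{count} x code={code} flags={flags} upstream={host}")
```

If every failing request shows the same flag and the same upstream host and port, as in the sample, that points at the sidecar being unable to connect to the local application port rather than at the ingress hop.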
@shashankram Thanks for your response! For now, providing a standalone repro is not possible. Is there any other way to proceed?
@chandan9778, in that case, please share the following info (redact sensitive info but preserve config correctness).
Note: You may have provided some of these, but specifying a more comprehensive list:
- k8s service yaml
- Nginx HTTP ingress yaml
- Request that is failing (e.g. curl http://foo.bar:8080/baz)
- OSM IngressBackend yaml
- OSM MeshConfig yaml
- Complete Envoy sidecar log from backend service pod
- Config dump of backend sidecar: `osm proxy get config_dump <pod> -n <namespace>`
- Stats dump of backend sidecar: `osm proxy get stats <pod> -n <namespace>`
- OSM MeshConfig yaml: `kubectl get meshconfig osm-mesh-config -n <osm namespace>`
- Logs of `osm-controller` pod in osm namespace
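As a convenience, the commands in the list above could be assembled in one place. The sketch below only builds and prints the command lines and does not run anything. The output file names, the pod name `backend-pod`, and the helper `diagnostic_commands` are hypothetical placeholders, and `-o yaml` is added to the MeshConfig command as in the earlier request in this thread.

```python
import shlex


def diagnostic_commands(pod: str, namespace: str, osm_namespace: str) -> dict:
    """Map a (hypothetical) output file name to the argv that would produce it."""
    return {
        "config_dump.json": ["osm", "proxy", "get", "config_dump", pod, "-n", namespace],
        "stats.txt": ["osm", "proxy", "get", "stats", pod, "-n", namespace],
        "meshconfig.yaml": [
            "kubectl", "get", "meshconfig", "osm-mesh-config",
            "-n", osm_namespace, "-o", "yaml",
        ],
    }


# Placeholder names; substitute the real backend pod and namespaces.
commands = diagnostic_commands("backend-pod", "namespace-name", "kube-system")
for filename, argv in commands.items():
    print(f"{shlex.join(argv)} > {filename}")
```

Each argv list can be passed to `subprocess.run(argv, capture_output=True, text=True)` to capture the output, after redacting sensitive values as requested above.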
Root cause was determined to be #4653; closing in favor of that issue
Bug description: We have installed OSM on an AKS cluster. Once OSM is enabled at the namespace level, we are unable to access any ingress-based application (exposed via the nginx ingress controller) and get `upstream connect error or disconnect/reset before headers. reset reason: connection failure` after Envoy.
Affected area (please mark with X where applicable):
Expected behavior:
Steps to reproduce the bug (as precisely as possible):
How was OSM installed?: Using the Azure CLI. https://docs.microsoft.com/en-us/azure/aks/open-service-mesh-deploy-addon-az-cli
Anything else we need to know?:
Environment:
- OSM version (`osm version`): mcr.microsoft.com/oss/openservicemesh/osm-controller:v1.0.0
- Kubernetes version (`kubectl version`): 1.21.2