smarsh-tim opened this issue 2 years ago
I tracked down the issue. I would like to see some additional logging in emissary pod logs when it switches back to HTTP even though it's defined as HTTPS.
Ultimately this seems to have been a cascading failure originating in cert-manager, where an upgrade neglected to update the RBAC permissions: https://github.com/aws/eks-anywhere/issues/1572
Somehow, the certificate was still reporting as Ready. I'm not sure of the exact root-cause relationship, i.e. what specifically caused emissary to stop using that self-reported Ready Certificate. But once I resolved cert-manager's RBAC problems this issue went away, and emissary is now listening on HTTPS again.
If we could convert this into a feature request for additional logging - it would save a lot of time for future debugging.
Thanks a lot for the thorough investigation @smarsh-tim! To your suggestion, I'll label this issue as a feature request for added logs.
I did some further testing today, and the logs still show two HTTP listeners being created for my endpoint. However, HTTPS requests are resolving inconsistently.
Sometimes the request goes through successfully; other times I still get this:
```
curl: (35) error:1400410B:SSL routines:CONNECT_CR_SRVR_HELLO:wrong version number
```
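That curl error is what you get when a port answers a TLS ClientHello with plaintext HTTP, which matches the Listener actually serving HTTP. A quick local sketch of the same failure mode (the port number is arbitrary, and this assumes `python3` and `curl` are on the PATH):

```shell
# Serve plain HTTP on a spare port, then speak TLS to it.
python3 -m http.server 18080 >/dev/null 2>&1 &
srv=$!
sleep 1

# curl attempts a TLS handshake, the server replies with plaintext HTTP,
# and OpenSSL reports "wrong version number" (curl exit code 35).
curl -sk https://localhost:18080/ >/dev/null
echo "curl exit code: $?"

kill "$srv"
```

So the intermittent failures would be consistent with some requests landing on an HTTP-only listener and others on a correctly configured HTTPS one.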
No changes have occurred to the TLS certificate. I'm wondering if it's somehow conflicting with the other `ambassador_id` load balancers I have in the same Kubernetes cluster. Currently I'm running 4 distinct emissary load balancers: 3 using labelled `ambassador_id` values and one without, as the default. All associated resources for the labelled load balancers have their own `ambassador_id` values set too.
Very curious transient issue. I will keep digging.
I think I found a root cause. I have multiple emissary-ingress instances, and two of them had inadvertently grabbed the same External IP with kube-vip. It took a while to find since I wasn't looking for that in particular.
```
kubectl get service -n emissary
```

Then check whether any of the `EXTERNAL-IP` values are the same. They shouldn't be.
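That check can be scripted; the service names and IPs below are made-up sample data standing in for real `kubectl get service` output:

```shell
# Sample output in the shape of: NAME TYPE CLUSTER-IP EXTERNAL-IP
sample='emissary-a   LoadBalancer   10.96.0.10   192.168.1.10
emissary-b   LoadBalancer   10.96.0.11   192.168.1.10
emissary-c   LoadBalancer   10.96.0.12   192.168.1.11'

# Print any EXTERNAL-IP claimed by more than one service.
# Against a live cluster, replace the printf with something like:
#   kubectl get service -A --no-headers | awk '{print $5}' | sort | uniq -d
# (with -A, a NAMESPACE column shifts EXTERNAL-IP to field 5)
printf '%s\n' "$sample" | awk '{print $4}' | sort | uniq -d
```

Any line of output is an address advertised by two services at once, which with kube-vip means they are fighting over the same VIP.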
In 3.9.1 I continue to see this behavior, where a listener defined with `protocol: HTTPS` is not listening for HTTPS but rather HTTP.
**Describe the bug** An HTTPS Listener is being created as HTTP. This occurs at some point after the initial deploy, as when first deployed the listener is on HTTPS correctly.
This ultimately leads to this error when attempting to connect to the HTTPS endpoint:
But then this works fine:
I am unable to find any related error messages in the emissary pod logs, in the output of describing the resources, or in the Kubernetes events.
These are the listener resources:
But the protocol for `emissary-ingress-https-listener` isn't being respected; it is being generated as HTTP instead of HTTPS. There are no TLS errors reported in the emissary pod logs as described here: https://www.getambassador.io/docs/emissary/latest/topics/running/tls/#certificates-and-secrets

I am able to get it back to listening on HTTPS, but only by deleting each of the Listeners, HelmRelease, Mappings, and TLSContext for emissary, then re-applying the resource manifests.
It's not the TLS certificate stored as a Secret, since that doesn't change, and everything starts working again after the re-apply. And it's not RBAC permissions either, since those don't change during any of this. With that I'm confident I can rule out cert-manager and the ACME provider.
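One way to back that up with data is to fingerprint the served certificate before and after the failure and compare. This is a sketch: the secret name `emissary-tls` is a placeholder, and the openssl lines just demonstrate the fingerprinting step on a throwaway self-signed cert:

```shell
# Against the cluster (placeholder secret name):
#   kubectl get secret emissary-tls -n emissary -o jsonpath='{.data.tls\.crt}' \
#     | base64 -d | openssl x509 -noout -fingerprint -sha256

# Local demonstration of the fingerprinting step with a throwaway cert:
openssl req -x509 -newkey rsa:2048 -nodes -days 1 -subj "/CN=demo" \
  -keyout /tmp/demo-key.pem -out /tmp/demo-cert.pem 2>/dev/null
openssl x509 -in /tmp/demo-cert.pem -noout -fingerprint -sha256
```

If the fingerprint is identical before and after the Listener flips to HTTP, the certificate itself is ruled out as the trigger.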
**To Reproduce** Steps to reproduce the behavior:
**Expected behavior** The Listener defined with `protocol: HTTPS` continues to serve HTTPS after the initial deploy.
**Versions** (please complete the following information):
**Additional context** Here are the other components for TLS: