Open iNoahNothing opened 4 years ago
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Is this an issue for other people as well? We have what we believe is the same issue, but switching to a wildcard (workaround 2) didn't work for us. We had to disable h2 for things to start working again, but ideally we'd like to re-enable h2 with a fix for this.
~Another workaround that did work for us was to get rid of the TLSContext completely. The Host already specifies the certificate secret, and you can put configuration parameters like the alpn_protocols directly in the host. The TLSContext isn't actually needed (for most cases).~ I don't know why I said this. This doesn't actually work.
According to @cindymullins-dw, this was fixed in 1.7 in August 2020, so this ticket can possibly be closed.
This should have been closed along with https://github.com/datawire/apro/issues/1167 (which is just a mirror of the issue) by the PR https://github.com/datawire/apro/pull/1716 (seen in Emissary as https://github.com/emissary-ingress/emissary/commit/df926097c872e09851525de7ffeaa6e8577670d0) (which did close that mirror issue), which was included in v1.7.0 on 2020-08-27.
Though https://github.com/datawire/apro/pull/1907 (seen in Emissary as https://github.com/emissary-ingress/emissary/commit/7835f087056f5ba589ec19c51ff274107988b907) (for inclusion in v1.7.4) reverted a bunch of changes to `v2listener.py`, it specifically did not revert the changes from datawire/apro#1716 because (as the commit message says) it "is a fix that EPO cares about."
@LukeShu - as we discussed offline, this fix didn't cover all the cases so I'm going to reopen this and outline what we discussed so we have a record of it.
Case 1: Coalesce wildcard sub-domains (i.e. `a.example.com`, `b.example.com`) - ✅
Case 2: Coalesce wildcard sub-domains with parent domain (i.e. `a.example.com`, `b.example.com` and `example.com`) - ⛔
The first case was resolved by the fix that you referenced, which means we will coalesce all the wild-card subdomains into a single Envoy Filter Chain that does SNI matching on `*.example.com`.
In the second case, when a TLS certificate has SAN names registered for both wild-card domains and parent domain then the browser will try to re-use the connection.
X509v3 Subject Alternative Name:
DNS:*.example.com, DNS:example.com
We currently generate Envoy configuration so that we have two Filter Chains that do the L4 SNI matching: one for `*.example.com` and one for `example.com`. Navigating to a wild-card domain first will open a connection and the TLS Handshake will use the `*.example.com` domain for SNI. The browser will re-use the open connection when navigating directly to the parent domain. Since SNI is negotiated at TLS Handshake time, Envoy will re-use the connection and look in the Filter Chain for `*.example.com`, and then when it tries to do the L7 matching on `:authority == example.com`, there is no route available, causing the 404 NR.
Chrome: net-internals shows the same connection being used for the wildcard and parent domain.
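To make the failure mode concrete, here is a toy Python model of the two-stage matching described above. It is only a sketch: `FILTER_CHAINS`, `Connection`, and `matches` are made-up names for illustration and are not Envoy or Emissary APIs. The filter chain is picked once from the SNI, and every later request on that connection is routed inside that chain only.

```python
def matches(pattern: str, name: str) -> bool:
    """Match a hostname against an exact name or a `*.suffix` wildcard."""
    if pattern.startswith("*."):
        # A wildcard covers subdomains only, never the bare parent domain.
        return name.endswith(pattern[1:])
    return pattern == name


# Hypothetical model of the two filter chains described above: each chain is
# selected by SNI and only carries routes for its own virtual host.
FILTER_CHAINS = [
    {"sni": "*.example.com", "virtual_host_domains": ["*.example.com"]},
    {"sni": "example.com", "virtual_host_domains": ["example.com"]},
]


class Connection:
    def __init__(self, sni: str):
        # L4: the filter chain is chosen once, from the SNI in the TLS handshake.
        self.chain = next(c for c in FILTER_CHAINS if matches(c["sni"], sni))

    def request(self, authority: str) -> str:
        # L7: every request on this connection is routed inside that one chain.
        domains = self.chain["virtual_host_domains"]
        if any(matches(d, authority) for d in domains):
            return "200"
        return "404 NR"


conn = Connection(sni="a.example.com")
print(conn.request("a.example.com"))
print(conn.request("b.example.com"))
print(conn.request("example.com"))  # reused connection, wrong filter chain
```

In this model the third request fails only because it arrives on the connection that was opened for `a.example.com`; a fresh connection with SNI `example.com` would land in the other chain and succeed.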
Workaround:
A non-code solution is for the user to use two different TLS Certs: one that has a SAN for the wild-card domain and another one for the parent domain. By doing this the browser will re-use the existing connection for all wild-card domains (i.e. `a.example.com`, `b.example.com`) but will open a new connection for requests to the parent domain (`example.com`), since they no longer share a cert and so cannot re-use the connection.
Chrome: net-internals using different connections when using the workaround.
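As a rough illustration of why splitting the certs works, the sketch below models the browser's reuse decision as "same IP, and the cert already presented on the connection covers the new hostname". This is a simplified assumption (real browsers apply additional conditions), and `san_covers` and `can_reuse` are hypothetical helper names, not any browser's API.

```python
def san_covers(san: str, hostname: str) -> bool:
    """True if a certificate SAN entry is valid for the hostname."""
    if san.startswith("*."):
        # A wildcard stands in for exactly one leading label.
        label, _, rest = hostname.partition(".")
        return bool(label) and rest == san[2:]
    return san == hostname


def can_reuse(cert_sans, conn_ip: str, new_host: str, new_host_ip: str) -> bool:
    """Simplified model of HTTP/2 connection reuse for a new hostname."""
    same_ip = conn_ip == new_host_ip
    return same_ip and any(san_covers(s, new_host) for s in cert_sans)


ip = "203.0.113.10"  # documentation address, purely illustrative

shared_cert = ["*.example.com", "example.com"]
wildcard_only_cert = ["*.example.com"]

# One cert for everything: the parent domain rides the existing connection.
print(can_reuse(shared_cert, ip, "example.com", ip))

# Workaround: the wildcard cert does not cover the parent domain, so the
# browser opens a new connection (and sends a new SNI) for example.com.
print(can_reuse(wildcard_only_cert, ip, "example.com", ip))
print(can_reuse(wildcard_only_cert, ip, "b.example.com", ip))
```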
Potential Fix: Emissary will need to take into account the TLS Certs and the SANs registered within the cert along with the host matching to ensure that when the browser re-uses the connection that both the wild-card domains and parent domain can be matched in a single Filter Chain.
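One way to picture that potential fix is the sketch below. It is not how Emissary generates configuration today; `build_filter_chains` and its input shapes are assumptions made up for illustration. The idea is to group hostnames by the cert secret they use, so that every hostname sharing a cert ends up in one filter chain whose SNI list comes from the cert's SANs.

```python
def build_filter_chains(hosts: dict, cert_sans: dict) -> dict:
    """Group hostnames into one filter chain per certificate.

    hosts:     hostname -> name of the TLS secret it uses (hypothetical shape)
    cert_sans: TLS secret name -> SAN entries found in that certificate
    """
    chains = {}
    for hostname, secret in hosts.items():
        chain = chains.setdefault(
            secret,
            {
                # SNI matching covers everything the cert is valid for ...
                "server_names": sorted(cert_sans[secret]),
                # ... and all hosts using that cert share the chain's routes.
                "virtual_host_domains": [],
            },
        )
        chain["virtual_host_domains"].append(hostname)
    return chains


chains = build_filter_chains(
    hosts={"*.example.com": "shared-cert", "example.com": "shared-cert"},
    cert_sans={"shared-cert": ["example.com", "*.example.com"]},
)
print(chains)
```

With a single chain per cert, a connection opened for `a.example.com` and later reused for `example.com` would still find a matching virtual host. Hosts that share a cert but need different TLS settings would need extra handling that this sketch ignores.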
FYI... @ddymko @haq204 @AliceProxy I think this is a good one to be aware of.
I was also seeing these symptoms for probably the same reason, but my underlying issue and solution were a little more involved, and there were probably ultimately multiple issues.
I set up a second Certificate, Host, and TLSContext as described here, in order to serve subdomains on a different cert than an apex domain.
However, my second Certificate was not becoming Ready -- in particular, cert-manager wasn't producing a challenge because it failed to match the second certificate to any solver. It was unclear if that was due to misconfiguration of a Host/TLSContext/etc.
In my case that configuration was correct, but the underlying issue is that Let's Encrypt specifically doesn't support http01 challenges for wildcard domains.
Switching from http01 challenge solver to dns01 challenge solver allowed the challenge to be produced, which in turn made the second certificate become ready, and the issue went away.
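For anyone hitting the same thing, the rule can be written down as a small check. This is just an illustrative helper (`usable_solvers` is a made-up name, not part of cert-manager): if any DNS name on the certificate is a wildcard, only a dns01 solver can satisfy the challenge.

```python
def usable_solvers(dns_names):
    """Return the ACME challenge types that can validate all given names."""
    if any(name.startswith("*.") for name in dns_names):
        # Wildcard names can only be validated with a DNS-01 challenge.
        return {"dns01"}
    return {"http01", "dns01"}


print(usable_solvers(["example.com"]))
print(usable_solvers(["*.example.com"]))
```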
A key thing to notice is that the cert-manager flow replaces the cert secrets only after a flow completes successfully, which means that if it doesn't complete successfully, existing config continues to be used, which led to some confusion here when I continued to see the apex cert being served on subdomains.
One possible improvement in emissary might be to refuse to serve the "apex" cert in this case across domains since it's a known issue, unless the user opts in to reuse using a feature flag like `allow_unsafe_ssl_cert_subdomain_reuse: true`.
(Just an idea -- my case is completely resolved now. Thanks for your work on this!)
When you have multiple domains use the same certificate (e.g. the server has a certificate that can be used for domain `domain.com` and subdomains `a.domain.com` and `b.domain.com`) and the server supports HTTP/2, the browser will reuse the same connection for requests to `domain.com`, `a.domain.com`, and `b.domain.com`. See this blog post for more info.
If you have created an individual `virtual_host` for each of these domains in Ambassador (via `Host` resources or `TLSContext`s), Ambassador will reuse the same `virtual_host`, but with a different `host` in the request, and you will get a `404`.
In more detail, if you create a `TLSContext` like the below:
You will get an Ambassador configured where:

- There are multiple `virtual_host`s, one for each of the `hosts`.
- `ambassador-cert` is pointing at a certificate that works for `domain.com`, `a.domain.com` and `b.domain.com`, so Ambassador is able to use the same certificate for each of these `virtual_host`s.
- `alpn_protocols: h2,http/1.1` is set so the browser will use `HTTP/2` for the connection to Ambassador.

Now, when you send a request to https://a.domain.com/ambassador/v0/diag/ in a web browser, it opens a single HTTP/2 connection to Ambassador with `:authority: a.domain.com`. Ambassador then looks for a route in `virtual_host: a.domain.com`, finds the route to `/ambassador/v0`, and correctly sends the request to the diagnostics page.
Now if you change the url to https://b.domain.com/ambassador/v0/diag/, the browser will reuse this same HTTP/2 connection to Ambassador but with `:authority: b.domain.com`. Ambassador then, reusing the same connection to `virtual_host: a.domain.com`, looks for a route in `virtual_host: a.domain.com`, but since the `:authority` header does not match any routes, returns a 404.
To Reproduce
Reproduction is pretty simple.

1. Deploy Ambassador
2. Get a certificate for `*.domain.com`
3. Create a `TLSContext` that uses that certificate and sets
4. Send a request to https://a.domain.com/ambassador/v0/diag/ in a browser and get the diag page
5. Change the url to https://b.domain.com/ambassador/v0/diag/ and get a 404

Workaround
Since this issue revolves around how Ambassador is creating `virtual_hosts` and using the same certificate, a couple of possible workarounds exist that could be used until this is resolved.

1. Create a different certificate and `TLSContext` for each domain.
   This will make it so the browser does not reuse the same connection for `a.domain.com` and `b.domain.com`, since it cannot use the same certificate.
2. Use a wildcard in the `TLSContext` so that `domain.com`, `a.domain.com`, and `b.domain.com` use the same `virtual_host`.
   Now, when the browser reuses the connection, Ambassador will use the same `virtual_host`, which will match for all `:authority`s.