Open bcelenza opened 4 years ago
Thanks for raising this, @bcelenza. Just adding a comment to confirm we are definitely interested in this being supported. Happy to help talk through the details if that's useful. Cheers!
@eddgrant No problem!
For your particular use case, how important is it for the upstream application to terminate TLS instead of the Envoy?
One of the things I'm thinking about is how we could configure Envoy to optionally allow the application behind it to negotiate TLS. Basically, if the application calls https://foo, Envoy lets it do the negotiation. If the application calls http://foo, Envoy negotiates TLS on its behalf.
The big caveat to this approach is, when the application negotiates, the Envoy could only proxy TCP traffic, so you'd lose some of the benefits of HTTP proxying (like retry policies, timeouts, access logs, HTTP-specific metrics). But that might be a reasonable trade-off for not having to terminate/re-encrypt at the Envoy, and all the configuration complexity and performance degradation that comes with it?
There would also be some loss in routing capabilities with this approach, so might not be the best route (pun intended).
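To make that trade-off a bit more concrete, here is a rough sketch of the decision described above. It is only an illustration, not anything App Mesh or Envoy actually exposes: `plan_egress` and the dictionary keys are hypothetical names, and the list of lost features is simply the one mentioned in this comment.

```python
from urllib.parse import urlsplit

# HTTP-level features this comment says are lost when Envoy only proxies TCP.
HTTP_ONLY_FEATURES = (
    "retry policies",
    "timeouts",
    "access logs",
    "HTTP-specific metrics",
    "HTTP routing",
)


def plan_egress(url: str) -> dict:
    """Hypothetical helper: decide who negotiates TLS from the scheme the app used."""
    scheme = urlsplit(url).scheme.lower()
    if scheme == "https":
        # The application negotiates TLS itself, so the proxy only sees
        # encrypted bytes and has to fall back to plain TCP proxying.
        return {
            "proxy_mode": "tcp",
            "tls_negotiated_by": "application",
            "lost_features": list(HTTP_ONLY_FEATURES),
        }
    if scheme == "http":
        # The application sends plaintext to the local proxy, which can apply
        # HTTP features and then negotiate TLS upstream on its behalf.
        return {
            "proxy_mode": "http",
            "tls_negotiated_by": "envoy",
            "lost_features": [],
        }
    raise ValueError(f"unsupported scheme: {scheme!r}")


if __name__ == "__main__":
    for url in ("https://foo", "http://foo"):
        print(url, plan_egress(url))
```

The point of the sketch is just that the scheme the application picks would determine which side owns the TLS handshake, and with it which proxy features remain available.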
Good questions!
> how important is it for the upstream application to terminate TLS instead of the Envoy?
When you say "terminate" here, are you talking about inbound calls to the microservices, or outbound calls that they make? While our microservices negotiate TLS when making outbound calls, they don't actually terminate inbound TLS themselves; that's done by an nginx sidecar in their ECS task. I've tried to draw this below:
> One of the things I'm thinking about is how we could configure Envoy to optionally allow the application behind it to negotiate TLS.
I think this would be really useful in our use case. I totally agree that it comes with caveats, but if it allowed us to migrate everything over, then I imagine gradually converting the services to make plain http outbound calls would be a much easier task once they're all in ECS.
We also had a couple of similar ideas. We wondered about adding port 443 to EgressIgnoredPorts and then somehow routing the traffic to an NLB, which would forward to some sort of "routing task" running within the mesh. It all feels rather sticky-taped together, but we're going to spike it to see if we learn anything.
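For anyone wanting to picture the EgressIgnoredPorts part of that idea, here is a minimal sketch of the `proxyConfiguration` block of an ECS task definition, built as a plain Python dict. The property names and the default-looking values (UID 1337, ports 15000/15001, the ignored metadata IPs) are what I recall from the App Mesh examples, so please check them against the current docs; the app port 8080 and the `proxy_configuration` helper are made up for illustration. It does not cover the NLB or "routing task" part at all.

```python
import json


def proxy_configuration(app_ports, egress_ignored_ports):
    """Build an ECS proxyConfiguration dict for App Mesh (illustrative values)."""
    return {
        "type": "APPMESH",
        "containerName": "envoy",
        "properties": [
            {"name": "IgnoredUID", "value": "1337"},
            {"name": "ProxyIngressPort", "value": "15000"},
            {"name": "ProxyEgressPort", "value": "15001"},
            # Ports the application container listens on (8080 below is hypothetical).
            {"name": "AppPorts", "value": ",".join(str(p) for p in app_ports)},
            {"name": "EgressIgnoredIPs", "value": "169.254.170.2,169.254.169.254"},
            # Outbound traffic to these ports bypasses Envoy entirely.
            {
                "name": "EgressIgnoredPorts",
                "value": ",".join(str(p) for p in egress_ignored_ports),
            },
        ],
    }


if __name__ == "__main__":
    print(json.dumps(proxy_configuration([8080], [443]), indent=2))
```

With 443 ignored, the application's own TLS connections would skip the proxy, which is why something else would then have to route them back into the mesh.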
> There would also be some loss in routing capabilities with this approach, so might not be the best route (pun intended).
:rofl: - but yes I agree, there are definitely some tradeoffs, however if I can figure out how to make them temporary by this being just an initial step to get us through the migration, then I think they'd probably be tolerable.
I'm not sure whether this issue is the most appropriate place to register interest in this feature, but it was one of a few that AWS support pointed me to.
We have an interest in our app negotiating TLS with envoy rather than the remote service for an https service on the internet. An example of our use case is below
This setup is being done in an environment that currently uses the traditional approach of load balancers and we are migrating the platform to use app mesh. Not all migrations can be done at the same time, and zero-downtime deploys are super important, however the above example is generally net-new infrastructure. The origin server that services www.example.com does live in AWS and is not new infrastructure.
The app container in question is an isomorphic javascript app (runs on server and browser) and in an ideal situation, the code/config utilized in the app does not change depending on where it's executing. For accessing services via routes that had not been onboarded yet, we'd like to default the config to utilize the internet-accessible dns-based virtual node. When a legacy/new service gets on-boarded, we would add a route to the www.example.com's virtual service's router to send directly to another virtual node within the network.
Right now, we don't have a requirement to access services over https within our network, and if we did, the cert given at https://www.example.com would be different from the cert given by the internal infrastructure. On top of that, a network proxy (envoy) can't use http to route the request, because the TLS originates from the app and terminates at our CDN (not Amazon), so envoy only sees an encrypted request that it can't parse.
For this reason, we'd like the app to originate TLS and envoy to terminate it, so that envoy can decrypt and understand the request and then forward it to another backend, local or not, potentially with TLS to the new backend (or without, in the case of internal infrastructure).
For now, we have approached the problem differently and given the application different urls to hit when it's processing server side versus client side, resulting in an http request that envoy can then route to the origin server of the CDN so it never leaves our VPC. This works for us for now, but there's ongoing discussion of adding routing logic at the CDN layer which this approach would inadvertently bypass.
Thanks for raising this, @bcelenza. We would love to see this supported. In our case, ECS services contact other ECS services through an ALB: TLS connections terminate at the ALB, and the ALB talks to the target ECS service over plain http. Just to clarify, both the source and target ECS services are inside the mesh.
Tell us about your request
Per this comment on #39, there is a need to have the downstream application negotiate TLS with the upstream service, instead of having the local proxy perform negotiation on the application's behalf. This can be useful for assisting clients who need to migrate to service mesh (and eventually drop encryption between the application and proxy), or for additional security assurances that all traffic is encrypted at each hop in the network.
Which integration(s) is this request for?
All
Tell us about the problem you're trying to solve. What are you trying to do, and why is it hard?
In cases where a central infrastructure or platform team is supporting many different teams, it can be difficult for those teams to migrate to service mesh solutions because the applications will still attempt to negotiate TLS (e.g. a GitHub client will still try to reach out to https://github.com). When applications attempt to negotiate TLS, App Mesh cannot apply Layer 7 (e.g. HTTP) routing rules, because the information is encrypted to the proxy.
Additionally, transitioning to an approach where the proxy negotiates TLS on behalf of the client sometimes requires additional overhead and coordination between teams, and some organizations may require the application to perform TLS negotiation instead of the proxy.
Are you currently working around this issue?
Currently, in order to encrypt traffic from the application to the Envoy (and beyond), all backends need to be modeled as TCP instead of HTTP, so that Envoy is only proxying TCP traffic and not attempting to negotiate TLS on behalf of the application.
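As a rough illustration of that workaround, the sketch below builds a virtual node spec with a `tcp` listener rather than an `http` one. The dict shape mirrors the App Mesh virtual node spec as I understand it (`listeners[].portMapping` plus `serviceDiscovery.dns`), and could presumably be passed as the `spec` argument when creating a virtual node; the `virtual_node_spec` helper is a hypothetical name, and the hostname just reuses the github.com example from above.

```python
import json

# Listener protocols accepted by App Mesh port mappings, as far as I know.
ALLOWED_PROTOCOLS = {"tcp", "http", "http2", "grpc"}


def virtual_node_spec(hostname: str, port: int, protocol: str = "tcp") -> dict:
    """Build a virtual node spec dict; defaults to TCP so Envoy only passes bytes through."""
    if protocol not in ALLOWED_PROTOCOLS:
        raise ValueError(f"unsupported protocol: {protocol!r}")
    return {
        # A TCP listener means Envoy does not parse HTTP or negotiate TLS here.
        "listeners": [{"portMapping": {"port": port, "protocol": protocol}}],
        "serviceDiscovery": {"dns": {"hostname": hostname}},
    }


if __name__ == "__main__":
    print(json.dumps(virtual_node_spec("github.com", 443), indent=2))
```

Modeling the backend this way keeps the application's TLS session intact end to end, at the cost of the HTTP-level routing and metrics described earlier in this issue.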