aws / aws-app-mesh-roadmap

AWS App Mesh is a service mesh that you can use with your microservices to manage service-to-service communication.
Apache License 2.0

Feature Request: Increase the maximum number of Virtual Gateways per Mesh #449

Open mkielar opened 1 year ago

mkielar commented 1 year ago

If you want to see App Mesh implement this idea, please upvote with a :+1:.

Tell us about your request

Allow more than 5 Virtual Gateways per Mesh.

Which integration(s) is this request for?

N/A

Tell us about the problem you're trying to solve. What are you trying to do, and why is it hard?

We're building an API that will be exposed for multiple brands. Each brand will get its own domain, and we are using CloudFront and API Gateway to inject an x-brand-id header, so that our backends know which brand they are currently serving.

To get proper visibility into what's going on, we want to build the following "ingress stack" for every brand:

CloudFront => API GW => VPC Link => NLB => Virtual Gateway

This allows us to use the CloudWatch metrics that the Virtual Gateway generates (VirtualGateway => TargetVirtualNode) to put "per-brand" metrics, dashboards, and alerting in place.

We're currently in the process of running two new brands and migrating one of the old ones to the new infrastructure setup. To proceed with the migration, we needed to increase the "Virtual Gateways per Mesh" limit from the initial 3. We requested an increase to 10 Virtual Gateways, which should future-proof the system in case any new brand appears later next year (which is planned).

AWS Support's response was that, at the time, the maximum number of Virtual Gateways per Mesh they could grant was 5.
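To make the constraint concrete, here is a minimal sketch of the arithmetic, assuming one Virtual Gateway per brand as in the ingress stack above. The quota figures (3, 5, 10) come from this issue; the brand names and the brand count are hypothetical, since the number of gateways already in use isn't stated here.

```python
# Quota figures quoted in this issue.
INITIAL_QUOTA = 3     # default "Virtual Gateways per Mesh" limit
SUPPORT_CAP = 5       # maximum AWS Support said they could grant at the time
REQUESTED_QUOTA = 10  # increase that was requested


def gateways_needed(brands):
    """One Virtual Gateway per distinct brand (assumption taken from the ingress stack)."""
    return len(set(brands))


def headroom(brands, quota):
    """How many more brands fit under `quota`; negative means the quota is exceeded."""
    return quota - gateways_needed(brands)


# Hypothetical brand list, for illustration only.
brands = ["brand-a", "brand-b", "brand-c", "brand-d"]

for label, quota in [
    ("initial", INITIAL_QUOTA),
    ("support cap", SUPPORT_CAP),
    ("requested", REQUESTED_QUOTA),
]:
    print(f"{label:12} quota={quota:2} headroom={headroom(brands, quota)}")
```

With a one-gateway-per-brand design, the cap on gateways is effectively a cap on brands per mesh, which is why the difference between 5 and 10 matters for planning.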

Are you currently working around this issue?

We're not, but I'd be happy to hear ideas...

Additional context

  1. I don't understand this limit at all. AppMesh is merely a metadata store. It's Envoy that does all the heavy lifting, and AppMesh merely provides it with some JSON to configure itself. How, in a world of autoscaling clouds and systems running on thousands of CPUs, would anyone think a fixed-size, 5-position table is a good idea...?
  2. The support tickets we've raised: 11421690101 / 11421583251 / 11421612421 / 11421625461
  3. In 11421625461 we're currently discussing the cap of 5 VGs.

mkielar commented 1 year ago

I think at least the Service Quota documentation (https://docs.aws.amazon.com/general/latest/gr/appmesh.html#limits_appmesh) could be improved. It currently only states the initial limit (3) and that it is adjustable, but not that there's a hard cap, and that it's as low as 5.

That information was crucial to our design, and we would have built things differently from day one had we known about such constraints (perhaps not using Virtual Gateways at all and falling back to Nginx-wrapped-with-Envoy Virtual Nodes, like I used to do before VGs became a thing).
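For anyone who wants to check what the Service Quotas API reports for this limit, here is a rough sketch. It assumes a client that behaves like `boto3.client("service-quotas")` and uses its `list_service_quotas` call; the service code `"appmesh"` and the name filter `"virtual gateway"` are my assumptions, not confirmed values. As far as I can tell, the entries returned carry a current value and an adjustable flag but no field for the maximum a quota can be raised to, which matches the documentation gap described above.

```python
def find_quotas(client, service_code="appmesh", name_contains="virtual gateway"):
    """Return (name, value, adjustable) tuples for quotas whose name matches.

    `client` is expected to behave like boto3.client("service-quotas").
    `service_code` and `name_contains` are assumptions, not confirmed values.
    """
    matches = []
    kwargs = {"ServiceCode": service_code}
    while True:
        page = client.list_service_quotas(**kwargs)
        for quota in page.get("Quotas", []):
            # Case-insensitive substring match, since the exact quota name is assumed.
            if name_contains.lower() in quota.get("QuotaName", "").lower():
                matches.append(
                    (quota["QuotaName"], quota.get("Value"), quota.get("Adjustable"))
                )
        token = page.get("NextToken")
        if not token:
            return matches
        # Follow pagination until the service stops returning a token.
        kwargs["NextToken"] = token


# Example usage (requires boto3 and AWS credentials, so it is left commented out):
#   import boto3
#   for name, value, adjustable in find_quotas(boto3.client("service-quotas")):
#       print(name, value, adjustable)
```

The client is passed in rather than created inside the function so the lookup can be exercised with a stub, without network access.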