Closed benpettman closed 4 years ago
From the Cloud Map version of the configuration you've shared, it looks like the virtual nodes `search-v1` and `search-v2` are indistinguishable from each other because they both have the exact same service discovery settings (`namespaceName: search.travelverse.local`, `serviceName: api-cm3`). This is not the case with your DNS example (`api-search-v1.search.svc.cluster.local` and `api-search-v2.search.svc.cluster.local`).
So what's happening is, regardless of which weighted target Envoy selects, you're likely getting a mixture of responses from virtual nodes `search-v1` and `search-v2`.
You can distinguish between the two of them by providing a unique Cloud Map attribute for each node, which should resolve this issue. For example:
```yaml
---
apiVersion: appmesh.k8s.aws/v1beta2
kind: VirtualNode
metadata:
  name: search-v1
  namespace: search-cm
spec:
  podSelector:
    matchLabels:
      app: api-cm3
      version: v1
  listeners:
    - portMapping:
        port: 7000
        protocol: http
  serviceDiscovery:
    awsCloudMap:
      namespaceName: search.travelverse.local
      serviceName: api-cm3
      attributes:
        - key: version
          value: v1
---
apiVersion: appmesh.k8s.aws/v1beta2
kind: VirtualNode
metadata:
  name: search-v2
  namespace: search-cm
spec:
  podSelector:
    matchLabels:
      app: api-cm3
      version: v2
  listeners:
    - portMapping:
        port: 7000
        protocol: http
  serviceDiscovery:
    awsCloudMap:
      namespaceName: search.travelverse.local
      serviceName: api-cm3
      attributes:
        - key: version
          value: v2
```
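As a quick sanity check (a sketch, not from the thread): once the attributes are in place, you can ask Cloud Map directly which instances match a given attribute filter using the AWS CLI:

```shell
# List instances registered under the Cloud Map service, filtered by
# the custom "version" attribute set on each virtual node.
aws servicediscovery discover-instances \
  --namespace-name search.travelverse.local \
  --service-name api-cm3 \
  --query-parameters version=v1
# Only the instances backing search-v1 should come back; repeat with
# version=v2 for the other node.
```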
Thank you @bcelenza, this has fixed my issue. I was pulling my hair out, mainly because the examples I have used and looked at (the dj app and color app in the aws-app-mesh-examples GitHub repo) do not mention this at all.
In fact I think I based my yaml above on the colorapp example that recently got updated for beta2; plus, all the examples take a slightly different approach, which adds to the confusion.
Nevertheless, you have managed to help out and fix the problem, so hopefully others will find it useful.
One more question: if I was running multiple clusters, would they all need to be part of the same mesh in order for everything to work?
For example, can I have cluster 1 in its own mesh and cluster 2 in its own mesh, with them calling each other and their respective meshes routing the traffic? Or is this what the VirtualGateway that's in preview helps with?
> Thank you @bcelenza, this has fixed my issue. I was pulling my hair out, mainly because the examples I have used and looked at (the dj app and color app in the aws-app-mesh-examples GitHub repo) do not mention this at all.
> In fact I think I based my yaml above on the colorapp example that recently got updated for beta2; plus, all the examples take a slightly different approach, which adds to the confusion.
We've recently upgraded the controller, so it's possible there were some mix-ups in our demos during the transition. I've published aws/aws-app-mesh-examples#313 to correct the current Cloud Map walkthrough.
> One more question: if I was running multiple clusters, would they all need to be part of the same mesh in order for everything to work?
> For example, can I have cluster 1 in its own mesh and cluster 2 in its own mesh, with them calling each other and their respective meshes routing the traffic? Or is this what the VirtualGateway that's in preview helps with?
While you can have multiple clusters that are part of the same mesh, App Mesh makes a lot of assumptions about L3/L4 connectivity and routing being in place, along with things like consistent service discovery between clusters. For example, if you had one mesh across two clusters, you'd likely want to use a single DNS source of truth for both clusters so you don't run into naming conflicts, and have both clusters within a single VPC. As I'm sure you can imagine, a lot of complexity is added along the way. :)
Instead of dealing with all of that, we recommend the Virtual Gateway approach you mention. This way you can use the cluster boundary and gateways to present one or more interfaces to the world outside a given cluster, and control access to components within the cluster and mesh.
We're in the process of making the gateway generally available, so it's just around the corner. Stay tuned!
Fantastic. Thank you.
Does this mean then, if I were to use a VG as the logical boundary to each cluster, that I wouldn't need to use Cloud Map? As I would just be calling each cluster's VG respectively.
That’s right. The decision to use Cloud Map or DNS for service discovery is largely a decision for within a cluster or mesh. From the perspective of a service in one cluster talking to a service in another, the Virtual Gateway provides the logical boundary, and you would use a load balancer in front of the Virtual Gateway to distribute traffic. So the services in Cluster A just need to know the IP address(es) of the load balancer in Cluster B that’s fronting the gateway. The load balancer itself would likely use DNS — either public or from your VPC’s private hosted zone, depending on whether your clusters are within the same network.
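For reference, a minimal sketch of the gateway side with the v1beta2 controller, under the setup described in this thread (the `ingress-gw` name, port, and the `search` virtual service are placeholders, not resources from this thread):

```yaml
apiVersion: appmesh.k8s.aws/v1beta2
kind: VirtualGateway
metadata:
  name: ingress-gw
  namespace: search-cm
spec:
  # The gateway's Envoy pods are matched by selector, like virtual nodes.
  podSelector:
    matchLabels:
      app: ingress-gw
  listeners:
    - portMapping:
        port: 8088
        protocol: http
---
apiVersion: appmesh.k8s.aws/v1beta2
kind: GatewayRoute
metadata:
  name: search-route
  namespace: search-cm
spec:
  httpRoute:
    match:
      prefix: "/"
    action:
      target:
        virtualService:
          virtualServiceRef:
            name: search   # placeholder virtual service inside the mesh
```

A Kubernetes Service of type LoadBalancer in front of the `ingress-gw` pods would then be the address that services in the other cluster call.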
Cool. So the VirtualGateway will effectively allow a multi-mesh approach inside one or multiple VPCs, seeing as the gateway is a public load balancer.
So if I didn't want to do that and went for an approach of multi-cluster in one mesh, I would just need some kind of API gateway internal to the cluster that's part of the mesh, so traffic is routed via the mesh and adheres to the route rules etc.?
I don't know if it's the new controller version, but when trying to follow the blog post that illustrates a cross-cluster implementation I get various errors. For example, after deploying the mesh and all components to cluster 2, and then deploying the front end to cluster 1, I get errors from the mesh saying it requires a matching virtual node, which is deployed into cluster 2 as per the blog.
I have 2 clusters, 1 VPC. The CRDs and controller are deployed to both, and the mesh components (VirtualService, VirtualNode, Mesh, etc.) are deployed to cluster 2. Cluster 1 just has the namespace, service and deployment.
So then I can't get a curl working across clusters, any pointers?
I would have thought that, by using the same everything, the controller would have pulled the mesh and components into cluster 1 and used them?
Current yaml in cluster one produces: `Error creating: admission webhook "mpod.appmesh.k8s.aws" denied the request: sidecarInject enabled but no matching VirtualNode found`
```yaml
apiVersion: v1
kind: Namespace
metadata:
  labels:
    appmesh.k8s.aws/sidecarInjectorWebhook: enabled
    mesh: search-api-cm3
  name: search-cm
---
apiVersion: v1
kind: Service
metadata:
  name: curler
  namespace: search-cm
spec:
  ports:
    - port: 8080
      name: http
  selector:
    app: curler
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: curler
  namespace: search-cm
spec:
  replicas: 1
  selector:
    matchLabels:
      app: curler
  template:
    metadata:
      annotations:
        appmesh.k8s.aws/mesh: search-api-cm3
        appmesh.k8s.aws/virtualNode: curler
      labels:
        app: curler
    spec:
      containers:
        - name: blue-website
          image: scrapinghub/httpbin:latest
          command:
            - sleep
            - "3600"
          resources:
            requests:
              cpu: 0.1
              memory: 200
```
> So if I didn't want to do that and went for an approach of multi-cluster in one mesh, I would just need some kind of API gateway internal to the cluster that's part of the mesh, so traffic is routed via the mesh and adheres to the route rules etc.?
Really you just need:
> Current yaml in cluster one produces: `Error creating: admission webhook "mpod.appmesh.k8s.aws" denied the request: sidecarInject enabled but no matching VirtualNode found`
I think this is a mismatch between the version of the controller you're using and the example you've pulled from. In the v1, generally available version of the controller, annotations are no longer used on the Deployment resource, in favor of selectors on the Virtual Nodes / Virtual Gateways. Per (3) above, you'll want to make sure the App Mesh resources you're defining are applied via `kubectl` to the cluster they'll run on (so the selectors also know how to find the right pods to inject Envoy into). I see you've opened another issue regarding the cross-account/cluster demo, so it would be good to do any follow-up questions or concerns on that issue.
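For illustration, a selector-based virtual node for the `curler` pod above might look like this under the v1 controller (a sketch; the DNS hostname assumes the `curler` Service from earlier in the thread):

```yaml
apiVersion: appmesh.k8s.aws/v1beta2
kind: VirtualNode
metadata:
  name: curler
  namespace: search-cm
spec:
  # v1 controller: pods are matched (and Envoy-injected) via selectors,
  # not via annotations on the Deployment.
  podSelector:
    matchLabels:
      app: curler
  listeners:
    - portMapping:
        port: 8080
        protocol: http
  serviceDiscovery:
    dns:
      hostname: curler.search-cm.svc.cluster.local
```

Since the webhook only matches against VirtualNode objects in the same cluster, you can confirm the object exists where the pod runs with `kubectl get virtualnodes -n search-cm`.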
Hi @bcelenza, I am not using a mismatched version; I am using v1, and the annotations are there in a vain attempt to get this working.
The documentation says neither to use nor not to use annotations, so I am clutching at straws.
Even without the annotations on the deployment I still get the same error.
Are you saying the mesh and virtual nodes need to be installed in both clusters? If so, how does that work, given the virtual node has a backend ref to a virtual service that runs in the other cluster? It's like pulling a thread: if I deploy the VN I would also need to deploy the VS, and then the entire deployment for what I'm trying to use, which then negates the point of a multi-cluster approach.
This is my reason for opening the other issue, as a lot of the other examples were upgraded to the latest but that one wasn’t.
So I’m seemingly struggling to get this to work and need to get it working or find another solution.
Sorry, my suggestion was just to move diagnosing and fixing the cross-cluster issue over to aws/aws-app-mesh-examples#314 since this issue was cut with regards to Cloud Map service discovery. I'd like to close this issue out and keep it about Cloud Map as much as possible in case others come to search for similar things.
Would you mind if we continue the conversation about cross-cluster on your other issue?
Closing this issue. Please re-open if you need any more help with Cloud Map service discovery.
I got a mesh running with DNS discovery and everything works as expected. Then, using the howto examples, I swapped out the DNS discovery for `awsCloudMap` discovery. After some tweaking of references, I can exec into my curler pod that is part of the mesh (the one that was working when using local DNS) and make a curl out to the Cloud Map URL. I get a response, but the response is just a round robin over the 2 pods that have been registered to the service. My assumption would have been that this would be routed via the Envoy proxy in the curler pod, out to Cloud Map, and then back to the mesh, routing me as it did when using local DNS.
It seems that Cloud Map registers the running pods as its endpoints, and then calling the Cloud Map URL negates the mesh entirely and just replies with each pod in turn. See curl output below:
It should just be replying with `{"healthStatus":true}` unless a header is supplied. Below are my test files, both DNS and Cloud Map versions. Again, please advise if this is wrong.
DNS:
CloudMap: