aws / aws-app-mesh-roadmap

AWS App Mesh is a service mesh that you can use with your microservices to manage service-to-service communication
Apache License 2.0

Question: CloudMap discovery, does it use the mesh? #225

Closed benpettman closed 4 years ago

benpettman commented 4 years ago

I got a mesh running with DNS discovery and everything worked as expected. Then, using the how-to examples, I swapped out the DNS discovery for awsCloudMap discovery. After some tweaking of references, I can exec into my curler pod, which is part of the mesh (the one that was working when using local DNS), and curl out to the Cloud Map URL.

I get a response, but the response is just a round robin over the 2 pods that have been registered to the service. My assumption would have been that the request would be routed via the Envoy proxy in the curler pod, out to Cloud Map, and then back into the mesh, routing me as it did when using local DNS.

It seems that Cloud Map registers the running pods as its endpoints, so calling the Cloud Map URL bypasses the mesh entirely and just replies with each pod in turn. See the curl output below:

root@curler-574d66489f-tnf5j:/# curl api-cm3.search.travelverse.local/health
{"healthStatus":true}
root@curler-574d66489f-tnf5j:/# curl api-cm3.search.travelverse.local/health
{"healthStatus":"thisshouldbetrue"}
root@curler-574d66489f-tnf5j:/# curl api-cm3.search.travelverse.local/health
{"healthStatus":true}
root@curler-574d66489f-tnf5j:/# curl api-cm3.search.travelverse.local/health
{"healthStatus":"thisshouldbetrue"}
root@curler-574d66489f-tnf5j:/# curl api-cm3.search.travelverse.local/health
{"healthStatus":true}
root@curler-574d66489f-tnf5j:/# curl api-cm3.search.travelverse.local/health
{"healthStatus":"thisshouldbetrue"}
root@curler-574d66489f-tnf5j:/# curl api-cm3.search.travelverse.local/health
{"healthStatus":true}
root@curler-574d66489f-tnf5j:/# curl api-cm3.search.travelverse.local/health

It should reply with {"healthStatus":true} every time unless the header is supplied.

Below are my test files, both the DNS and Cloud Map versions. Please advise if anything here is wrong.

DNS:

apiVersion: v1
kind: Namespace
metadata:
  labels:
    appmesh.k8s.aws/sidecarInjectorWebhook: enabled
    mesh: search-api
  name: search

---

apiVersion: appmesh.k8s.aws/v1beta2
kind: Mesh
metadata:
  name: search-api
spec:
  namespaceSelector:
    matchLabels:
      mesh: search-api

---

apiVersion: appmesh.k8s.aws/v1beta2
kind: VirtualNode
metadata:
  name: search-v1
  namespace: search
spec:
  podSelector:
    matchLabels:
      app: api
      version: v1
  listeners:
    - portMapping:
        port: 7000
        protocol: http
  serviceDiscovery:
    dns:
      hostname: api-search-v1.search.svc.cluster.local

---

apiVersion: appmesh.k8s.aws/v1beta2
kind: VirtualNode
metadata:
  name: search-v2
  namespace: search
spec:
  podSelector:
    matchLabels:
      app: api
      version: v2
  listeners:
    - portMapping:
        port: 7000
        protocol: http
  serviceDiscovery:
    dns:
      hostname: api-search-v2.search.svc.cluster.local

---
apiVersion: appmesh.k8s.aws/v1beta2
kind: VirtualService
metadata:
  name: api
  namespace: search
spec:
  awsName: api.search.svc.cluster.local
  provider:
    virtualRouter:
      virtualRouterRef:
        name: api
---
apiVersion: appmesh.k8s.aws/v1beta2
kind: VirtualRouter
metadata:
  namespace: search
  name: api
spec:
  listeners:
    - portMapping:
        port: 80
        protocol: http
  routes:
    - name: health-route
      priority: 11
      httpRoute:
        match:
          prefix: /health
        action:
          weightedTargets:
            - virtualNodeRef:
                name: search-v1
              weight: 1
    - name: health-route-header
      priority: 10
      httpRoute:
        match:
          prefix: /health
          headers:
            - name: health_header
              match:
                exact: blue
        action:
          weightedTargets:
            - virtualNodeRef:
                name: search-v2
              weight: 1
    - name: api-route
      priority: 10
      httpRoute:
        match:
          prefix: /search
        action:
          weightedTargets:
            - virtualNodeRef:
                name: search-v1
              weight: 1
            - virtualNodeRef:
                name: search-v2
              weight: 1
---

apiVersion: v1
kind: Service
metadata:
  name: api-search-v1
  namespace: search
spec:
  ports:
    - port: 7000
      name: http
  selector:
    app: api
    version: v1
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: search-v1
  namespace: search
spec:
  replicas: 1
  selector:
    matchLabels:
      app: api
      version: v1
  template:
    metadata:
      annotations:
        appmesh.k8s.aws/egressIgnoredPorts: "3306"
      labels:
        app: api
        version: v1
    spec:
      containers:
      - name: search-one
        image: xxx
        env:
        - name: NODE_ENV
          value: production
        - name: DB_HOST
          value: slave.search.travelverse.local
        - name: DB_USER
          value: root
        - name: DB_PASS
          value: CSlp6JwXGz
        - name: DB_NAME
          value: holidaycottages

        resources:
          requests:
            memory: "64Mi"
            cpu: "100m"
          limits:
            memory: "128Mi"
            cpu: "200m"
        imagePullPolicy: Always
        ports:
        - containerPort: 7000
      imagePullSecrets:
      - name: gitlab-auth-token
---
apiVersion: v1
kind: Service
metadata:
  name: api-search-v2
  namespace: search
spec:
  ports:
    - port: 7000
      name: http
  selector:
    app: api
    version: v2
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: search-v2
  namespace: search
spec:
  replicas: 1
  selector:
    matchLabels:
      app: api
      version: v2
  template:
    metadata:
      annotations:
        appmesh.k8s.aws/egressIgnoredPorts: "3306"
      labels:
        app: api
        version: v2
    spec:
      containers:
      - name: search-one
        image: xxx
        env:
        - name: NODE_ENV
          value: production
        - name: DB_HOST
          value: slave.search.travelverse.local
        - name: DB_USER
          value: root
        - name: DB_PASS
          value: CSlp6JwXGz
        - name: DB_NAME
          value: holidaycottages

        resources:
          requests:
            memory: "64Mi"
            cpu: "100m"
          limits:
            memory: "128Mi"
            cpu: "200m"
        imagePullPolicy: Always
        ports:
        - containerPort: 7000
      imagePullSecrets:
      - name: gitlab-auth-token

---
apiVersion: v1
kind: Service
metadata:
  name: api
  namespace: search
  labels:
    app: api
spec:
  ports:
    - port: 7000
      name: http
  selector:
    app: api
---

apiVersion: appmesh.k8s.aws/v1beta2
kind: VirtualNode
metadata:
  name: curler
  namespace: search
spec:
  podSelector:
    matchLabels:
      app: curler
  listeners:
    - portMapping:
        port: 8080
        protocol: http
  backends:
    - virtualService:
        virtualServiceRef:
          name: api
  serviceDiscovery:
    dns:
      hostname: curler.search.svc.cluster.local

---

apiVersion: v1
kind: Service
metadata:
  name: curler
  namespace: search
spec:
  ports:
    - port: 8080
      name: http
  selector:
    app: curler
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: curler
  namespace: search
spec:
  replicas: 1
  selector:
    matchLabels:
      app: curler
  template:
    metadata:
      labels:
        app: curler
    spec:
      containers:
      - name: blue-website
        image: scrapinghub/httpbin:latest
        command:
        - sleep
        - "3600"
        resources:
          requests:
            cpu: 0.1
            memory: 200Mi

CloudMap:

apiVersion: v1
kind: Namespace
metadata:
  labels:
    appmesh.k8s.aws/sidecarInjectorWebhook: enabled
    mesh: search-api-cm3
  name: search-cm

---

apiVersion: appmesh.k8s.aws/v1beta2
kind: Mesh
metadata:
  name: search-api-cm3
spec:
  namespaceSelector:
    matchLabels:
      mesh: search-api-cm3

---

apiVersion: appmesh.k8s.aws/v1beta2
kind: VirtualNode
metadata:
  name: search-v1
  namespace: search-cm
spec:
  podSelector:
    matchLabels:
      app: api-cm3
      version: v1
  listeners:
    - portMapping:
        port: 7000
        protocol: http
  serviceDiscovery:
    awsCloudMap:
      namespaceName: search.travelverse.local
      serviceName: api-cm3

---

apiVersion: appmesh.k8s.aws/v1beta2
kind: VirtualNode
metadata:
  name: search-v2
  namespace: search-cm
spec:
  podSelector:
    matchLabels:
      app: api-cm3
      version: v2
  listeners:
    - portMapping:
        port: 7000
        protocol: http
  serviceDiscovery:
    awsCloudMap:
      namespaceName: search.travelverse.local
      serviceName: api-cm3

---
apiVersion: appmesh.k8s.aws/v1beta2
kind: VirtualRouter
metadata:
  namespace: search-cm
  name: api-cm3
spec:
  listeners:
    - portMapping:
        port: 80
        protocol: http
  routes:
    - name: health-route
      priority: 11
      httpRoute:
        match:
          prefix: /health
        action:
          weightedTargets:
            - virtualNodeRef:
                name: search-v1
              weight: 1
    - name: health-route-header
      priority: 10
      httpRoute:
        match:
          prefix: /health
          headers:
            - name: health_header
              match:
                exact: blue
        action:
          weightedTargets:
            - virtualNodeRef:
                name: search-v2
              weight: 1
    - name: api-cm3-route
      priority: 10
      httpRoute:
        match:
          prefix: /search
        action:
          weightedTargets:
            - virtualNodeRef:
                name: search-v1
              weight: 1
            - virtualNodeRef:
                name: search-v2
              weight: 1

---
apiVersion: appmesh.k8s.aws/v1beta2
kind: VirtualService
metadata:
  name: api-cm3.search.travelverse.local
  namespace: search-cm
spec:
  awsName: api-cm3.search.travelverse.local
  provider:
    virtualRouter:
      virtualRouterRef:
        name: api-cm3
        namespace: search-cm
---

apiVersion: v1
kind: Service
metadata:
  name: api-cm3-search-v1
  namespace: search-cm
spec:
  ports:
    - port: 7000
      name: http
  selector:
    app: api-cm3
    version: v1
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: search-v1
  namespace: search-cm
spec:
  replicas: 1
  selector:
    matchLabels:
      app: api-cm3
      version: v1
  template:
    metadata:
      annotations:
        appmesh.k8s.aws/egressIgnoredPorts: "3306"
      labels:
        app: api-cm3
        version: v1
    spec:
      containers:
      - name: search-one
        image: xxx
        env:
        - name: NODE_ENV
          value: production
        - name: DB_HOST
          value: slave.search.travelverse.local
        - name: DB_USER
          value: root
        - name: DB_PASS
          value: CSlp6JwXGz
        - name: DB_NAME
          value: holidaycottages

        resources:
          requests:
            memory: "64Mi"
            cpu: "100m"
          limits:
            memory: "128Mi"
            cpu: "200m"
        imagePullPolicy: Always
        ports:
        - containerPort: 7000
      imagePullSecrets:
      - name: gitlab-auth-token
---
apiVersion: v1
kind: Service
metadata:
  name: api-cm3-search-v2
  namespace: search-cm
spec:
  ports:
    - port: 7000
      name: http
  selector:
    app: api-cm3
    version: v2
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: search-v2
  namespace: search-cm
spec:
  replicas: 1
  selector:
    matchLabels:
      app: api-cm3
      version: v2
  template:
    metadata:
      annotations:
        appmesh.k8s.aws/egressIgnoredPorts: "3306"
      labels:
        app: api-cm3
        version: v2
    spec:
      containers:
      - name: search-one
        image: xxx
        env:
        - name: NODE_ENV
          value: production
        - name: DB_HOST
          value: slave.search.travelverse.local
        - name: DB_USER
          value: root
        - name: DB_PASS
          value: CSlp6JwXGz
        - name: DB_NAME
          value: holidaycottages

        resources:
          requests:
            memory: "64Mi"
            cpu: "100m"
          limits:
            memory: "128Mi"
            cpu: "200m"
        imagePullPolicy: Always
        ports:
        - containerPort: 7000
      imagePullSecrets:
      - name: gitlab-auth-token

---
apiVersion: v1
kind: Service
metadata:
  name: api-cm3
  namespace: search-cm
  labels:
    app: api-cm3
spec:
  ports:
    - port: 7000
      name: http
  selector:
    app: api-cm3
---

apiVersion: appmesh.k8s.aws/v1beta2
kind: VirtualNode
metadata:
  name: curler
  namespace: search-cm
spec:
  podSelector:
    matchLabels:
      app: curler
  listeners:
    - portMapping:
        port: 8080
        protocol: http
  backends:
    - virtualService:
        virtualServiceRef:
          name: api-cm3.search.travelverse.local
  serviceDiscovery:
    awsCloudMap:
      namespaceName: search.travelverse.local
      serviceName: curler

---

apiVersion: v1
kind: Service
metadata:
  name: curler
  namespace: search-cm
spec:
  ports:
    - port: 8080
      name: http
  selector:
    app: curler
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: curler
  namespace: search-cm
spec:
  replicas: 1
  selector:
    matchLabels:
      app: curler
  template:
    metadata:
      labels:
        app: curler
    spec:
      containers:
      - name: blue-website
        image: scrapinghub/httpbin:latest
        command:
        - sleep
        - "3600"
        resources:
          requests:
            cpu: 0.1
            memory: 200Mi
bcelenza commented 4 years ago

From the Cloud Map version of the configuration you've shared, it looks like the virtual nodes search-v1 and search-v2 are indistinguishable from each other, because they both have exactly the same service discovery settings (namespaceName: search.travelverse.local, serviceName: api-cm3). This is not the case in your DNS example (api-search-v1.search.svc.cluster.local vs. api-search-v2.search.svc.cluster.local).

So what's happening is that, regardless of which weighted target Envoy selects, you're getting a mixture of responses from virtual nodes search-v1 and search-v2.

You can distinguish between the two of them by providing a unique Cloud Map attribute for each node, which should resolve this issue. For example:

---

apiVersion: appmesh.k8s.aws/v1beta2
kind: VirtualNode
metadata:
  name: search-v1
  namespace: search-cm
spec:
  podSelector:
    matchLabels:
      app: api-cm3
      version: v1
  listeners:
    - portMapping:
        port: 7000
        protocol: http
  serviceDiscovery:
    awsCloudMap:
      namespaceName: search.travelverse.local
      serviceName: api-cm3
      attributes:
        - key: version
          value: v1

---

apiVersion: appmesh.k8s.aws/v1beta2
kind: VirtualNode
metadata:
  name: search-v2
  namespace: search-cm
spec:
  podSelector:
    matchLabels:
      app: api-cm3
      version: v2
  listeners:
    - portMapping:
        port: 7000
        protocol: http
  serviceDiscovery:
    awsCloudMap:
      namespaceName: search.travelverse.local
      serviceName: api-cm3
      attributes:
        - key: version
          value: v2
benpettman commented 4 years ago

Thank you @bcelenza, this has fixed my issue. I was pulling my hair out, mainly because the examples that I had used and looked at (the DJ app and the color app in the aws-app-mesh-examples GitHub repo) do not mention this at all.

In fact I think I based my yaml above on the colorapp example that was recently updated for v1beta2, plus all the examples take a slightly different approach, which adds to the confusion.

Nevertheless, you have managed to help out and fix the problem, so hopefully others will find this useful.

One more question: if I were running multiple clusters, would they all need to be part of the same mesh in order for everything to work?

For example, can I have cluster 1 in its own mesh and cluster 2 in its own mesh, with them calling each other and their respective meshes routing the traffic? Or is that what the VirtualGateway that's in preview helps with?

bcelenza commented 4 years ago

> Thank you @bcelenza, this has fixed my issue. I was pulling my hair out, namely because the examples that I have used and looked at, dj app, color app in the aws-app-mesh-examples GitHub repo do not mention this at all.
>
> In fact I think I based my yaml above on the colorapp example that recently got updated for the beta2, plus all the examples have a slightly different approach which adds to the confusion.

We've recently upgraded the controller, so it's possible there were some mix-ups in our demos during the transition. I've published aws/aws-app-mesh-examples#313 to correct the current Cloud Map walkthrough.

> One more question, if I was running multiple clusters would they all need to be part of the same mesh in order for everything to work?
>
> For example can I have cluster 1 in its own mesh and cluster 2 in its own mesh and they call each other and their respective meshes route the traffic? Or is this what the VirtualGateway thats in preview help with?

While you can have multiple clusters that are part of the same mesh, App Mesh makes a lot of assumptions about L3/L4 connectivity and routing being in place, along with things like consistent service discovery between clusters. For example, if you had one mesh across two clusters, you'd likely want a single DNS source of truth for both clusters so you don't run into naming conflicts, and both clusters within a single VPC. As I'm sure you can imagine, a lot of complexity is added along the way. :)

Instead of dealing with all of that, we recommend the Virtual Gateway approach you mention. This way you can use the cluster boundary and gateways to present one or more interfaces to the world outside a given cluster, and control access to components within the cluster and mesh.

We're in the process of making the gateway generally available, so it's just around the corner. Stay tuned!
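For readers landing here later, here is a minimal sketch of what the v1beta2 Virtual Gateway resources might look like in this thread's search-cm namespace. The resource names, labels, and port are illustrative assumptions, not taken from an actual deployment:

```yaml
# Hypothetical sketch only: a Virtual Gateway fronting the api-cm3 virtual
# service. Resource names, labels, and ports are illustrative.
apiVersion: appmesh.k8s.aws/v1beta2
kind: VirtualGateway
metadata:
  name: ingress-gw
  namespace: search-cm
spec:
  podSelector:
    matchLabels:
      app: ingress-gw        # matches the Envoy gateway Deployment's pods
  listeners:
    - portMapping:
        port: 8088
        protocol: http
---
apiVersion: appmesh.k8s.aws/v1beta2
kind: GatewayRoute
metadata:
  name: api-route
  namespace: search-cm
spec:
  httpRoute:
    match:
      prefix: /
    action:
      target:
        virtualService:
          virtualServiceRef:
            name: api-cm3.search.travelverse.local
```

Traffic entering the gateway pods on the listener port would then be routed through the mesh according to the virtual router's rules.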

benpettman commented 4 years ago

Fantastic. Thank you.

Does this mean, then, that if I were to use a VG as the logical boundary to each cluster, I wouldn't need to use Cloud Map? I would just be calling each cluster's VG respectively.

bcelenza commented 4 years ago

That’s right. The decision to use Cloud Map or DNS for service discovery is largely a decision for within a cluster or mesh. From the perspective of a service in one cluster talking to a service in another, the Virtual Gateway provides the logical boundary, and you would use a load balancer in front of the Virtual Gateway to distribute traffic. So the services in Cluster A just need to know the IP address(es) of the load balancer in Cluster B that’s fronting the gateway. The load balancer itself would likely use DNS — either public or from your VPC’s private hosted zone, depending on whether your clusters are within the same network.
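As a sketch of the load-balancer side (again with illustrative names, assuming hypothetical gateway pods labeled app: ingress-gw), a Service of type LoadBalancer would give services in the other cluster a stable DNS name to call:

```yaml
# Illustrative only: exposes hypothetical gateway pods through an NLB so
# services in another cluster can reach the gateway by the NLB's DNS name.
apiVersion: v1
kind: Service
metadata:
  name: ingress-gw
  namespace: search-cm
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-type: nlb
spec:
  type: LoadBalancer
  ports:
    - port: 80
      targetPort: 8088   # the gateway's listener port
      name: http
  selector:
    app: ingress-gw
```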

benpettman commented 4 years ago

Cool. So the VirtualGateway will effectively allow a multi-mesh approach inside one or multiple VPCs, seeing as the gateway is a public load balancer.

So if I didn't want to do that and went for a multi-cluster, single-mesh approach instead, I would just need some kind of API gateway internal to the cluster that's part of the mesh, so traffic is routed via the mesh and adheres to the router rules etc.?

benpettman commented 4 years ago

I don't know if it's the new controller version, but when trying to follow the blog post that illustrates a cross-cluster implementation I get various errors. For example, after deploying the mesh and all its components to cluster 2, and then deploying the front end to cluster 1, I get errors from the mesh saying it requires a matching virtual node, which is deployed into cluster 2 as per the blog.

I have 2 clusters in 1 VPC. The CRDs and controller are deployed to both, and the mesh components (VirtualService, VirtualNode, Mesh, etc.) are deployed to cluster 2. Cluster 1 just has the namespace, service, and deployment.

So I can't get a curl working across clusters. Any pointers?

I would have thought that, by using the same everything, the controller would have pulled the mesh and components into cluster 1 and used them.

The current yaml in cluster one produces: Error creating: admission webhook "mpod.appmesh.k8s.aws" denied the request: sidecarInject enabled but no matching VirtualNode found

apiVersion: v1
kind: Namespace
metadata:
  labels:
    appmesh.k8s.aws/sidecarInjectorWebhook: enabled
    mesh: search-api-cm3
  name: search-cm
---
apiVersion: v1
kind: Service
metadata:
  name: curler
  namespace: search-cm
spec:
  ports:
    - port: 8080
      name: http
  selector:
    app: curler
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: curler
  namespace: search-cm
spec:
  replicas: 1
  selector:
    matchLabels:
      app: curler
  template:
    metadata:
      annotations:
        appmesh.k8s.aws/mesh: search-api-cm3
        appmesh.k8s.aws/virtualNode: curler
      labels:
        app: curler
    spec:
      containers:
      - name: blue-website
        image: scrapinghub/httpbin:latest
        command:
        - sleep
        - "3600"
        resources:
          requests:
            cpu: 0.1
            memory: 200Mi
bcelenza commented 4 years ago

> So if I didn’t want to do that and went for an approach of multi cluster in one mesh I would just need some kind of api gateway internal to the cluster that’s part of the mesh so traffic is routed via the mesh and adhering to the router rules etc?

Really you just need:

  1. Both clusters to operate on the same, flat network (VPC)
  2. One source of truth for service discovery (e.g. DNS or Cloud Map)
  3. App Mesh resources to be defined within the context of the cluster they are operating within.

> Current yaml in cluster one, produces Error creating: admission webhook "mpod.appmesh.k8s.aws" denied the request: sidecarInject enabled but no matching VirtualNode found

I think this is a mismatch between the version of the controller you're using and the example you've pulled from. In the v1, generally available version of the controller, annotations are no longer used on the Deployment resource, in favor of selectors on the Virtual Nodes / Virtual Gateways. Per (3) above, you'll want to make sure the App Mesh resources you're defining are applied via kubectl to the cluster they'll run on (so the selectors also know how to find the right pods to inject Envoy into). I see you've opened another issue regarding the cross-account/cluster demo, so it would be good to do any follow-up questions or concerns on that issue.
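To make the selector model concrete, here is a hedged sketch reusing this thread's curler names: with the v1 controller, the Deployment carries no App Mesh annotations, and the pairing is done entirely by labels:

```yaml
# v1-controller style: no appmesh.k8s.aws/mesh or virtualNode annotations on
# the Deployment; the VirtualNode's podSelector finds the pods by label.
# Both this resource and the Deployment must be applied to the cluster the
# pods run on, so the injection webhook can find a matching VirtualNode.
apiVersion: appmesh.k8s.aws/v1beta2
kind: VirtualNode
metadata:
  name: curler
  namespace: search-cm
spec:
  podSelector:
    matchLabels:
      app: curler          # must match the Deployment's pod template labels
  listeners:
    - portMapping:
        port: 8080
        protocol: http
  serviceDiscovery:
    awsCloudMap:
      namespaceName: search.travelverse.local
      serviceName: curler
```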

benpettman commented 4 years ago

Hi @bcelenza, I am not using a mismatched version; I am using v1, and the annotations are there in a vain attempt to get this working.

The documentation says neither to use nor not to use annotations, so I am clutching at straws.

Even without the annotations on the deployment I still get the same error.

Are you saying the mesh and virtual nodes need to be installed in both clusters? If so, how does that work, since the virtual node has a backend ref to a virtual service that runs in the other cluster? It's like pulling a thread: if I deploy the VN I would also need to deploy the VS, and then the entire deployment for what I'm trying to use, which then negates the point of a multi-cluster approach.

This is my reason for opening the other issue, as a lot of the other examples were upgraded to the latest version but that one wasn't.

So I'm seemingly struggling to get this to work, and I need to get it working or find another solution.

bcelenza commented 4 years ago

Sorry, my suggestion was just to move diagnosing and fixing the cross-cluster issue over to aws/aws-app-mesh-examples#314, since this issue was cut with regard to Cloud Map service discovery. I'd like to close this issue out and keep it about Cloud Map as much as possible, in case others come searching for similar things.

Would you mind if we continue the conversation about cross-cluster on your other issue?

bcelenza commented 4 years ago

Closing this issue. Please re-open if you need any more help with Cloud Map service discovery.