aws / aws-app-mesh-roadmap

AWS App Mesh is a service mesh that you can use with your microservices to manage service to service communication
Apache License 2.0
347 stars 25 forks source link

Virtual Gateway name must be unique across the EKS cluster #316

Closed oztimpower closed 3 years ago

oztimpower commented 3 years ago

SECURITY NOTICE: If you think you’ve found a potential security issue, please do not post it in the Issues. Instead, please follow the instructions here or email AWS security directly.

Summary What are you observing that doesn't seem right?

When I was applying K8s manifests to create the virtual gateway and gateway routes, I would experience a duplicate virtual gateway (found 2 instead of 1) error from the web hook, as below. For my namespace I had only one virtual gateway of this name (obviously...), but I did have one with the same name in another namespace.

Error from server (found multiple matching virtualGateways for namespace: trade, expecting 1 but found 2: gw,gw): error when creating "src/main/kubernetes/mesh.yml": admission webhook "mgatewayroute.appmesh.k8s.aws" denied the request: found multiple matching virtualGateways for namespace: trade, expecting 1 but found 2: gw,gw

Steps to Reproduce What are the steps you can take to reproduce this issue?

Create a namespace, create a mesh, and virtual gateway, and gateway routes etc. All unique names, and this will work, same as the aws-samples/howto-k8s-ingress-gateway.

In a 2nd namespace, when you use the same name for the virtual gateway, and try to create a gateway route, the web hook (above) will generate this duplicate lookup issue, despite the error message saying it is looking in a particular namespace, the lookup appears to find all occurrences across all namespaces. The error is trigger when creating gateway routes. The creation of a virtual gateway with the same name in another namespace works fine (and creates a virtual gateway as name of virtual gateway_namespace); the lookup however doesn't appear to honour this, when creating gateway routes. It is the gateway route that fails with the above error.

Are you currently working around this issue? How are you currently solving this problem?

Additional context Anything else we should know?

More helpful error messages please! When you get 0 virtual gateways found, it should check if the namespace has the label gateway=mygateway or not. When you get the duplicate as above, the error message should also show the fully qualified name, or the AWS name so you could at least quickly resolve such issues.

Attachments If you think you might have additional information that you'd like to include via an attachment, please do - we'll take a look. (Remember to remove any personally-identifiable information.)

rajal-amzn commented 3 years ago

AppMesh controller auto-generates awsName based on Namespace name + resource name and register that with AppMesh. Unless you are defining the awsName in the CRD, this should not happen.

@oztimpower Could you share the CRD you used to create the Virtual Gateways here? That would help us reproduce the issue on our end.

oztimpower commented 3 years ago

Issue is not with the fact that the VGW name can be unique across namespaces, it's the lookup done by the web hook when applying a gateway route that's the issue. Yes, you can create VGWs of the same name in multiple namespaces because the AWS name = gwname_namespace. However the web hook lookup, doesn't honour this and just looks up VGW name (without the namespace appended), not the AWS name which is concatenated gwname_namespace. The error message is explicit that is sees 2 GWs, one in namespace 1, the other in namespace 2.

CRD is based on this version, installed per instructions and unmodified for the VirtualGateway and GatewayRoute.

"apiVersion": "apiextensions.k8s.io/v1beta1",
oztimpower commented 3 years ago

From the AppMesh walkthroughs (Color app), you can use the YML: https://github.com/aws/aws-app-mesh-examples/blob/master/walkthroughs/howto-k8s-ingress-gateway/v1beta2/manifest.yaml.template

Deploy this to more than 1 namespace, the first time it will pass, in the second namespace it will fail on gateway route due to the presence of 2 VGWs, one in each namespace.

Steps to recreate:

  1. Create Fargate Profile, with two namespaces: demo & demo1
  2. Create environment variables used by the YML (or manual replace)
    export APP_NAMESPACE=demo
    export MESH_NAME=$APP_NAMESPACE

    Replace the envs in the YML file.

Or per the deploy.sh automation:

eval "cat <<EOF
$(<${DIR}/${MANIFEST_VERSION}/manifest.yaml.template)
EOF
" >${EXAMPLES_OUT_DIR}/manifest.yaml
  1. kubectl apply -f myfile

Apply the below YML (from the above link, AWS walkthroughs). You only need the declarations Namespace, Mesh, VGW and GW Route, so can skip the rest (VN, VR, SVC, Deploy).

  1. See the GW Route succeed for the 1st namespace, and fail for the second namespace: repeat for ns = demo & demo1

  2. Below error will be generated when trying to create the GW Route:

    Error from server (found multiple matching virtualGateways for namespace: demo1, expecting 1 but found 2: ingress-gw,ingress-gw): error when creating "/tmp/gwr-demo.yml": admission webhook "mgatewayroute.appmesh.k8s.aws" denied the request: found multiple matching virtualGateways for namespace: demo1, expecting 1 but found 2: ingress-gw,ingress-gw

YML from AWS walkthrough:

---
apiVersion: v1
kind: Namespace
metadata:
  name: ${APP_NAMESPACE}
  labels:
    mesh: ${MESH_NAME}
    gateway: ingress-gw
    appmesh.k8s.aws/sidecarInjectorWebhook: enabled
---
apiVersion: appmesh.k8s.aws/v1beta2
kind: Mesh
metadata:
  name: ${MESH_NAME}
spec:
  namespaceSelector:
    matchLabels:
      mesh: ${MESH_NAME}
---
apiVersion: appmesh.k8s.aws/v1beta2
kind: VirtualGateway
metadata:
  name: ingress-gw
  namespace: ${APP_NAMESPACE}
spec:
  namespaceSelector:
    matchLabels:
      gateway: ingress-gw
  podSelector:
    matchLabels:
      app: ingress-gw
  listeners:
    - portMapping:
        port: 8088
        protocol: http
---
apiVersion: appmesh.k8s.aws/v1beta2
kind: GatewayRoute
metadata:
  name: gateway-route-headers
  namespace: ${APP_NAMESPACE}
spec:
  httpRoute:
    match:
      prefix: "/headers"
    action:
      target:
        virtualService:
          virtualServiceRef:
            name: color-headers
---
apiVersion: appmesh.k8s.aws/v1beta2
kind: GatewayRoute
metadata:
  name: gateway-route-paths
  namespace: ${APP_NAMESPACE}
spec:
  httpRoute:
    match:
      prefix: "/paths"
    action:
      target:
        virtualService:
          virtualServiceRef:
            name: color-paths
achevuru commented 3 years ago

@oztimpower Error log appears to complain about 2 VGs in namespace "demo1". Did you check the resources under ns demo1?

found multiple matching virtualGateways for namespace: demo1, expecting 1 but found 2: ingress-gw,ingress-gw.

oztimpower commented 3 years ago

Dude, there is one ingress-gw in namespace demo, and one in namespace demo1, i.e. AwsName=ingress-gw_demo, AwsName=ingress-gw_demo1.

There is a bug in the lookup.......jeepers.

achevuru commented 3 years ago

As discussed, GatewayRoute->VirtualGateway matching is done via namespace selectors and is the reason why VirtualGateway spec portion has "namespace selector" option. So, essentially it found 2 different VirtualGateways under namespaces with the same namespace selector resulting in the above issue.

https://github.com/aws/aws-app-mesh-examples/blob/master/walkthroughs/howto-k8s-ingress-gateway/v1beta2/manifest.yaml.template#L26 https://github.com/aws/aws-app-mesh-controller-for-k8s/blob/master/pkg/virtualgateway/membership_designator.go#L89

achevuru commented 3 years ago

We will enhance the docs to make it more obvious and clear.

oztimpower commented 3 years ago

Thanks, that would be very beneficial, since there is no indication (or examples) on the documentation for EKS. Validation to prevent multiple GWs with the same name would make sense also, since the corresponding namespace selector will always fail?

As an aside, I do find it illogical that you must specify the gateway on the Namespace (particularly if it's not selected, i.e. my GW is in this namespace, why are you considering other namespaces?).

The CRD spec GatewayRoute would benefit from changing the GW property from read-only to writeable (optional) so that it can be specific. IMHO it makes far more sense to name the GW on the GWR rather than rely on some magic selector that is doing something very different to what is expected (i.e. prioritising the GW lookup on the specific namespace, and ignore other namespaces, on first try, if none in the namespace, look at the other namespaces).