aws-controllers-k8s / community

AWS Controllers for Kubernetes (ACK) is a project enabling you to manage AWS services from Kubernetes
https://aws-controllers-k8s.github.io/community/
Apache License 2.0
2.41k stars 254 forks source link

Circular dependency while creating RouteTable and a VPCEndpoint #1817

Open nishant221 opened 1 year ago

nishant221 commented 1 year ago

Describe the bug There is a circular dependency while creating a new RouteTable and a VPCEndpoint. VPCEndpoint can't be created without attaching it to a RouteTable and a RouteTable can't be created without defining VPCEndpoint that it will attach to.

Steps to reproduce The YAML to create VPC,RouteTable & VPCEndpoint with circular dependency:

apiVersion: ec2.services.k8s.aws/v1alpha1
kind: VPC
metadata:
  name: circular-dependency-vpc
spec:
  cidrBlocks:
  - 10.10.0.0/16
  tags:
    - key: Name
      value: circular-dependency-vpc
---
apiVersion: ec2.services.k8s.aws/v1alpha1
kind: RouteTable
metadata:
  name: circular-dependency-rtb
spec:
  routes:
    - vpcEndpointRef:
        from:
          name:  circular-dependency-vpc-endpoint-s3
  vpcRef:
    from:
      name: circular-dependency-vpc
---
apiVersion: ec2.services.k8s.aws/v1alpha1
kind: VPCEndpoint
metadata:
  name:  circular-dependency-vpc-endpoint-s3
spec:
  serviceName: com.amazonaws.us-west-2.s3
  vpcRef:
    from:
      name: circular-dependency-vpc
  routeTableRefs:
  - from:
      name: circular-dependency-rtb
  tags:
    - key: Name
      value:  circular-dependency-vpc-endpoint-s3

RouteTable and VPCEndpoint creation will stuck waiting for each other. Output below

$ kubectl get VPCEndpoint circular-dependency-vpc-endpoint-s3 -o json | jq .status
{
  "conditions": [
    {
      "lastTransitionTime": "2023-06-07T10:34:42Z",
      "message": "Reference resolution failed",
      "reason": "the referenced resource is not synced yet. resource:RouteTable, namespace:default, name:circular-dependency-rtb",
      "status": "Unknown",
      "type": "ACK.ReferencesResolved"
    }
  ]
}

$ kubectl get routetables circular-dependency-rtb -o json | jq .status
{
  "conditions": [
    {
      "lastTransitionTime": "2023-06-07T10:34:41Z",
      "message": "Reference resolution failed",
      "reason": "the referenced resource is not synced yet. resource:VPCEndpoint, namespace:default, name:circular-dependency-vpc-endpoint-s3",
      "status": "Unknown",
      "type": "ACK.ReferencesResolved"
    }
  ]
}

Expected outcome I would expect the RouteTable and VPCEndpoint to be created.

Environment

Kubernetes version v1.24.13-eks-0a21954 Using EKS (yes/no), if so version? yes AWS service targeted (S3, RDS, etc.) ec2-controller v1.0.2

RedbackThomson commented 1 year ago

Ah I knew this would happen some day. The tricky part is that we have no method of doing a "partial create" - that is, we can't create the RouteTable without also creating its associated routes.

Could you help me understand this use case a bit - I haven't worked with PrivateLink much before. Why do we need to provide a list of route tables to the VPCEndpoint resource?

mtougeron commented 1 year ago

An example is creating the endpoint to S3. https://docs.aws.amazon.com/vpc/latest/privatelink/vpc-endpoints-s3.html If this is the wrong approach/method to do this in the ACK please let us know. We're also happy to contribute if this is something that needs to be created in a different controller or something.

The "partial create" was the sticking point we ran into and why we didn't already try to commit a patch. :(

mtougeron commented 1 year ago

I should clarify, by "we" I mean @nishant221 & I (or more appropriately someone at Adobe)...

RedbackThomson commented 1 year ago

From the tutorial you linked:

For Route tables, select the route tables to be used by the endpoint. We automatically add a route that points traffic destined for the service to the endpoint network interface.

So it sounds like the VPCEndpoint configuration would create the route in the RouteTable object. However, that type of configuration isn't compatible with ACK because it would be modifying the spec of RouteTable outside the K8s context. Instead, I would recommend that you add the route in the RouteTable spec and you exclude the reference to it from the VPCEndpoint. Hopefully that works

mtougeron commented 1 year ago

Okay, after some testing, the correct manner in which to do this is to add the reference in the VPCEndpoint and it will automatically be added to the RouteTable. Unfortunately the reconcile process then tries to remove it from the RouteTable rules resulting in the inability to make any additional changes to the routes.

2023-06-12T20:32:26.350Z    ERROR   Reconciler error    {"controller": "routetable", "controllerGroup": "ec2.services.k8s.aws", "controllerKind": "RouteTable", "RouteTable": {"name":"tgetst-sbx-or2-k8s-private-a","namespace":"sbx-clusters"}, "namespace": "sbx-clusters", "name": "tgetst-sbx-or2-k8s-private-a", "reconcileID": "3ad89936-d161-4095-ac2e-8ed82e5a93e1", "error": "InvalidParameterValue: cannot remove VPC endpoint route pl-68a54001 in route table rtb-07fd763182a66e95c\n\tstatus code: 400, request id: 6eb005b8-2b91-4913-acdb-8d86e93cdb3a"}

What do you think of modifying the RouteTable reconcile process to ignore these routes if the DestinationPrefixListId has an OwnerId of AWS? e.g.,

$> aws ec2 describe-managed-prefix-lists --prefix-list-ids pl-68a54001 --region us-west-2
{
    "PrefixLists": [
        {
            "PrefixListId": "pl-68a54001",
            "AddressFamily": "IPv4",
            "State": "create-complete",
            "PrefixListArn": "arn:aws:ec2:us-west-2:aws:prefix-list/pl-68a54001",
            "PrefixListName": "com.amazonaws.us-west-2.s3",
            "Tags": [],
            "OwnerId": "AWS"
        }
    ]
}
RedbackThomson commented 1 year ago

Okay, after some testing, the correct manner in which to do this is to add the reference in the VPCEndpoint and it will automatically be added to the RouteTable. Unfortunately the reconcile process then tries to remove it from the RouteTable rules resulting in the inability to make any additional changes to the routes.

This is exactly why I recommended we add the route manually to the RouteTable rather than adding it as a configuration of VPCEndpoint.

RedbackThomson commented 1 year ago

What do you think of modifying the RouteTable reconcile process to ignore these routes if the DestinationPrefixListId has an OwnerId of AWS? e.g.,

That seems like a valid change - any route created by AWS isn't configurable to the user. Unfortunately for now, we don't have a way to partially patch a list or a map within the custom resource. This is a limitation of the underlying library that all controllers use to update K8s resources. So we'd need a custom bit of logic to determine the difference and apply the changes with a full replacement.

mtougeron commented 1 year ago

This is exactly why I recommended we add the route manually to the RouteTable rather than adding it as a configuration of VPCEndpoint.

yeah, unfortunately AWS doesn't allow it to happen that way. Even the portal/cli doesn't allow the association to happen from the RouteTable to the VPCEndpoint.

So we'd need a custom bit of logic to determine the difference and apply the changes with a full replacement.

yup, that's what I was thinking I'd add. I'll see about what I can do to put in a patch for it.

Thanks for your input/assistance!

a-hilaly commented 1 year ago

@nishant221 can we close this issue? i see you sent a fix patch

ack-bot commented 9 months ago

Issues go stale after 180d of inactivity. Mark the issue as fresh with /remove-lifecycle stale. Stale issues rot after an additional 60d of inactivity and eventually close. If this issue is safe to close now please do so with /close. Provide feedback via https://github.com/aws-controllers-k8s/community. /lifecycle stale

gecube commented 7 months ago

/remove-lifecycle stale

ack-bot commented 1 month ago

Issues go stale after 180d of inactivity. Mark the issue as fresh with /remove-lifecycle stale. Stale issues rot after an additional 60d of inactivity and eventually close. If this issue is safe to close now please do so with /close. Provide feedback via https://github.com/aws-controllers-k8s/community. /lifecycle stale

gecube commented 1 month ago

/remove-lifecycle stale