aws-controllers-k8s / community

AWS Controllers for Kubernetes (ACK) is a project enabling you to manage AWS services from Kubernetes
https://aws-controllers-k8s.github.io/community/
Apache License 2.0

InvalidSubnet error when CIDR range is available in AWS #1895

Open mattzech opened 1 year ago

mattzech commented 1 year ago

Describe the bug: Creating a subnet with a confirmed available CIDR range results in the following error message in the status of the subnet manifest:

2023-09-12T15:20:36.885Z        ERROR   Reconciler error        {"controller": "subnet", "controllerGroup": "ec2.services.k8s.aws", "controllerKind": "Subnet", "Subnet": {"name":"test-vpc-public-eu-west-1b","namespace":"netgen-test"}, "namespace": "netgen-test", "name": "test-vpc-public-eu-west-1b", "reconcileID": "9c90fa2c-6503-4dd2-a222-e223422db5e2", "error": "InvalidSubnet.Conflict: The CIDR '30.0.0.64/26' conflicts with another subnet\n\tstatus code: 400, request id: 5c08f6eb-1ee1-4645-84ef-208c8edc36b8"}
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
        /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.5/pkg/internal/controller/controller.go:329
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
        /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.5/pkg/internal/controller/controller.go:274
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
        /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.5/pkg/internal/controller/controller.go:235
2023-09-12T15:20:36.889Z        DEBUG   exporter.field-export-reconciler        error did not need requeue      {"error": "the source resource is not synced yet"}

Despite this error message, the subnet was successfully created in AWS.
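To rule out a genuine overlap before suspecting the controller, the requested CIDR can be checked against the CIDRs already present in the VPC. The snippet below is a minimal sketch using Python's standard `ipaddress` module; the `find_conflicts` helper and the list of existing CIDRs are hypothetical and not taken from the report.

```python
import ipaddress


def find_conflicts(candidate, existing):
    """Return the CIDRs in `existing` that overlap `candidate`."""
    cand = ipaddress.ip_network(candidate)
    return [c for c in existing if ipaddress.ip_network(c).overlaps(cand)]


# Hypothetical CIDRs already present in the VPC (assumed for illustration).
existing = ["30.0.0.0/26", "30.0.0.128/26"]

# Neighbouring /26 blocks do not overlap the requested range.
print(find_conflicts("30.0.0.64/26", existing))  # prints []

# A subnet that already uses the same range is reported as a conflict.
print(find_conflicts("30.0.0.64/26", existing + ["30.0.0.64/26"]))  # prints ['30.0.0.64/26']
```

If the second case applies, and the matching subnet is the one the controller itself just created, the conflict would point at a repeated create call rather than at an unavailable range.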

Steps to reproduce: Happens randomly, with a different subnet CIDR range each time.

Expected outcome: All subnets are created successfully, with the subnet ID in the status.

Environment

a-hilaly commented 1 year ago

Thank you for reporting this @mattzech. Do you have any earlier logs showing what change triggered the update call? They might be visible if you enable developmentLogging and set the logging level to debug. /cc @aws-controllers-k8s/ec2-maintainer

mattzech commented 1 year ago

Hi @a-hilaly, thanks for the response. This error occurred on creation of the resource, so there was no triggering update (to my knowledge). I can look into setting the log level to see if I can find more info for you.

mattzech commented 1 year ago

These are the logs leading up to the error; not sure if you know what to make of them, @a-hilaly.

2023-09-12T16:56:15.232Z    DEBUG   ackrt   > r.Sync    {"account": "****************", "role": "arn:aws:iam::****************:role/gkop-ack-controllers", "region": "eu-west-1", "kind": "Subnet", "namespace": "netgen-test", "name": "test-vpc-public-eu-west-1b", "generation": 3}
2023-09-12T16:56:15.232Z    DEBUG   ackrt   >> r.resetConditions    {"account": "****************", "role": "arn:aws:iam::****************:role/gkop-ack-controllers", "region": "eu-west-1", "kind": "Subnet", "namespace": "netgen-test", "name": "test-vpc-public-eu-west-1b", "generation": 3}
2023-09-12T16:56:15.232Z    DEBUG   ackrt   << r.resetConditions    {"account": "****************", "role": "arn:aws:iam::****************:role/gkop-ack-controllers", "region": "eu-west-1", "kind": "Subnet", "namespace": "netgen-test", "name": "test-vpc-public-eu-west-1b", "generation": 3}
2023-09-12T16:56:15.232Z    DEBUG   ackrt   >> rm.ResolveReferences {"account": "****************", "role": "arn:aws:iam::****************:role/gkop-ack-controllers", "region": "eu-west-1", "kind": "Subnet", "namespace": "netgen-test", "name": "test-vpc-public-eu-west-1b", "is_adopted": false, "generation": 3}
2023-09-12T16:56:15.240Z    DEBUG   ackrt   << rm.ResolveReferences {"account": "****************", "role": "arn:aws:iam::****************:role/gkop-ack-controllers", "region": "eu-west-1", "kind": "Subnet", "namespace": "netgen-test", "name": "test-vpc-public-eu-west-1b", "is_adopted": false, "generation": 3}
2023-09-12T16:56:15.240Z    DEBUG   ackrt   >> rm.EnsureTags    {"account": "****************", "role": "arn:aws:iam::****************:role/gkop-ack-controllers", "region": "eu-west-1", "kind": "Subnet", "namespace": "netgen-test", "name": "test-vpc-public-eu-west-1b", "is_adopted": false, "generation": 3}
2023-09-12T16:56:15.240Z    DEBUG   ackrt   << rm.EnsureTags    {"account": "****************", "role": "arn:aws:iam::****************:role/gkop-ack-controllers", "region": "eu-west-1", "kind": "Subnet", "namespace": "netgen-test", "name": "test-vpc-public-eu-west-1b", "is_adopted": false, "generation": 3}
2023-09-12T16:56:15.240Z    DEBUG   ackrt   >> rm.ReadOne   {"account": "****************", "role": "arn:aws:iam::****************:role/gkop-ack-controllers", "region": "eu-west-1", "kind": "Subnet", "namespace": "netgen-test", "name": "test-vpc-public-eu-west-1b", "is_adopted": false, "generation": 3}
2023-09-12T16:56:15.240Z    DEBUG   ackrt   >>> rm.sdkFind  {"account": "****************", "role": "arn:aws:iam::****************:role/gkop-ack-controllers", "region": "eu-west-1", "kind": "Subnet", "namespace": "netgen-test", "name": "test-vpc-public-eu-west-1b", "is_adopted": false, "generation": 3}
2023-09-12T16:56:15.240Z    DEBUG   ackrt   <<< rm.sdkFind  {"account": "****************", "role": "arn:aws:iam::****************:role/gkop-ack-controllers", "region": "eu-west-1", "kind": "Subnet", "namespace": "netgen-test", "name": "test-vpc-public-eu-west-1b", "is_adopted": false, "generation": 3, "error": "resource not found"}
2023-09-12T16:56:15.240Z    DEBUG   ackrt   << rm.ReadOne   {"account": "****************", "role": "arn:aws:iam::****************:role/gkop-ack-controllers", "region": "eu-west-1", "kind": "Subnet", "namespace": "netgen-test", "name": "test-vpc-public-eu-west-1b", "is_adopted": false, "generation": 3, "error": "resource not found"}
2023-09-12T16:56:15.240Z    DEBUG   ackrt   >> r.createResource {"account": "****************", "role": "arn:aws:iam::****************:role/gkop-ack-controllers", "region": "eu-west-1", "kind": "Subnet", "namespace": "netgen-test", "name": "test-vpc-public-eu-west-1b", "is_adopted": false, "generation": 3}
2023-09-12T16:56:15.240Z    DEBUG   ackrt   >>> rm.Create   {"account": "****************", "role": "arn:aws:iam::****************:role/gkop-ack-controllers", "region": "eu-west-1", "kind": "Subnet", "namespace": "netgen-test", "name": "test-vpc-public-eu-west-1b", "is_adopted": false, "generation": 3}
2023-09-12T16:56:15.240Z    DEBUG   ackrt   >>>> rm.sdkCreate   {"account": "****************", "role": "arn:aws:iam::****************:role/gkop-ack-controllers", "region": "eu-west-1", "kind": "Subnet", "namespace": "netgen-test", "name": "test-vpc-public-eu-west-1b", "is_adopted": false, "generation": 3}
2023-09-12T16:56:15.310Z    DEBUG   ackrt   <<<< rm.sdkCreate   {"account": "****************", "role": "arn:aws:iam::****************:role/gkop-ack-controllers", "region": "eu-west-1", "kind": "Subnet", "namespace": "netgen-test", "name": "test-vpc-public-eu-west-1b", "is_adopted": false, "generation": 3, "error": "InvalidSubnet.Conflict: The CIDR '30.0.0.64/26' conflicts with another subnet\n\tstatus code: 400, request id: dde03739-2a54-4d74-8842-dc11d703f078"}
2023-09-12T16:56:15.310Z    DEBUG   ackrt   <<< rm.Create   {"account": "****************", "role": "arn:aws:iam::****************:role/gkop-ack-controllers", "region": "eu-west-1", "kind": "Subnet", "namespace": "netgen-test", "name": "test-vpc-public-eu-west-1b", "is_adopted": false, "generation": 3, "error": "InvalidSubnet.Conflict: The CIDR '30.0.0.64/26' conflicts with another subnet\n\tstatus code: 400, request id: dde03739-2a54-4d74-8842-dc11d703f078"}
2023-09-12T16:56:15.310Z    DEBUG   ackrt   << r.createResource {"account": "****************", "role": "arn:aws:iam::****************:role/gkop-ack-controllers", "region": "eu-west-1", "kind": "Subnet", "namespace": "netgen-test", "name": "test-vpc-public-eu-west-1b", "is_adopted": false, "generation": 3, "error": "InvalidSubnet.Conflict: The CIDR '30.0.0.64/26' conflicts with another subnet\n\tstatus code: 400, request id: dde03739-2a54-4d74-8842-dc11d703f078"}
2023-09-12T16:56:15.310Z    DEBUG   ackrt   >> r.ensureConditions   {"account": "****************", "role": "arn:aws:iam::****************:role/gkop-ack-controllers", "region": "eu-west-1", "kind": "Subnet", "namespace": "netgen-test", "name": "test-vpc-public-eu-west-1b", "is_adopted": false, "generation": 3}
2023-09-12T16:56:15.310Z    DEBUG   ackrt   >>> rm.IsSynced {"account": "****************", "role": "arn:aws:iam::****************:role/gkop-ack-controllers", "region": "eu-west-1", "kind": "Subnet", "namespace": "netgen-test", "name": "test-vpc-public-eu-west-1b", "is_adopted": false, "generation": 3}
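The trace above is easier to follow once each line is reduced to its call name, nesting depth, and error: `rm.ReadOne` exits with "resource not found", after which `rm.sdkCreate` exits with the `InvalidSubnet.Conflict` error. The following is a rough sketch of such a reduction, assuming the line layout seen in this thread (timestamp, `DEBUG`, `ackrt`, arrows, call name, JSON payload). The `summarize` helper is hypothetical, and the sample lines are trimmed copies of the ones above.

```python
import json
import re

# Assumed layout: <timestamp> DEBUG ackrt <arrows> <call> <json payload>
LINE_RE = re.compile(r"^(\S+)\s+DEBUG\s+ackrt\s+([<>]+)\s+(\S+)\s+(\{.*\})\s*$")


def summarize(lines):
    """Return (timestamp, depth, direction, call, error) for each ackrt trace line."""
    rows = []
    for line in lines:
        match = LINE_RE.match(line)
        if not match:
            continue  # skip anything that is not an ackrt trace line
        timestamp, arrows, call, payload = match.groups()
        fields = json.loads(payload)
        direction = "enter" if arrows[0] == ">" else "exit"
        rows.append((timestamp, len(arrows), direction, call, fields.get("error")))
    return rows


# Trimmed sample lines; the real payloads carry more fields.
sample = [
    r'2023-09-12T16:56:15.240Z    DEBUG   ackrt   >>> rm.sdkFind  {"kind": "Subnet", "generation": 3}',
    r'2023-09-12T16:56:15.240Z    DEBUG   ackrt   <<< rm.sdkFind  {"kind": "Subnet", "generation": 3, "error": "resource not found"}',
]

for row in summarize(sample):
    print(row)
```

Filtering the result for rows whose last element is not `None` gives the list of calls that returned an error.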

a-hilaly commented 1 year ago

@mattzech Thank you for providing more logs/info. If I understand correctly, the API returns a validation error but still creates the subnet? If that's the case, then we'll have to tweak the error handling in the create call.

a-hilaly commented 1 year ago

Is the subnet still operational even if its CIDR conflicts with another subnet? Quoting: "InvalidSubnet.Conflict: The CIDR '30.0.0.64/26' conflicts with another subnet"

mattzech commented 1 year ago

@a-hilaly Yes, I definitely agree that it seems like more of an error handling issue. The subnet is operational in AWS and the creation works fine if I manually create it in the console.
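One possible shape for the error-handling tweak discussed above is to treat a conflict on create as "may already exist": look for a subnet with exactly the same VPC and CIDR, and re-raise only if none is found. This is a sketch of that idea, not the ACK controller's actual code. `FakeEC2`, `ConflictError`, and `ensure_subnet` are hypothetical stand-ins, and the fake client only raises on an exact match, whereas the real API also rejects partial overlaps.

```python
class ConflictError(Exception):
    """Hypothetical stand-in for the InvalidSubnet.Conflict API error."""


class FakeEC2:
    """In-memory stand-in for an EC2 client; not a real SDK interface."""

    def __init__(self):
        self.subnets = {}  # (vpc_id, cidr) -> subnet_id

    def create_subnet(self, vpc_id, cidr):
        if (vpc_id, cidr) in self.subnets:
            raise ConflictError(f"The CIDR '{cidr}' conflicts with another subnet")
        subnet_id = f"subnet-{len(self.subnets) + 1:04d}"
        self.subnets[(vpc_id, cidr)] = subnet_id
        return subnet_id

    def find_subnet(self, vpc_id, cidr):
        return self.subnets.get((vpc_id, cidr))


def ensure_subnet(client, vpc_id, cidr):
    """Create the subnet, or return the ID of an exact existing match on conflict."""
    try:
        return client.create_subnet(vpc_id, cidr)
    except ConflictError:
        existing = client.find_subnet(vpc_id, cidr)
        if existing is None:
            raise  # no exact match: a genuine overlap with a different subnet
        return existing


client = FakeEC2()
first = ensure_subnet(client, "vpc-example", "30.0.0.64/26")
second = ensure_subnet(client, "vpc-example", "30.0.0.64/26")  # retried create
print(first == second)  # prints True
```

A real controller would likely also need to confirm that the matching subnet is one it owns (for example, by comparing tags) before adopting its ID, since an exact CIDR match alone does not prove who created it.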

ack-bot commented 8 months ago

Issues go stale after 180d of inactivity. Mark the issue as fresh with /remove-lifecycle stale. Stale issues rot after an additional 60d of inactivity and eventually close. If this issue is safe to close now please do so with /close. Provide feedback via https://github.com/aws-controllers-k8s/community. /lifecycle stale

gecube commented 8 months ago

/remove-lifecycle stale

ack-bot commented 2 months ago

Issues go stale after 180d of inactivity. Mark the issue as fresh with /remove-lifecycle stale. Stale issues rot after an additional 60d of inactivity and eventually close. If this issue is safe to close now please do so with /close. Provide feedback via https://github.com/aws-controllers-k8s/community. /lifecycle stale

gecube commented 2 months ago

/remove-lifecycle stale