pulumi / pulumi-aws-native

AWS Native Provider for Pulumi
Apache License 2.0

AWS::EC2::SubnetRouteTableAssociation creation is flaky #1714

Closed by corymhall 1 month ago

corymhall commented 1 month ago

What happened

Creation of AWS::EC2::SubnetRouteTableAssociation resources is flaky. After a resource is created, the provider makes a follow-up GetResource call to populate the resource's attributes. When creating a SubnetRouteTableAssociation, that GetResource call sometimes fails:

 error: creating resource: reading resource state: operation error CloudControl: GetResource, https response error StatusCode: 400, RequestID: 6e068a90-5b8a-437a-9041-3863ca915c87, ResourceNotFoundException: AWS::EC2::SubnetRouteTableAssociation Handler returned status FAILED: No route tables Found with association rtbassoc-057e9d978d69d1620 (HandlerErrorCode: NotFound, RequestToken: 9dd80ab7-03d2-40fb-8c4f-52b50e19354f)

On a subsequent pulumi up the error goes away and the deployment completes successfully.

Example

This shows up intermittently in the pulumi-cdk tests that create a Vpc.

Example tests

flostadler commented 1 month ago

I'm quite sure we're getting hit by eventual consistency here.

Right now we're doing the following in code:

Depending on how the service implements reads (IIRC almost all are eventually consistent), we're going to run into issues every now and then. My guess is that networking-related and IAM-related resources are affected the most, as they have the highest propagation delay.

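What we should do here is add a retry-with-backoff mechanism for the case where the Cloud Control (CC) operation succeeded but the subsequent Read call returns a not-found error (reported as StatusCode 400 with ResourceNotFoundException in the log above, not a literal 404).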
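A minimal sketch of that idea, written in Python for illustration only. It is not the provider's actual code. The names read_with_backoff and ResourceNotFound, and the default attempt count and delays, are assumptions made up for this example. ResourceNotFound stands in for the ResourceNotFoundException that GetResource returns.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class ResourceNotFound(Exception):
    """Hypothetical stand-in for Cloud Control's ResourceNotFoundException."""


def read_with_backoff(
    read: Callable[[], T],
    max_attempts: int = 5,       # assumed default, not taken from the provider
    base_delay: float = 0.5,     # seconds before the first retry (assumed)
    max_delay: float = 8.0,      # upper bound on a single wait (assumed)
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call read() until it stops raising ResourceNotFound or attempts run out.

    Intended for the window right after a successful create, where an
    eventually consistent read may not see the new resource yet.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    delay = base_delay
    for attempt in range(1, max_attempts + 1):
        try:
            return read()
        except ResourceNotFound:
            # Only not-found is retried; any other error propagates at once.
            if attempt == max_attempts:
                raise
            sleep(delay)
            # Double the wait each time, capped at max_delay.
            delay = min(delay * 2, max_delay)
    raise AssertionError("unreachable")
```

Only the not-found case is retried, and only because the preceding create is known to have succeeded. A not-found on an ordinary refresh still has to mean the resource is gone, so a wrapper like this should not be applied to every read. The sleep parameter is injectable so the behaviour can be tested without real waits.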

corymhall commented 1 month ago

Just realized that we already had an issue for this https://github.com/pulumi/pulumi-aws-native/issues/1186

pulumi-bot commented 1 month ago

Cannot close issue:

Please fix these problems and try again.