radius-project / radius

Radius is a cloud-native, portable application platform that makes app development easier for teams building cloud-native apps.
https://radapp.io
Apache License 2.0

Flaky test: Test_AWS_S3Bucket_Existing #7996

Open kachawla opened 1 week ago

kachawla commented 1 week ago

Steps to reproduce

The test Test_AWS_S3Bucket_Existing failed during a scheduled run: https://github.com/radius-project/radius/actions/runs/11289113473/job/31398477292. We don't have a reliable way to reproduce it yet. We should look into the code path that creates AWS resources via UCP and check whether it performs a read-after-write operation that needs to be made more resilient through retries.

Observed behavior

Test_AWS_S3Bucket_Existing failed on its first step, which deploys a Bicep file (aws-s3-bucket.bicep) to create a new S3 bucket. The error returned was NotFound for the bucket, which is unexpected because the operation is creating the resource.

Error logs:

"target": "/planes/aws/aws/accounts/***/regions/us-west-2/providers/AWS.S3/Bucket/radiusfunctionaltestbucket-add9d5d6-80c6-4683-98c9-d1f914a9b272"
    cli.go:341: [rad]     }
    cli.go:341: [rad]   ]
    cli.go:341: [rad] }
    cli.go:341: [rad] 
    cli.go:341: [rad] TraceId:  dc92927587eb7f4b6c4adaadc0b85914
    cli.go:341: [rad] 
    cli.go:341: [rad] 
    deployexecutor.go:83: 
            Error Trace:    /home/runner/work/radius/radius/test/step/deployexecutor.go:83
                                        /home/runner/work/radius/radius/test/rp/rptest.go:392
            Error:          Received unexpected error:
                            code DeploymentFailed: err At least one resource deployment operation failed. Please see the details for the specific operation that failed.
            Test:           Test_AWS_S3Bucket_Existing/deploy_testdata/aws-s3-bucket.bicep
            Messages:       failed to deploy deploy testdata/aws-s3-bucket.bicep
    --- FAIL: Test_AWS_S3Bucket_Existing/deploy_testdata/aws-s3-bucket.bicep (45.83s)

Test run and its artifacts: https://github.com/radius-project/radius/actions/runs/11289113473

Desired behavior

The test should always pass.

Workaround

It passed on subsequent runs.

rad Version

N/A

Operating system

No response

Additional context

No response


AB#13463

radius-triage-bot[bot] commented 1 week ago

:wave: @kachawla Thanks for filing this bug report.

A project maintainer will review this report and get back to you soon. If you'd like immediate help troubleshooting, please visit our Discord server.

For more information on our triage process please visit our triage overview

kachawla commented 1 week ago

The same failure occurred in another run: https://github.com/radius-project/radius/actions/runs/11303270081/job/31440186010. That run failed in Test_AWSRedeployWithUpdatedResourceUpdatesResource, which also creates an S3 bucket.

radius-triage-bot[bot] commented 1 week ago

:+1: We've reviewed this issue and have agreed to add it to our backlog. Please subscribe to this issue for notifications. We'll provide updates when we pick it up.

We also welcome community contributions! If you would like to pick this item up sooner and submit a pull request, please visit our contribution guidelines and assign this to yourself by commenting "/assign" on this issue.

For more information on our triage process please visit our triage overview
