radius-project / radius

Radius is a cloud-native, portable application platform that makes app development easier for teams building cloud-native apps.
https://radapp.io
Apache License 2.0

Test_AWS_S3Bucket_Existing - Cannot find resource #5963

Open rad-ci-bot opened 1 year ago

rad-ci-bot commented 1 year ago

Bug information

This bug is generated automatically when the scheduled functional test fails. The Radius functional test runs every 4 hours on weekdays and every 12 hours over the weekend. Note that the test may fail because of workflow infrastructure issues, such as network problems, rather than flakiness in the test itself. For further investigation, please visit here.

AB#8828

youngbupark commented 1 year ago

AWS S3 Bucket failure while deleting resources.

cc/ @willdavsmith Aren't we using the existing S3 resource? It said that it could not find the S3 Bucket resource.

2023/07/30 01:08:13 Start streaming Kubernetes logs - Pod ctnr-cmd-args-7ddf957547-r4pcc is in state: Running
    cli.go:382: [rad] Building /usr/local/vss-agent/2.307.1/_work/radius/radius/test/functional/shared/resources/testdata/aws-s3-bucket.bicep...
    cli.go:382: [rad] Deploying template '/usr/local/vss-agent/2.307.1/_work/radius/radius/test/functional/shared/resources/testdata/aws-s3-bucket.bicep' into environment 'kind-radius' from workspace 'kind-radius'...
    cli.go:382: [rad] 
    cli.go:382: [rad] Deployment In Progress...
    cli.go:382: [rad] 
    cli.go:382: [rad] Error: {
    cli.go:382: [rad]   "code": "DeploymentFailed",
    cli.go:382: [rad]   "message": "At least one resource deployment operation failed. Please see the details for the specific operation that failed.",
    cli.go:382: [rad]   "details": [
    cli.go:382: [rad]     {
    cli.go:382: [rad]       "code": "NotFound",
    cli.go:382: [rad]       "message": "Resource /planes/aws/aws/accounts/***/regions/us-west-2/providers/AWS.S3/Bucket with primary identifiers radiusfunctionaltestbucket-b58ec2bf-6beb-4b93-afbb-6d6ec32368ba not found"
    cli.go:382: [rad]     }
    cli.go:382: [rad]   ]
    cli.go:382: [rad] }
    cli.go:382: [rad] 
    cli.go:382: [rad] TraceId:  d028468872aeb814727f5c36fc6e9f82
    cli.go:382: [rad] 
    cli.go:382: [rad] 
    deployexecutor.go:86: 
            Error Trace:    /usr/local/vss-agent/2.307.1/_work/radius/radius/test/step/deployexecutor.go:86
                                        /usr/local/vss-agent/2.307.1/_work/radius/radius/test/functional/shared/rptest.go:263
            Error:          Received unexpected error:
                            code DeploymentFailed: err At least one resource deployment operation failed. Please see the details for the specific operation that failed.
            Test:           Test_AWS_S3Bucket_Existing/deploy_testdata/aws-s3-bucket.bicep
            Messages:       failed to deploy deploy testdata/aws-s3-bucket.bicep
    --- FAIL: Test_AWS_S3Bucket_Existing/deploy_testdata/aws-s3-bucket.bicep (33.23s)

=== FAIL: test/functional/shared/resources Test_AWS_S3Bucket_Existing (88.85s)
    test.go:38: Using container registry: ghcr.io/project-radius/dev - set DOCKER_REGISTRY to override
    test.go:39: Using container tag: pr-3df6d5b241 - set REL_VERSION to override
    test.go:40: Using magpie image: ghcr.io/project-radius/dev/magpiego:pr-3df6d5b241
    test.go:44: Using recipe registry: radiusdev.azurecr.io - set BICEP_RECIPE_REGISTRY to override
    test.go:45: Using recipe tag: pr-3df6d5b241 - set BICEP_RECIPE_TAG_VERSION to override
    test.go:48: Using terraform recipe module server URL: http://tf-module-server.radius-test-tf-module-server.svc.cluster.local/ - set TF_RECIPE_MODULE_SERVER_URL to override
    test.go:58: Loaded workspace: kind-radius (Kubernetes (context=kind-radius))
    rptest.go:309: beginning cleanup phase of radiusfunctionaltestbucket-b58ec2bf-6beb-4b93-afbb-6d6ec32368ba
    rptest.go:318: deleting radiusfunctionaltestbucket-b58ec2bf-6beb-4b93-afbb-6d6ec32368ba
    rptest.go:323: finished deleting radiusfunctionaltestbucket-b58ec2bf-6beb-4b93-afbb-6d6ec32368ba
    rptest.go:328: validating deletion of AWS resource for radiusfunctionaltestbucket-b58ec2bf-6beb-4b93-afbb-6d6ec32368ba (attempt 1/5)
    rptest.go:328: validating deletion of AWS resource for radiusfunctionaltestbucket-b58ec2bf-6beb-4b93-afbb-6d6ec32368ba (attempt 2/5)
    rptest.go:328: validating deletion of AWS resource for radiusfunctionaltestbucket-b58ec2bf-6beb-4b93-afbb-6d6ec32368ba (attempt 3/5)
    rptest.go:328: validating deletion of AWS resource for radiusfunctionaltestbucket-b58ec2bf-6beb-4b93-afbb-6d6ec32368ba (attempt 4/5)
    rptest.go:328: validating deletion of AWS resource for radiusfunctionaltestbucket-b58ec2bf-6beb-4b93-afbb-6d6ec32368ba (attempt 5/5)
    rptest.go:344: 
            Error Trace:    /usr/local/vss-agent/2.307.1/_work/radius/radius/test/functional/shared/rptest.go:344
            Error:          Should be true
            Test:           Test_AWS_S3Bucket_Existing
            Messages:       AWS resource radiusfunctionaltestbucket-b58ec2bf-6beb-4b93-afbb-6d6ec32368ba was present, should be not found

DONE 191 tests, 6 skipped, 2 failures in 610.552s

willdavsmith commented 1 year ago

Going to close this for now; I think this PR will fix it: https://github.com/project-radius/radius/pull/6035

I will keep an eye on these test failures since I'm on-call this week. Will re-open if necessary.

ytimocin commented 8 months ago

Recent examples:

sk593 commented 7 months ago

The tests seem to be failing again after re-enabling AWS tests on this PR: Restore AWS S3 tests (#6993) · radius-project/radius@fcd096e (github.com)

kachawla commented 7 months ago

@willdavsmith do we have any insight into why this issue started happening again more than 6 months after the original fix? Is this a regression?

willdavsmith commented 7 months ago

> @willdavsmith do we have any insight into why this issue started happening again more than 6 months after the original fix? Is this a regression?

Interestingly, this is only failing on the long-running tests, not the scheduled functional tests, so I don't think that it's a regression. I'm looking into it to see what's happening. It's definitely flaky.

kachawla commented 7 months ago

> Interestingly, this is only failing on the long-running tests, not the scheduled functional tests, so I don't think that it's a regression. I'm looking into it to see what's happening. It's definitely flaky.

Thanks @willdavsmith. Removed the year-old triaged level to bubble this back up in the backlog queue.

radius-triage-bot[bot] commented 6 months ago

:+1: We've reviewed this issue and have agreed to add it to our backlog. Please subscribe to this issue for notifications, we'll provide updates when we pick it up.

We also welcome community contributions! If you would like to pick this item up sooner and submit a pull request, please visit our contribution guidelines and assign this to yourself by commenting "/assign" on this issue.

For more information on our triage process, please visit our triage overview.

willdavsmith commented 6 months ago

Are we still seeing this issue? I see the long-running test passing: https://github.com/radius-project/radius/actions/workflows/long-running-azure.yaml

vishwahiremat commented 6 months ago

@willdavsmith we hit this issue again yesterday https://github.com/radius-project/radius/issues/7334

nithyatsu commented 6 months ago

@willdavsmith, we hit the issue with #7343 too. We see the same message, but with Test_Extender_RecipeAWS.

"message": "Resource /planes/aws/aws/accounts/***/regions/us-west-2/providers/AWS.S3/Bucket with primary identifiers radiusfunctionaltestbucket-aadb5529-9bae-4381-a37e-be74a608e831 not found"
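
The error payloads quoted in this thread share one shape: a top-level `code` and `message`, plus a `details` array of nested errors. When triaging several failing runs, it can help to check mechanically whether a `NotFound` detail is present. The Go sketch below does that under stated assumptions: `errorDetails`, `parseDeploymentError`, and `hasCode` are hypothetical names, not types or functions from the Radius codebase, and the sample payload is copied from the first log in this issue.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// errorDetails mirrors the JSON shape seen in the logs in this thread.
// It is a hypothetical type for illustration only.
type errorDetails struct {
	Code    string         `json:"code"`
	Message string         `json:"message"`
	Details []errorDetails `json:"details"`
}

// parseDeploymentError decodes a raw JSON error payload.
func parseDeploymentError(raw string) (errorDetails, error) {
	var e errorDetails
	if err := json.Unmarshal([]byte(raw), &e); err != nil {
		return errorDetails{}, fmt.Errorf("parsing deployment error: %w", err)
	}
	return e, nil
}

// hasCode reports whether e or any nested detail carries the given code.
func hasCode(e errorDetails, code string) bool {
	if e.Code == code {
		return true
	}
	for _, d := range e.Details {
		if hasCode(d, code) {
			return true
		}
	}
	return false
}

// sample is the payload from the first failing run quoted in this issue.
const sample = `{
  "code": "DeploymentFailed",
  "message": "At least one resource deployment operation failed. Please see the details for the specific operation that failed.",
  "details": [
    {
      "code": "NotFound",
      "message": "Resource /planes/aws/aws/accounts/***/regions/us-west-2/providers/AWS.S3/Bucket with primary identifiers radiusfunctionaltestbucket-b58ec2bf-6beb-4b93-afbb-6d6ec32368ba not found"
    }
  ]
}`

func main() {
	e, err := parseDeploymentError(sample)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("top-level code:", e.Code)
	fmt.Println("contains NotFound detail:", hasCode(e, "NotFound"))
}
```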