norshtein opened this issue 6 years ago
Another case: the pipeline in #616 passed, but the pipeline on the master branch failed: https://circleci.com/gh/Azure/open-service-broker-azure/5923?utm_campaign=vcs-integration-link&utm_medium=referral&utm_source=github-build-link
I reproduced this error successfully. The error is caused by an exclusive lock conflict. Below is the error message:
{"code":"PreconditionFailed","message":"There is already an operation in progress which requires exlusive lock on this service 8cd9c40b-02c3-47d0-86d2-141835c04066. Please retry the operation after sometime.\r\nActivityId: 72665989-0961-4815-8f7c-8b7607d131d6, Microsoft.Azure.Documents.Common/2.0.0.0"}
And the call stack of the error:
runtime/debug.Stack(0xc4207b06e8, 0x2, 0x2)
/usr/local/go/src/runtime/debug/stack.go:24 +0xa7
github.com/Azure/open-service-broker-azure/vendor/github.com/Azure/go-autorest/autorest/azure.(*Future).Done(0xc4207b0ad8, 0xc42380, 0xc4207b34d0, 0x0, 0x0, 0x0)
/go/src/github.com/Azure/open-service-broker-azure/vendor/github.com/Azure/go-autorest/autorest/azure/async.go:119 +0x675
github.com/Azure/open-service-broker-azure/vendor/github.com/Azure/azure-sdk-for-go/services/cosmos-db/mgmt/2015-04-08/documentdb.DatabaseAccountsClient.DeleteSender(0xc412a0, 0xc42044cd60, 0xc41840, 0xc42037efc0, 0x0, 0x0, 0xdf8475800, 0x1a3185c5000, 0x3, 0x6fc23ac00, ...)
/go/src/github.com/Azure/open-service-broker-azure/vendor/github.com/Azure/azure-sdk-for-go/services/cosmos-db/mgmt/2015-04-08/documentdb/databaseaccounts.go:279 +0x2d5
github.com/Azure/open-service-broker-azure/vendor/github.com/Azure/azure-sdk-for-go/services/cosmos-db/mgmt/2015-04-08/documentdb.DatabaseAccountsClient.Delete(0xc412a0, 0xc42044cd60, 0xc41840, 0xc42037efc0, 0x0, 0x0, 0xdf8475800, 0x1a3185c5000, 0x3, 0x6fc23ac00, ...)
/go/src/github.com/Azure/open-service-broker-azure/vendor/github.com/Azure/azure-sdk-for-go/services/cosmos-db/mgmt/2015-04-08/documentdb/databaseaccounts.go:240 +0x67b
github.com/Azure/open-service-broker-azure/pkg/services/cosmosdb.deleteCosmosDBAccount(0xc46880, 0xc4200657c0, 0xc412a0, 0xc42044cd60, 0xc41840, 0xc42037efc0, 0x0, 0x0, 0xdf8475800, 0x1a3185c5000, ...)
/go/src/github.com/Azure/open-service-broker-azure/pkg/services/cosmosdb/common_deprovision.go:37 +0x1b4
github.com/Azure/open-service-broker-azure/pkg/services/cosmosdb.(*cosmosAccountManager).deleteCosmosDBAccount(0xc420402420, 0xc46880, 0xc420065780, 0x0, 0x0, 0x0, 0x0, 0xbc51b9, 0x24, 0xc4c0c0, ...)
/go/src/github.com/Azure/open-service-broker-azure/pkg/services/cosmosdb/common_deprovision.go:87 +0xe8
github.com/Azure/open-service-broker-azure/pkg/services/cosmosdb.(*cosmosAccountManager).(github.com/Azure/open-service-broker-azure/pkg/services/cosmosdb.deleteCosmosDBAccount)-fm(0xc46880, 0xc420065780, 0x0, 0x0, 0x0, 0x0, 0xbc51b9, 0x24, 0xc4c0c0, 0xc42054fc20, ...)
/go/src/github.com/Azure/open-service-broker-azure/pkg/services/cosmosdb/common_deprovision.go:64 +0x79
github.com/Azure/open-service-broker-azure/pkg/service.(*deprovisioningStep).Execute(0xc42044aaa0, 0xc46880, 0xc420065780, 0x0, 0x0, 0x0, 0x0, 0xbc51b9, 0x24, 0xc4c0c0, ...)
/go/src/github.com/Azure/open-service-broker-azure/pkg/service/deprovisioner.go:67 +0xe5
github.com/Azure/open-service-broker-azure/tests/lifecycle.serviceLifecycleTestCase.execute(0xbaef20, 0x8, 0xbb90a8, 0x14, 0xbc51b9, 0x24, 0xbc5345, 0x24, 0xc420087dd0, 0xc420087e00, ...)
/go/src/github.com/Azure/open-service-broker-azure/tests/lifecycle/test_case_test.go:315 +0x136e
github.com/Azure/open-service-broker-azure/tests/lifecycle.TestServices.func1.1(0xc4203cc2d0)
/go/src/github.com/Azure/open-service-broker-azure/tests/lifecycle/driver_test.go:45 +0xe2
testing.tRunner(0xc4203cc2d0, 0xc420392140)
/usr/local/go/src/testing/testing.go:777 +0xd0
created by testing.(*T).Run
/usr/local/go/src/testing/testing.go:824 +0x2e0
I have no idea why this error happened. I think this might be an issue in the internal API. I'll request help from the Azure Cosmos team.
Below is my guess at the cause: when running a lifecycle test case, the steps are executed back to back. It is possible that a previous step has completed but the exclusive lock has not yet been released, and a later step then requires the exclusive lock, which causes the error. If my guess is correct, a possible temporary fix is to sleep for several seconds (or retry) before running the deprovisioning step; see the sketch below.
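If that guess is right, a stop-gap could be to pause and retry the deprovisioning step whenever the lock conflict shows up. Here is a minimal, hypothetical sketch of such a retry wrapper; the helper name and the wrapped call are made up for illustration and are not part of the broker code:

```go
package main

import (
	"fmt"
	"strings"
	"time"
)

// retryOnExclusiveLock is a hypothetical helper (not part of the broker code)
// that re-runs an operation when Cosmos DB rejects it with the exclusive-lock
// PreconditionFailed error seen above, waiting between attempts.
func retryOnExclusiveLock(op func() error, attempts int, delay time.Duration) error {
	var err error
	for i := 0; i < attempts; i++ {
		if err = op(); err == nil {
			return nil
		}
		// The service replies with "PreconditionFailed" and a message about an
		// exclusive lock (misspelled "exlusive" in the actual response), so a
		// crude substring check is enough to decide whether to retry.
		msg := err.Error()
		if !strings.Contains(msg, "PreconditionFailed") && !strings.Contains(msg, "lock") {
			return err // unrelated failure, don't retry
		}
		time.Sleep(delay)
	}
	return fmt.Errorf("still blocked by exclusive lock after %d attempts: %v", attempts, err)
}

func main() {
	// Usage sketch: wrap whatever the deprovisioning step calls, e.g. the
	// deleteCosmosDBAccount step from common_deprovision.go (stubbed out here).
	err := retryOnExclusiveLock(func() error {
		// return deleteCosmosDBAccount(ctx, databaseAccountsClient, resourceGroup, accountName)
		return nil
	}, 5, 30*time.Second)
	fmt.Println("deprovision result:", err)
}
```

This only papers over the race between steps; the real fix depends on what the Cosmos team says about when the lock is actually released.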
IcM ticket submitted: https://icm.ad.msft.net/imp/v3/incidents/details/88035437/home
In the latest merged PR, an enhancement to the storage module, all CI checks passed: https://github.com/Azure/open-service-broker-azure/pull/612. But when this PR was merged into master, the pipeline was triggered again and one lifecycle test case for cosmosdb failed, even though this PR has nothing to do with cosmosdb. See https://circleci.com/gh/Azure/open-service-broker-azure/5892?utm_campaign=vcs-integration-link&utm_medium=referral&utm_source=github-build-link