Azure / Azure-Orbital-Analytics-Samples

Sample solution that demonstrates how to deploy and analyze spaceborne data using Azure Synapse Analytics
https://aka.ms/synapse-geospatial-analytics
MIT License

Setup.sh - Storage account name already taken prevents project setup #74

Closed ivo-andreev closed 2 years ago

ivo-andreev commented 2 years ago

When setting up the solution with setup.sh, the deployment fails with a storage account name conflict. The error message is included below.

{"status":"Failed","error":{"code":"DeploymentFailed","message":"At least one resource deployment operation failed. Please list deployment operations for details. Please see https://aka.ms/DeployOperations for usage details.","details":[{"code":"Conflict","message":"{\r\n \"status\": \"Failed\",\r\n \"error\": {\r\n \"code\": \"ResourceDeploymentFailure\",\r\n \"message\": \"The resource operation completed with terminal provisioning state 'Failed'.\",\r\n \"details\": [\r\n {\r\n \"code\": \"DeploymentFailed\",\r\n \"message\": \"At least one resource deployment operation failed. Please list deployment operations for details. Please see https://aka.ms/DeployOperations for usage details.\",\r\n \"details\": [\r\n {\r\n \"code\": \"Conflict\",\r\n \"message\": \"{\r\n \\"status\\": \\"Failed\\",\r\n \\"error\\": {\r\n \\"code\\": \\"ResourceDeploymentFailure\\",\r\n \\"message\\": \\"The resource operation completed with terminal provisioning state 'Failed'.\\",\r\n \\"details\\": [\r\n {\r\n \\"code\\": \\"DeploymentFailed\\",\r\n \\"message\\": \\"At least one resource deployment operation failed. Please list deployment operations for details. Please see https://aka.ms/DeployOperations for usage details.\\",\r\n \\"details\\": [\r\n {\r\n \\"code\\": \\"Conflict\\",\r\n \\"message\\": \\"{\\r\\n \\\\"status\\\\": \\\\"Failed\\\\",\\r\\n \\\\"error\\\\": {\\r\\n \\\\"code\\\\": \\\\"ResourceDeploymentFailure\\\\",\\r\\n \\\\"message\\\\": \\\\"The resource operation completed with terminal provisioning state 'Failed'.\\\\",\\r\\n \\\\"details\\\\": [\\r\\n {\\r\\n \\\\"code\\\\": \\\\"DeploymentFailed\\\\",\\r\\n \\\\"message\\\\": \\\\"At least one resource deployment operation failed. Please list deployment operations for details. 
Please see https://aka.ms/DeployOperations for usage details.\\\\",\\r\\n \\\\"details\\\\": [\\r\\n {\\r\\n \\\\"code\\\\": \\\\"Conflict\\\\",\\r\\n \\\\"message\\\\": \\\\"{\\\\r\\\\n \\\\\\\\"error\\\\\\\\": {\\\\r\\\\n \\\\\\\\"code\\\\\\\\": \\\\\\\\"StorageAccountAlreadyTaken\\\\\\\\",\\\\r\\\\n \\\\\\\\"message\\\\\\\\": \\\\\\\\"The storage account named synhnsrqjhqd is already taken.\\\\\\\\"\\\\r\\\\n }\\\\r\\\\n}\\\\"\\r\\n }\\r\\n ]\\r\\n }\\r\\n ]\\r\\n }\\r\\n}\\"\r\n }\r\n ]\r\n }\r\n ]\r\n }\r\n}\"\r\n }\r\n ]\r\n }\r\n ]\r\n }\r\n}"}]}}

ivo-andreev commented 2 years ago

After some investigation of the .bicep files, I found the following in data.bicep, which makes me think that the data resource group name should be generated in a more unique manner:

var dataResourceGroupNameVar = empty(dataResourceGroupName) ? '${namingPrefix}-rg' : dataResourceGroupName
var nameSuffix = substring(uniqueString(dataResourceGroupNameVar), 0, 6)
var keyvaultNameVar = empty(keyvaultName) ? '${namingPrefix}-kv' : keyvaultName
var rawDataStorageAccountNameVar = empty(rawDataStorageAccountName) ? 'rawdata${nameSuffix}' : rawDataStorageAccountName

sjyang18 commented 2 years ago

Looking at your error, I am curious about how the 'synhnsrqjhqd' storage account is referenced. Did Bicep create and reference it when you re-ran? Do you see the storage account in your data resource group? What resources do you see in the data resource group? What makes you think the resource group name is the root cause of this issue when the error complains about an existing storage account name? We expect the first parameter to be unique so that unique resource names are derived from it.

Could you describe how we can reproduce your issue on our end so that we can help you? Thank you.

ivo-andreev commented 2 years ago

Check https://github.com/Azure/Azure-Orbital-Analytics-Samples/blob/main/deploy/README.md

The code above seems to use it to generate a hash of dataResourceGroupNameVar and keep the first 6 characters, so if I happen to select the same value as someone else, the generated names are identical and this leads to that duplication conflict.

In main.bicep (line 50) it seems to be initialized as:

var dataResourceGroupName = '${environmentCode}-${dataModulePrefix}-rg'

In my case I used "orb" as environmentCode. Afterwards I changed it to "orbivo" - including my name in it gave me more uniqueness. In short, the generated names are not sufficiently unique, and I spent about 1h reading through the bicep files.

The key point is that you assume certain resource names will be globally unique, which with the GA of Azure Orbital will lead to more and more naming conflicts.
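To illustrate the collision described above, here is a minimal Python sketch. It is not the project's code: SHA-256 is an assumed stand-in for Bicep's uniqueString (whose exact algorithm is not reproduced here), and the resource group pattern, the function names, and the subscription identifiers are hypothetical. The only property that matters is that the suffix is a deterministic function of its seed.

```python
import hashlib


def name_suffix(seed: str, length: int = 6) -> str:
    """Stand-in for substring(uniqueString(seed), 0, length).

    ASSUMPTION: SHA-256 replaces Bicep's uniqueString; only the
    determinism of the mapping is being illustrated.
    """
    return hashlib.sha256(seed.encode("utf-8")).hexdigest()[:length]


def raw_data_account_name(env_code: str, subscription_id: str = "") -> str:
    # Hypothetical naming scheme modelled on the snippet quoted in the thread:
    # a fixed prefix plus a suffix derived from the resource group name.
    resource_group = f"{env_code}-data-rg"
    return "rawdata" + name_suffix(subscription_id + resource_group)


# Two unrelated users who both pick "orb" derive the same global name.
alice = raw_data_account_name("orb")
bob = raw_data_account_name("orb")
print(alice == bob)  # True

# Mixing a per-subscription value into the seed (hypothetical IDs)
# gives each user a different seed and, in practice, a different name.
alice_scoped = raw_data_account_name("orb", subscription_id="sub-alice")
bob_scoped = raw_data_account_name("orb", subscription_id="sub-bob")
print(alice_scoped, bob_scoped)
```

The first comparison is the failure mode: the name depends only on the environment code, so anyone choosing the same code competes for the same globally scoped storage account name.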

ivo-andreev commented 2 years ago

Two more things (intentionally in a separate comment), but I believe the resolution approach could be similar.

1) All the instructions said to use the WestUS region for the Azure Orbital setup, so I did that here too. However, I am based in Europe and my subscription was not allowed to create SQL Server resources in WestUS regions. This is probably a legal restriction I was not aware of, but it is a fact. That led to additional deployment failures, caused by SQL Server and not by errors in this project's deployment approach.

2) The failure above made me try another region, which eventually worked but failed the first time due to a Synapse deployment duplication. As I left all other parameters the same, except for the region, the Synapse deployment failed because the deployment name was a duplicate.

In both cases above, a suffix generated from the current date should be helpful.

Suggestion - use the utcNow function to generate a suffix and make the naming more unique. Challenge - this may severely affect the cleanup, as some of the names will not be predictable and the user would have to provide an extra parameter from which the suffix could be extracted.

I expect the issue above to be encountered by people who predictably set the environment code to things like "orbit" or "orbital".

sjyang18 commented 2 years ago

We don't know what resource quotas and regional resource constraints a customer's subscription/product has, so we expect customers to choose a suitable region accordingly. The instructions shown in the Readme are examples.

Using the utcNow function to create resource group and resource names is not a good idea, either. It would break the idempotency requirement: rerunning the template with the same parameters would create new resource groups and resources instead of updating the existing infrastructure.
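The idempotency concern can be sketched in Python as follows. This is an illustration under stated assumptions, not the template's logic: SHA-256 stands in for Bicep's uniqueString, the function names and the resource group pattern are hypothetical, and the timestamps are made up for the example.

```python
import hashlib
from datetime import datetime, timezone


def hashed_group_name(env_code: str) -> str:
    # ASSUMPTION: SHA-256 stands in for Bicep's uniqueString.
    suffix = hashlib.sha256(env_code.encode("utf-8")).hexdigest()[:6]
    return f"{env_code}-{suffix}-rg"


def timestamped_group_name(env_code: str, now: datetime) -> str:
    # Roughly what a suffix derived from the current UTC time would look like.
    return f"{env_code}-{now.strftime('%Y%m%d%H%M%S')}-rg"


# Hash-based: a rerun with the same parameter targets the same name,
# so the deployment updates what already exists.
print(hashed_group_name("orb") == hashed_group_name("orb"))  # True

# Time-based: a rerun one minute later targets a different name,
# so a second set of resources would be created.
first_run = datetime(2022, 1, 1, 12, 0, 0, tzinfo=timezone.utc)   # illustrative
second_run = datetime(2022, 1, 1, 12, 1, 0, tzinfo=timezone.utc)  # illustrative
print(timestamped_group_name("orb", first_run))   # orb-20220101120000-rg
print(timestamped_group_name("orb", second_run))  # orb-20220101120100-rg
```

A time-derived name also has to be recorded somewhere for later cleanup, which is the challenge raised earlier in the thread.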

Going back to my original question, you pass orb initially and you get 'synhnsrqjhqd' as a storage account name, and it failed?

sjyang18 commented 2 years ago

I see the synhns storage account in *-pipeline-rg. It was not from *-data-rg.

ivo-andreev commented 2 years ago

Going back to my original question, you pass orb initially and you get 'synhnsrqjhqd' as a storage account name, and it failed?

I believe exactly that was the cause, and renaming from "orb" to "orbivo" was enough to solve it, although (as said before) I had to read through a bunch of bicep files to find out what to do.

As said above, the example with a different reason for failure was just how I ran into another name duplication, namely in Synapse (and the idempotency did not help there at all).

sjyang18 commented 2 years ago

We expect the first parameter (evcode) to be unique in the subscription. If you want to use the same evcode in another region, drop the existing environment with cleanup.sh and rerun in the new region.

sjyang18 commented 2 years ago

You may try in pipeline.bicep:

var synapseHnsStorageAccountNameVar = empty(synapseHnsStorageAccountName) ? 'synhns${substring(uniqueString('${synapseResourceGroupNameVar}${location}'), 0, 16)}' : synapseHnsStorageAccountName

ivo-andreev commented 2 years ago

I may not have been completely clear. Finally, after 4h, my deployment went through. I still do not know whether everything works, but my point was to share my experience in case you find any of that input useful. For now I am wary of running another deployment so as not to break something. My next aim is to finally see it in action.

As far as I can see, you have not pushed the changes to pipeline.bicep yet.

ivo-andreev commented 2 years ago

We expect the first parameter (evcode) to be unique in the subscription. If you want to use the same evcode in another region, drop the existing environment with cleanup.sh and rerun in the new region.

It did not occur to me to run the cleanup to solve the Synapse duplicate name issue. As mentioned above, the downside is that I won't try that now that the deployment has gone through.

Please note: envcode was unique in the subscription, but it seems to have a global effect (as it failed from the first try using "orb").

sjyang18 commented 2 years ago

Let's go over what happened so we can reproduce your issue.

You ran

setup.sh orb westus orbital custom-vision-model-v2

And you encountered problems with SQL Server resources.

Then you picked another region, westus2, and used the same evcode.

setup.sh orb westus2 orbital custom-vision-model-v2

This time there was a problem with a duplicate storage account name.

Then you reran the command with a new evcode and it went through.

setup.sh orbivo westus2 orbital custom-vision-model-v2

I just want to understand and confirm why we have the duplicate storage account name issue.

ivo-andreev commented 2 years ago

Here it is once again:

  1. The first problem was with the "orb" - duplicate name (see the first post on the issue)
  2. I deleted all resource groups with the prefix
  3. I renamed orb to orbivo - SQL deployment failure due to region (no more duplicate name of the storage account)
  4. I deleted all resource groups with the prefix
  5. I changed the region from westus to westeurope - synapse duplicate name error (no more SQL deployment problems, but synapse already had a global deployment with that name)
  6. I deleted all resource groups with the prefix
  7. I changed orbivo to orbivowe - the deployment went through (no longer synapse name conflicts)

sjyang18 commented 2 years ago

When you deleted the resource groups, did you use cleanup.sh? In my experience, if there is a deployment failure due to a regional resource restriction, you run cleanup.sh and then run in another region with a new evcode. I usually add or increment a number in the evcode. This is because even if you delete resources, Azure still needs some time (more than 1 hour) to clean them up completely. Hope this tip helps you understand why we have a storage account name/synapse name error.

ivo-andreev commented 2 years ago

Hope this tip helps you understand why we have a storage account name/synapse name error.

That may be valid for Synapse, but not for the storage account.

The storage account issue occurred on my very first run, and there was nothing to clean up. I just hit a valid name generated for another user. Regardless of whether I clean up or not, another user is highly likely to have that name already in use. The chance gets higher as more users give Orbital Analytics a try (hopefully that is among the objectives). (It seems you have addressed that in the proposed bicep change, but it is not merged to the branch.)

I did not use cleanup, which of course could have caused the secondary issue with Synapse. Frankly, when looking in the resource groups, I did not have any Synapse Analytics workspace with the name above, so I have no idea how a conflict could appear, but let us assume an incomplete deployment may have led to something that was not observable in the portal. I deleted all resource groups and did not wait long, but as you say, the behaviour may differ between services and maybe 1h is necessary for Synapse. My particular errors are below:

{"code": "InvalidDeploymentLocation", "message": "Invalid deployment location 'westeurope'. The deployment 'SYNAPSE-ORBIVO-DEPLOY' already exists in location 'westus2'."}

{"code": "InvalidDeploymentLocation", "message": "Invalid deployment location 'westus'. The deployment 'SYNAPSE-ORBIVO-DEPLOY' already exists in location 'westus2'."}

In the end, I hope this thread helps someone who encounters a similar issue.

sjyang18 commented 2 years ago

I am working on your input and going thru regression. The plan is to use this nameSuffix:

var nameSuffix = substring(uniqueString(guid('${subscription().subscriptionId}${namingPrefix}${environmentTag}${location}')), 0, 10)

ivo-andreev commented 2 years ago

I am working on your input and going thru regression.

Thanks, much appreciated :) I am about to continue with the next steps and really hope it goes smoothly.

sjyang18 commented 2 years ago

@ivo-andreev Would you review the PR and vote? Thanks

ivo-andreev commented 2 years ago

I reviewed the comments. Do I need to close the unresolved conversations? I do not seem to be able to.