microsoft / AzureTRE

An accelerator to help organizations build Trusted Research Environments on Azure.
https://microsoft.github.io/AzureTRE
MIT License
185 stars 145 forks

Workspace creation blocked due to Azure API deprecation. #4095

Closed TonyWildish-BH closed 2 months ago

TonyWildish-BH commented 2 months ago

Describe the bug

I created a workspace for a user on September 10th. The user was unable to allocate time to use it until this week, so I disabled the workspace on the 10th, to reduce costs a bit.

Today, I tried to re-enable that workspace, and it fails. The message in the Operations tab is from Terraform, about being unable to register a Resource Provider:

Error message: Error ensuring Resource Providers are registered.
Terraform automatically attempts to register the Resource Providers it supports to
ensure it's able to provision resources.
If you don't have permission to register Resource Providers you may wish to use the
"skip_provider_registration" flag in the Provider block to disable this functionality.

Please note that if you opt out of Resource Provider Registration and Terraform tries
to provision a resource from a Resource Provider which is unregistered, then the errors
may appear misleading - for example:

> API version 2019-XX-XX was not found for Microsoft.Foo

Could indicate either that the Resource Provider "Microsoft.Foo" requires registration,
but this could also indicate that this Azure Region doesn't support this API version.

More information on the "skip_provider_registration" flag can be found here
https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs#skip_provider_registration

Original Error: determining which Required Resource Providers require registration: the required Resource Provider "Microsoft.TimeSeriesInsights" wasn't returned from the Azure API

with provider["registry.terraform.io/hashicorp/azurerm"],
on providers.tf line 20, in provider "azurerm":
20: provider "azurerm"

error running command /cnab/app/terraform /usr/bin/terraform apply -auto-approve -input=false -var

Steps to reproduce

  1. Create a workspace
  2. Disable it
  3. Wait two weeks (not sure how important that part is)
  4. (Try to) re-enable the workspace

Azure TRE release version (e.g. v0.14.0 or main): main, as of August 12th.

Deployed Azure TRE components - click the (i) in the UI: UI Version: 0.5.27 API Version: 0.18.11

TonyWildish-BH commented 2 months ago

In fact, it's worse than that. I just tried creating a new workspace, and I get the same issue. If anyone's seen this before and knows what to do, I'd appreciate some help.

marrobi commented 2 months ago

Can you check that the provider Microsoft.TimeSeriesInsights is registered in your subscription, and which API versions are available?

https://learn.microsoft.com/en-us/azure/azure-resource-manager/management/resource-providers-and-types#register-resource-provider
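
The check described above can be scripted with the Azure CLI. This is a minimal sketch, assuming `az` is installed and logged in to the affected subscription; the helper name `provider_state` is made up for illustration and is not part of Azure TRE.

```shell
# Print the registration state of a resource provider namespace in the
# currently selected subscription, or "NotFound" when Azure does not
# return the namespace at all.
# Assumption: the Azure CLI (`az`) is installed and authenticated.
provider_state() {
  local namespace="$1"
  local state
  if state="$(az provider show --namespace "$namespace" \
      --query registrationState --output tsv 2>/dev/null)" && [ -n "$state" ]; then
    printf '%s\n' "$state"
  else
    printf 'NotFound\n'
  fi
}
```

Calling `provider_state Microsoft.TimeSeriesInsights` prints the state reported by Azure, or `NotFound` if the lookup fails. To see the API versions a provider offers, `az provider show --namespace <namespace> --query "resourceTypes[].apiVersions[]" --output tsv` lists them across its resource types.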

TonyWildish-BH commented 2 months ago

Microsoft.TimeSeriesInsights is not visible in the list of resource providers, whether registered or not.

marrobi commented 2 months ago

See: https://github.com/hashicorp/terraform-provider-azurerm/issues/27466

Can you try updating the provider version please.

jonnyry commented 2 months ago

Hi @TonyWildish-BH @marrobi

Yes I've just seen this when creating a new workspace

RPal111 commented 2 months ago

Hi @TonyWildish-BH

The error means that Terraform is trying to use a resource provider called Microsoft.TimeSeriesInsights, but it isn't registered in your Azure subscription. Terraform usually registers resource providers automatically, but sometimes this fails because of permission issues or other problems.

See: https://learn.microsoft.com/en-us/azure/azure-resource-manager/troubleshooting/error-register-resource-provider

Danny-Cooke-CK commented 2 months ago

Hi Tony. We've also hit the same issue today. We tried to deploy a new TRE and the pipeline failed with the same error. I have raised a ticket with MS, as the provider has disappeared and I can't re-enable it.

│ Error: Error ensuring Resource Providers are registered.
│
│ Terraform automatically attempts to register the Resource Providers it supports to
│ ensure it's able to provision resources.
│
│ If you don't have permission to register Resource Providers you may wish to use the
│ "skip_provider_registration" flag in the Provider block to disable this functionality.
│
│ Please note that if you opt out of Resource Provider Registration and Terraform tries
│ to provision a resource from a Resource Provider which is unregistered, then the errors
│ may appear misleading - for example:
│
│ > API version 2019-XX-XX was not found for Microsoft.Foo
│
│ Could indicate either that the Resource Provider "Microsoft.Foo" requires registration,
│ but this could also indicate that this Azure Region doesn't support this API version.
│
│ More information on the "skip_provider_registration" flag can be found here:
│ https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs#skip_provider_registration
│
│ Original Error: determining which Required Resource Providers require registration: the required Resource Provider "Microsoft.TimeSeriesInsights" wasn't returned from the Azure API
│
│ with provider["registry.terraform.io/hashicorp/azurerm"],
│ on main.tf line 33, in provider "azurerm":
│ 33: provider "azurerm"
│
╵
Releasing state lock. This may take a few moments...
make: [/home/vscode/AzureTRE/Makefile:110: deploy-core] Error 1

Danny-Cooke-CK commented 2 months ago

https://azure.microsoft.com/en-us/updates/we-re-retiring-azure-time-series-insights-on-7-july-2024-transition-to-azure-data-explorer/#:~:text=Microsoft%27s%20leadership%20emphasizing%20priority%20on%20security%20has%20decided,Insights%20will%20stop%20functioning%20and%20will%20be%20inaccessible

jonnyry commented 2 months ago

Upping the azurerm provider version in the base workspace appears to work (as evidenced by other people in https://github.com/hashicorp/terraform-provider-azurerm/issues/27466). See fix here: https://github.com/nwsde/nwsde-azuretre/tree/jr/upstream-main/65-update-tf-providers

marrobi commented 2 months ago

@jonnyry the OpenAI PR requires a core update. So make deploy-core at minimum.

jonnyry commented 2 months ago

> @jonnyry the OpenAI PR requires a core update. So make deploy-core at minimum.

Ah yes - I've removed that from my comment as I realised I was deploying the latest base workspace against a slightly older TRE. Just re-testing now.

JaimieWi commented 2 months ago

Could this change be applied across all bundles with the older versions? The error is also seen on user resource and workspace service deployments.

Tests done: Deploying Guacamole workspace service or any new VMs fails with the same error.

TonyWildish-BH commented 2 months ago

There's quite a range of values specified in various places:

> grep --after-context=2 -h 'azurerm =' $(find core templates -name providers.tf) | grep version | awk -F\" '{ print $2 }' | sort | uniq
3.37.0
=3.108.0
=3.37.0
=3.40.0
=3.53.0
=3.57.0
=3.58.0
=3.73.0
>= 3.8.0
>=3.33.0
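
A per-file variant of the pipeline above makes it easier to see which template carries which pin. This is a sketch only: `list_azurerm_pins` is a hypothetical helper, and it assumes each providers.tf declares `version` within two lines of the `azurerm =` line, the same window the grep above uses.

```shell
# Print "<constraint><TAB><file>" for every providers.tf found under the
# given directories. Only the azurerm entry is inspected: the matching
# line plus the two lines that follow it.
list_azurerm_pins() {
  find "$@" -name providers.tf -type f | sort | while IFS= read -r file; do
    awk -v file="$file" '
      /azurerm[ \t]*=/ { window = 3 }        # start of the azurerm entry
      window > 0 {
        if ($0 ~ /version[ \t]*=/) {
          split($0, parts, "\"")             # constraint is the first quoted string
          printf "%s\t%s\n", parts[2], file
          window = 0
        } else {
          window--
        }
      }
    ' "$file"
  done
}
```

Run from the repository root as `list_azurerm_pins core templates`. Piping the result through `cut -f1 | sort -u` should give roughly the same list of distinct constraints as the original pipeline.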

jonnyry commented 2 months ago

Also the PR above (https://github.com/microsoft/AzureTRE/pull/4096) is to the base workspace only.

airlock-import-review and unrestricted workspaces both pull a pinned version of the base workspace code from GitHub and so will use an older version, until a new Azure TRE release is created, and the version number is updated here:

https://github.com/microsoft/AzureTRE/blob/7dd1915ca4a350ebc6c34282078965a127d058b0/templates/workspaces/airlock-import-review/Dockerfile.tmpl#L12

https://github.com/microsoft/AzureTRE/blob/7dd1915ca4a350ebc6c34282078965a127d058b0/templates/workspaces/unrestricted/Dockerfile.tmpl#L12

TonyWildish-BH commented 2 months ago

I'm still getting an error message after bumping the provider to 3.108.0 as suggested. I can confirm that the template was rebuilt, and that I'm using the new template. Then I get this...

Error message: 
parsing "/subscriptions/*******/resourceGroups/rg-XXXX-ws-XXXX/providers/microsoft.insights/components/appi-sdebeta-ws-0a84": parsing segment "staticMicrosoftInsights": parsing the Component ID: the segment at position 5 didn't match 
Expected a Component ID that matched: 
> /subscriptions/12345678-1234-9876-4563-123456789012/resourceGroups/example-resource-group/providers/Microsoft.Insights/components/componentValue 
However this value was provided: 
> /subscriptions/*******/resourceGroups/rg-XXXX-ws-XXXX/providers/microsoft.insights/components/appi-sdebeta-ws-0a84 
The parsed Resource ID was missing a value for the segment at position 5 
(which should be the name of the Resource Provider [for example 'Microsoft.Insights']

AFAICT, the provided value matches the expected expression, so I don't see what the problem is.

jonnyry commented 2 months ago

@tim-allen-ck #4096 fixes the Terraform issue for the base workspace.

But as @JaimieWi mentions above, the issue exists on other templates since each pins its own version of the azurerm provider, e.g. Guacamole:

image

Danny-Cooke-CK commented 2 months ago

Regarding this case. All the providers need updating and then the lock files where appropriate. Working on it now as a priority

TonyWildish-BH commented 2 months ago

@Danny-Cooke-CK, thanks for prioritising this. When you test the fix, will you be testing only fresh installations, or will you test upgrades to existing installations?

The new error message I'm getting probably won't show up on a fresh install, but we need to upgrade. According to HashiCorp, the problem is linked to strict case-sensitive matching where there used to be case-insensitive matching.

marrobi commented 2 months ago

@TonyWildish-BH the case issue occurs on fresh installs too. @jonnyry accounted for it in his PR to fix workspaces - https://github.com/nwsde/nwsde-azuretre/blob/e4d9a07a040079639903baeee03b7aadf9283c36/templates/workspaces/base/terraform/azure-monitor/azure-monitor.tf#L139
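
To make the mismatch concrete: the only difference between the rejected ID and the expected pattern in the error above is the casing of the provider segment. The sketch below rewrites that one segment in a resource ID string. It illustrates the idea only; the linked fix lives in Terraform and may be implemented differently, and `normalise_insights_id` is a made-up name.

```shell
# Rewrite a lower-case "microsoft.insights" provider segment to the
# casing the newer azurerm provider expects. Other segments are untouched,
# and IDs that already use "Microsoft.Insights" pass through unchanged.
normalise_insights_id() {
  printf '%s\n' "$1" |
    sed 's#/providers/microsoft\.insights/#/providers/Microsoft.Insights/#'
}
```

Passing an ID of the form `/subscriptions/.../providers/microsoft.insights/components/<name>` returns the same ID with `Microsoft.Insights` as the provider segment.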

TonyWildish-BH commented 2 months ago

Thanks @marrobi, I'd missed that line, and that does fix it for me.

jonnyry commented 2 months ago

> Regarding this case. All the providers need updating and then the lock files where appropriate. Working on it now as a priority

@Danny-Cooke-CK thanks. Note that the unrestricted and airlock-import-review workspace templates have the same problem, but the fix is different - see https://github.com/microsoft/AzureTRE/issues/4095#issuecomment-2371464915 above.

Danny-Cooke-CK commented 2 months ago

> Regarding this case. All the providers need updating and then the lock files where appropriate. Working on it now as a priority
>
> @Danny-Cooke-CK thanks. Also note the unrestricted and airlock-import-review workspace templates also have the same problem, but the fix is different - see #4095 (comment) above.

Thanks @jonnyry, I was just about to create a PR and saw this comment! Cheers, I'll make that update too.

Danny-Cooke-CK commented 2 months ago

PR raised to fix this: https://github.com/microsoft/AzureTRE/pull/4097

TonyWildish-BH commented 2 months ago

@Danny-Cooke-CK, thanks for the fast work.

I see the PR completely removes config.sample.yaml, is that intentional?

[Screenshot 2024-09-26 at 8 43 49 AM]

Danny-Cooke-CK commented 2 months ago

> @Danny-Cooke-CK, thanks for the fast work.
>
> I see the PR completely removes config.sample.yaml, is that intentional?
>
> [Screenshot 2024-09-26 at 8 43 49 AM]

Thanks Tony, well spotted.

marrobi commented 2 months ago

@TonyWildish-BH you seem to have linked a PR to close this issue from your fork of the project?

TonyWildish-BH commented 2 months ago

@Danny-Cooke-CK, just a question since you're on it:

I see that the version specs still vary: 3.112.0, =3.112.0, >=3.112.0. Is it feasible to have them all identical, or will that break something, now or in the future? It would be nice for them to be consistent if that's possible.
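
One relevant detail: Terraform treats a bare version such as `3.112.0` as an exact match, the same as `=3.112.0`, so the three spellings above express two distinct meanings (an exact pin and a minimum). The sketch below collapses equivalent spellings so that only real differences remain. The helper names are hypothetical, and it handles single constraints only, not comma-separated lists.

```shell
# Normalise one version constraint: strip whitespace, then add the
# implicit "=" when the constraint starts with a digit (no operator).
normalise_constraint() {
  printf '%s\n' "$1" | tr -d ' \t' | sed 's/^\([0-9]\)/=\1/'
}

# Read constraints from stdin, one per line, and print the distinct
# normalised forms.
distinct_constraints() {
  while IFS= read -r constraint; do
    normalise_constraint "$constraint"
  done | sort -u
}
```

Feeding `3.112.0`, `=3.112.0` and `>=3.112.0` into `distinct_constraints` leaves two entries, which shows that only the `>=` form actually behaves differently from the others.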

TonyWildish-BH commented 2 months ago

> @TonyWildish-BH you seem to have linked a PR to close this issue from your fork of the project?

Not intentionally, that's just a partial fix for the bits I care about right now. Updated to remove the link, thanks for calling that out.