microsoft / terraform-provider-azuredevops

Terraform Azure DevOps provider
https://www.terraform.io/docs/providers/azuredevops/
MIT License

Creating azuredevops_group.ad_group results in HTTP 503 #298

Open jamescross91 opened 3 years ago

jamescross91 commented 3 years ago

Terraform (and Azure DevOps Provider) Version

Terraform 0.13.5, Azure DevOps Provider version 0.1.2

Affected Resource(s)

azuredevops_group

Terraform Configuration Files

terraform {
  required_providers {
    azuredevops = {
      source = "microsoft/azuredevops"
      version = "0.1.2"
    }
  }
}

###########################################################
# Azure DevOps Permissions
###########################################################
data "azuredevops_group" "admin_group" {
  project_id = azuredevops_project.project.id
  name       = "Project Administrators"
}

resource "azuredevops_group" "ad_group" {
  origin_id  = var.analytics_ad_group_id
}

resource "azuredevops_group_membership" "admin_group" {
  group = data.azuredevops_group.admin_group.descriptor
  members = [
    azuredevops_group.ad_group.descriptor
  ]
}

Expected Behavior

Resource is created

Actual Behavior

Running terraform apply tfplan
2021-02-09 16:43:47,034 - [INFO] - module.devops.azuredevops_group.ad_group: Creating...
2021-02-09 16:43:51,967 - [INFO] -
2021-02-09 16:43:51,967 - [INFO] - Error: REST call returned status code 503

Note that creating the same group through the Azure DevOps UI seems to work.

xuzhang3 commented 3 years ago

Hi @jamescross91, I cannot reproduce your error. Is the value in origin_id = var.analytics_ad_group_id the real object ID of the Azure AD (AAD) group?

w0ut0 commented 3 years ago

Related to #382?

mariussm commented 3 years ago

Same issue

jacky-ni commented 2 years ago

Since the beginning of this week, version 0.1.3 has also been returning the 503 error for some reason. I also tried other versions, including 0.1.5 and the latest, 0.1.7, but none of them works. Is anyone else experiencing the same issue?

xuzhang3 commented 2 years ago

@jacky-ni Can you share your TF script?

jacky-ni commented 2 years ago

@xuzhang3 Sure, this is what I have in provider.tf:

terraform {
  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = ">= 2.48"
    }
    azuredevops = {
      source  = "microsoft/azuredevops"
      version = "0.1.3"
    }
  }
  backend "azurerm" {
    subscription_id      = "xxx"
    resource_group_name  = "xxx"
    storage_account_name = "xxx"
    container_name       = "xxx"
    key                  = "terraform.state"
  }
}

provider "azurerm" {
  alias           = "xxxx"
  subscription_id = "xxxxx"
  features {}
}

xuzhang3 commented 2 years ago

@jacky-ni I assume you are trying to create a group based on an AAD group and got a 503 error. How does this group refer to the AAD group? For example, azuredevops_group.ad_group references the AAD group var.analytics_ad_group_id:


data "azuredevops_group" "admin_group" {
  project_id = azuredevops_project.project.id
  name       = "Project Administrators"
}

resource "azuredevops_group" "ad_group" {
  origin_id  = var.analytics_ad_group_id
}

jacky-ni commented 2 years ago

@xuzhang3 Thanks for your reply.

Yes, this is what we have:

# An existing Azure DevOps group
data "azuredevops_group" "azure_devops_groupname" {
  project_id = azuredevops_project_id
  name       = "ado_group_name"
}

# An existing Azure AD group
data "azuread_group" "aad_group" {
  display_name = "azure_ad_group_name"
}

resource "azuredevops_group" "azdo_linked_to_aad" {
  origin_id = data.azuread_group.aad_group.object_id
}

resource "azuredevops_group_membership" "group_membership" {
  group = data.azuredevops_group.azure_devops_groupname.descriptor
  members = [
    azuredevops_group.azdo_linked_to_aad.descriptor
  ]
}

xuzhang3 commented 2 years ago

@jacky-ni I cannot reproduce your error. Have you changed the PAT permissions recently?

jamescross91 commented 2 years ago

The issue in our case was that the service principal running the deployment didn't have sufficient permissions on Active Directory. So this is really a case for improving logging, since a 503 indicates a server-side error.

xuzhang3 commented 2 years ago

@jamescross91 #381 is tracking the logging issues.

jacky-ni commented 2 years ago

@xuzhang3 No, I just checked the PAT; it is still valid and we didn't touch it. Interestingly, it has started working again, although we didn't change anything.

nikydobrev commented 2 years ago

Same issue on our side. We use several "azuredevops_group" resources, and creation works fine for some of them; however, it fails for one specific group only. The principal we use has all the necessary permissions over the AAD.

Terraform Code:

resource "azuredevops_group" "azuredevops_group_onduty_engineers" {
  origin_id = data.azuread_group.aad_group_on_duty_engineers.id
}

resource "azuredevops_group" "azuredevops_group_platform_engineers" {
  origin_id = data.azuread_group.aad_group_platform_engineers.id
}

Terraform Output:

........

azuredevops_group.azuredevops_group_platform_engineers: Creating...
azuredevops_iteration_permissions.iteration_permissions_root: Destruction complete after 7s
azuredevops_area_permissions.area_permissions_root: Destruction complete after 7s
azuredevops_group.azuredevops_group_manage_boards: Creating...
azuredevops_build_definition_permissions.build_definition_permissions_runtime_upgrade: Destruction complete after 7s
azuredevops_build_definition_permissions.build_definition_permissions_application_restart: Destruction complete after 7s
azuredevops_group.azuredevops_group_platform_engineers: Creation complete after 1s [id=aadgp.Uy0xLTktMTU1MTM3NDI0NS0xMjA0NDAwOTY5LTI0MDI5ODY0MTMtMjE3OTQwODYxNi0zLTQyNTkwMDg2OTMtMjY2MTIyMzc1My0yNTY1MzMzMDc0LTI2NzM4OTk1MDI]

........

##[error]Terraform command 'apply' failed with exit code '1'.
##[error]╷
│ Error: REST call returned status code 503
│ 
│   with azuredevops_group.azuredevops_group_onduty_engineers,
│   on main.tf line 1312, in resource "azuredevops_group" "azuredevops_group_onduty_engineers":
│ 1312: resource "azuredevops_group" "azuredevops_group_onduty_engineers" {
│ 
╵

pondichys commented 2 years ago

Greetings everyone, we also experienced this issue on our side.

We created a module that creates an Azure DevOps group from an Azure AD security group by passing the group's object ID to the origin_id attribute. From time to time, the simple creation of the group fails with a 503 error. Creating the same group via the portal or the Azure CLI works correctly.

Terraform module code :

resource "azuredevops_group" "azdo_group" {
  origin_id = var.aad_group_object_id
}

resource "azuredevops_group_membership" "azdo_group_membership" {
  group = data.azuredevops_group.project_contributors.descriptor
  members = [
    azuredevops_group.azdo_group.descriptor
  ]
}

One thing worth mentioning is that the Azure AD group is created a few minutes earlier by another Terraform module.

maxvandermeij commented 2 years ago

We are still having the same issue described above with version 0.2.0 of the Azure DevOps provider. We do create new AAD groups just before trying to create the corresponding Azure DevOps group resource. A simplified version of our code, without the loops:

resource "azuredevops_project" "teams" {
  name               = "something"
  visibility         = "private"
  work_item_template = "Agile"
  version_control    = "Git"

  features = {
    "repositories" = "enabled"
    "pipelines"    = "enabled"
    "artifacts"    = "enabled"
    "boards"       = "enabled"
    "testplans"    = "disabled"
  }
}

resource "azuread_group" "teams" {
  display_name     = "display_name"
  security_enabled = true

  members = [var.members]
}

data "azuredevops_group" "project-default" {
  project_id = azuredevops_project.teams.id
  name       = "${azuredevops_project.teams.name} Team"
}

# Azure DevOps group backed by the AAD group created above
resource "azuredevops_group" "aad-group" {
  origin_id = azuread_group.teams.object_id
}

resource "azuredevops_group_membership" "project-default" {
  group = data.azuredevops_group.project-default.descriptor
  mode  = "add"
  members = [
    azuredevops_group.aad-group.descriptor
  ]
}

When we create a new project and a new group, creating azuredevops_group.aad-group sometimes fails with a 503 error and sometimes with a 400 error. At other times no errors are reported and everything is created according to the plan. For the 400 error we can simply rerun the code and the problem resolves itself, because the provider is later able to create the Azure DevOps group from the AAD object. Perhaps this is a timing issue with the availability of the AAD graph object? (Although ugly, adding a sleep of 30 seconds seems to solve the 400 error.) When we get the 503, we usually have to import the azuredevops_group resource manually using its descriptor, because the group usually IS already present in the Azure DevOps organization but was somehow not registered in Terraform.
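
The 30-second sleep mentioned above could be expressed with the time_sleep resource from the hashicorp/time provider. This is only a sketch: it assumes that provider is declared in required_providers next to azuread and azuredevops, the names wait_for_aad_group and aad_group are illustrative, and the 30s value is taken from this comment rather than from any documented replication guarantee.

```hcl
resource "azuread_group" "teams" {
  display_name     = "display_name"
  security_enabled = true
}

# Pause after the AAD group is created so the directory object has time to
# become visible to Azure DevOps. 30s is the value reported in this thread,
# not a guaranteed upper bound.
resource "time_sleep" "wait_for_aad_group" {
  create_duration = "30s"

  # Pass the object ID through the sleep so downstream resources depend on it.
  triggers = {
    group_object_id = azuread_group.teams.object_id
  }
}

resource "azuredevops_group" "aad_group" {
  # Reading the ID from the sleep's triggers forces Terraform to wait first.
  origin_id = time_sleep.wait_for_aad_group.triggers["group_object_id"]
}
```

Routing the object ID through triggers avoids a separate depends_on and keeps the delay tied to this one group. It would only help if the failure really is a propagation delay, which the thread does not confirm.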

Edit: Perhaps there is another process that also "processes" new Azure AD groups on a periodic basis, which could explain the mixed results we are seeing. We also use "Group Rules" to give all users of our organization licenses. We do this by combining all teams' AAD groups into a single AAD group to which the group rules are attached. Does group rules syncing also manage azuredevops_group objects for all AAD objects within that AAD group?
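
For the 503 case, where the group already exists in the organization but is missing from state, the manual import described above could also be written declaratively. The sketch below assumes Terraform 1.5 or later (import blocks are not available in the 0.13 release this issue was opened against), a hypothetical variable aad_group_object_id, and a placeholder descriptor that has to be replaced with the real one from the organization.

```hcl
variable "aad_group_object_id" {
  type        = string
  description = "Object ID of the existing Azure AD group (hypothetical input)"
}

# Adopt the group that Azure DevOps already created instead of creating it again.
# The id is a placeholder: substitute the group's actual descriptor.
import {
  to = azuredevops_group.aad_group
  id = "aadgp.REPLACE_WITH_GROUP_DESCRIPTOR"
}

resource "azuredevops_group" "aad_group" {
  origin_id = var.aad_group_object_id
}
```

On older Terraform versions the equivalent is the terraform import command with the same resource address and descriptor.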

musukvl commented 1 year ago

I spent a few days on the same issue. In my case, the real meaning of the 503 was "the PAT user does not have enough permissions". For some reason the organization admin user had been replaced, so the old user (whose PAT I was using) still had permission to create projects, but not to list groups from the external AD. My case was:

resource "azuredevops_group" "azdo_group_linked_to_aad" {
  for_each  = toset(local.all_security_groups)
  origin_id = each.value
}

flostelzer commented 1 year ago

As mentioned in the following link, the failure can also be caused by the conditional access policies of your tenant: https://learn.microsoft.com/en-us/azure/devops/organizations/accounts/change-application-access-policies?view=azure-devops#conditional-access-policies

"For third-party client flows, like using a PAT with git.exe, we support IP fencing policies only. Users may find that sign-in policy may be enforced for PATs as well. Using PATs to make Azure AD calls requires the user to adhere to any sign-in policies that are set. For example, if a sign-in policy requires that a user sign in every seven days, you must also sign in every seven days if you wish to continue using PATs to make requests to Azure AD."

This was the problem in our case, and we solved it by signing in again as the user who owns the PAT used in our pipelines.
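
Since several replies trace the error back to the identity behind the PAT, it can help to make explicit which token the provider uses. A minimal sketch, assuming the provider's org_service_url and personal_access_token arguments and two hypothetical variables, azdo_org_url and azdo_pat; the sensitive flag requires Terraform 0.14 or later.

```hcl
variable "azdo_org_url" {
  type        = string
  description = "Organization URL, e.g. https://dev.azure.com/<your-org> (hypothetical input)"
}

variable "azdo_pat" {
  type        = string
  sensitive   = true
  description = "PAT of the user whose sign-in state and AAD permissions apply (hypothetical input)"
}

provider "azuredevops" {
  org_service_url       = var.azdo_org_url
  personal_access_token = var.azdo_pat
}
```

Keeping the PAT in a variable makes it easier to swap in a token from a different user when checking whether the error follows the identity rather than the configuration.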

meizenga commented 1 year ago

I got the HTTP 503 error fixed by updating to v0.5.0.

Update: still works in v0.6.0.

karishma-kohli commented 1 year ago

Hello everyone, I am facing a similar issue; the only difference is that I am getting error 500. Here is the code:

resource "azuredevops_group" "azdo_read_group_linked_to_aad" {
  description = "readonly role group"
  origin_id   = "azuread group object id"
}

resource "azuredevops_group" "azdo_contributors_group_linked_to_aad" {
  description = "dev role group"
  origin_id   = "azuread group object id"
}

resource "azuredevops_group" "azdo_tlead_group_linked_to_aad" {
  description = "team lead role group"
  origin_id   = "azuread group object id"
}

The group object IDs are printed in the output as follows, so the AD groups are being created successfully:

group_object_ids = {
  dev      = "32d1f7d9-8dbb-4931-9ada-c4a6575ea168"
  readonly = "38aad7a5-6aec-4a73-be30-e1e3dc9da0e1"
  tlead    = "6edf894e-076c-4fe4-9817-dbe3a36e8cd6"
}

The error is as follows:

╷
│ Error: REST call returned status code 500
│ 
│   with module.onboard_ado_project["third-tfe-project"].azuredevops_group.azdo_read_group_linked_to_aad,
│   on module/ado/main.tf line 111, in resource "azuredevops_group" "azdo_read_group_linked_to_aad":
│ 
╵
╷
│ Error: REST call returned status code 500
│ 
│   with module.onboard_ado_project["third-tfe-project"].azuredevops_group.azdo_contributors_group_linked_to_aad,
│   on module/ado/main.tf line 116, in resource "azuredevops_group" "azdo_contributors_group_linked_to_aad":
│  116: resource "azuredevops_group" "azdo_contributors_group_linked_to_aad" {
│ 
╵
╷
│ Error: REST call returned status code 500
│ 
│   with module.onboard_ado_project["third-tfe-project"].azuredevops_group.azdo_tlead_group_linked_to_aad,
│   on module/ado/main.tf line 121, in resource "azuredevops_group" "azdo_tlead_group_linked_to_aad":
│  121: resource "azuredevops_group" "azdo_tlead_group_linked_to_aad" {
│ 
╵

The same code works well in another sample private org that I created for testing. Both orgs (private and public) point to the same AD tenant.

I would really appreciate some help figuring out what's wrong.

xuzhang3 commented 1 year ago

@karishma-kohli This might be caused by PAT permissions.

karishma-kohli commented 1 year ago

@xuzhang3 The PAT has full access to the AzDO organization. What other permission could be missing?

xuzhang3 commented 1 year ago

@karishma-kohli The PAT is used to manage the AzDO resources. Since you also manage the AD groups, you may need to check the user's permissions in the Azure subscription.