citrix / terraform-provider-citrix

Terraform provider for Citrix
Apache License 2.0

Changing machine catalog master image or service account fails with 502 bad gateway error #39

Closed kgilbertwams closed 3 months ago

kgilbertwams commented 3 months ago

I worked around this by completely recreating the machine catalog. However, when I changed the service account username and tried to apply the change with Terraform, I got a 502 Bad Gateway error. Details below.

Terraform will perform the following actions:

  # citrix_machine_catalog.default_catalog will be updated in-place
  ~ resource "citrix_machine_catalog" "default_catalog" {
        id                  = ""
        name                = ""
      ~ provisioning_scheme = {
          ~ machine_domain_identity = {
              ~ service_account          = "username1@domain.com" -> "username2@domain.com"
              ~ service_account_password = (sensitive value)
                # (2 unchanged attributes hidden)
            }
            # (7 unchanged attributes hidden)
        }
        # (7 unchanged attributes hidden)
    }

Plan: 0 to add, 1 to change, 0 to destroy.

│ Error: Error updating Machine Catalog
│
│   with citrix_machine_catalog.default_catalog,
│   on machine_catalogs.tf line 1, in resource "citrix_machine_catalog" "default_catalog":
│    1: resource "citrix_machine_catalog" "default_catalog" {
│
│ TransactionId:
│ Error message: 502 Bad Gateway

xushengl commented 3 months ago

Hi @kgilbertwams ,

Thank you for reporting this issue. We tried to reproduce it and noticed that the 502 Bad Gateway error is usually thrown in one of the following two scenarios:

  1. If you are a Citrix Cloud customer, the Cloud Connector could be in an unavailable state, which leads to the 502 Bad Gateway error. In this situation, please double-check the Cloud Connector status before retrying the operation.
  2. If you are an on-premises customer, the DDC could have a network connectivity issue with the Active Directory Domain Controller.

Please let me know if either of these two scenarios applies. Thank you!

Xusheng "Fred" Liu

kgilbertwams commented 3 months ago

Hi Fred,

Thank you very much for your quick response.

We are a Citrix Cloud customer. The connector was up and running, and I am able to change other settings without any issues. However, as soon as I try to change the service account username, I get the 502 Bad Gateway error.

Kevin

xushengl commented 3 months ago

Hi Kevin,

For the update on the machine catalog resource block, is machine_domain_identity the only change? For example, do you have any update on the number_of_total_machines field?

Xusheng "Fred" Liu

kgilbertwams commented 3 months ago

Hi Fred,

It's specifically the machine_domain_identity.service_account option. I can change another field, such as machine_domain_identity.domain_ou, and it works fine.

Kevin

xushengl commented 3 months ago

Hi Kevin,

We recently released a new version of the Citrix Terraform Provider. Could you please try setting the provider version to 0.5.2 and applying this change again?

terraform {
  required_providers {
    citrix = {
      source  = "citrix/citrix"
      version = "=0.5.2"
    }
  }
}

Xusheng "Fred" Liu

kgilbertwams commented 3 months ago

Tried it with an init, same issue.

Initializing provider plugins...
- Finding citrix/citrix versions matching "0.5.2"...
- Installing citrix/citrix v0.5.2...
- Installed citrix/citrix v0.5.2 (signed by a HashiCorp partner, key ID 25D62DD8407EA386)

Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
  ~ update in-place

Terraform will perform the following actions:

  # citrix_machine_catalog.default_catalog will be updated in-place
  ~ resource "citrix_machine_catalog" "default_catalog" {
        id                  = "a6b1f755-4586-4e84-b6ff-96cb9f89d107"
        name                = "Azure West US - Server 2019 - D8s v4"
      ~ provisioning_scheme = {
          ~ machine_domain_identity        = {
              ~ service_account          = "svc_citrix@domain.com" -> "svc_citrixtest@domain.com"
                # (3 unchanged attributes hidden)
            }
            # (7 unchanged attributes hidden)
        }
        # (7 unchanged attributes hidden)
    }

Plan: 0 to add, 1 to change, 0 to destroy.
citrix_machine_catalog.default_catalog: Modifying... [id=a6b1f755-4586-4e84-b6ff-96cb9f89d107]
citrix_machine_catalog.default_catalog: Still modifying... [id=a6b1f755-4586-4e84-b6ff-96cb9f89d107, 10s elapsed]
citrix_machine_catalog.default_catalog: Still modifying... [id=a6b1f755-4586-4e84-b6ff-96cb9f89d107, 20s elapsed]
citrix_machine_catalog.default_catalog: Still modifying... [id=a6b1f755-4586-4e84-b6ff-96cb9f89d107, 30s elapsed]
citrix_machine_catalog.default_catalog: Still modifying... [id=a6b1f755-4586-4e84-b6ff-96cb9f89d107, 40s elapsed]
╷
│ Error: Error updating Machine Catalog Azure West US - Server 2019 - D8s v4
│
│   with citrix_machine_catalog.default_catalog,
│   on machine_catalogs.tf line 1, in resource "citrix_machine_catalog" "default_catalog":
│    1: resource "citrix_machine_catalog" "default_catalog" {
│
│ TransactionId:
│ Error message: 502 Bad Gateway
╵

xushengl commented 3 months ago

Hi Kevin,

Thank you for sharing the detailed error logs. We will keep you posted once we find the root cause of this error.

Best Regards, Xusheng "Fred" Liu

aneeshk-citrix commented 3 months ago

Hi @kgilbertwams,

Can you share your machine catalog Terraform configuration block with us? Feel free to remove any sensitive data. My hunch is that the change in service account is forcing an update on one of the properties, which is causing a timeout.

Thanks, Aneesh

zhuolun-citrix commented 3 months ago

Hi @kgilbertwams ,

As a quick test, can you try to update just the description of the catalog, without updating anything else, and see if you can repro this 502 error?

Thank you

kgilbertwams commented 3 months ago

Thank you both. Machine catalog block:

resource "citrix_machine_catalog" "default_catalog" {
  name              = var.daas_mcs_name
  description       = var.daas_mcs_description
  allocation_type   = var.daas_mcs_allocation_type
  session_support   = var.daas_mcs_session_support
  is_power_managed  = true
  is_remote_pc      = false
  provisioning_type = "MCS"
  zone              = citrix_zone.default_resource_location.id
  provisioning_scheme = {
    hypervisor               = citrix_azure_hypervisor.azure.id
    hypervisor_resource_pool = citrix_azure_hypervisor_resource_pool.azure_rp.id
    identity_type            = "ActiveDirectory"
    machine_domain_identity = {
      domain                   = var.daas_mcs_domain
      domain_ou                = var.daas_mcs_domain_ou
      service_account          = var.daas_mcs_service_account
      service_account_password = var.daas_mcs_service_account_password
    }
    azure_machine_config = {
      service_offering  = var.daas_mcs_service_offering
      resource_group    = var.daas_mcs_resource_group
      master_image      = var.daas_mcs_master_image
      storage_type      = var.daas_mcs_storage_type
      use_managed_disks = true
      machine_profile = {
        machine_profile_resource_group = var.daas_mcs_machine_profile_resource_group
        machine_profile_vm_name        = var.daas_mcs_machine_profile_vm_name
      }
      writeback_cache = {
        wbc_disk_storage_type          = "Premium_LRS"
        persist_wbc                    = true
        persist_os_disk                = false
        persist_vm                     = false
        writeback_cache_disk_size_gb   = 127
        writeback_cache_memory_size_mb = 4096
        storage_cost_saving            = true
      }
    }
    number_of_total_machines = var.daas_mcs_number_of_total_machines
    network_mapping = {
      network_device = "0"
      network        = var.daas_mcs_network
    }
    machine_account_creation_rules = {
      naming_scheme      = var.daas_mcs_naming_scheme
      naming_scheme_type = var.daas_mcs_naming_scheme_type
    }
  }
}

I also tested changing only the description of the catalog. I get the same error, so I guess the service account is not the only setting this issue happens on.

Edit: Actually, I get the error shown below, but it did update the description of the catalog correctly. When I re-run the apply it shows no changes needed, and checking online shows the updated description. To double check myself, I tried this same thing with the username change. However, the setting does not update in the case of the username change.

Terraform will perform the following actions:

  # module.citrix_cloud_daas[0].citrix_machine_catalog.default_catalog will be updated in-place
  ~ resource "citrix_machine_catalog" "default_catalog" {
      ~ description         = "VDAs located in Azure West US. Server 2019, D8s v4 instances." -> "VDAs located in Azure West US. Server 2019, D8s v4 instances. Test"
        id                  = "691c0f0c-d8e2-4da3-8eab-f24e867988ca"
        name                = "Azure West US 3 - Server 2019 - D8s v4"
        # (7 unchanged attributes hidden)
    }

Plan: 0 to add, 1 to change, 0 to destroy.
module.citrix_cloud_daas[0].citrix_machine_catalog.default_catalog: Modifying... [id=691c0f0c-d8e2-4da3-8eab-f24e867988ca]
module.citrix_cloud_daas[0].citrix_machine_catalog.default_catalog: Still modifying... [id=691c0f0c-d8e2-4da3-8eab-f24e867988ca, 10s elapsed]
module.citrix_cloud_daas[0].citrix_machine_catalog.default_catalog: Still modifying... [id=691c0f0c-d8e2-4da3-8eab-f24e867988ca, 20s elapsed]
module.citrix_cloud_daas[0].citrix_machine_catalog.default_catalog: Still modifying... [id=691c0f0c-d8e2-4da3-8eab-f24e867988ca, 30s elapsed]
╷
│ Error: Error updating Machine Catalog Azure West US 3 - Server 2019 - D8s v4
│
│   with module.citrix_cloud_daas[0].citrix_machine_catalog.default_catalog,
│   on .terraform\modules\citrix_cloud_daas\machine_catalogs.tf line 1, in resource "citrix_machine_catalog" "default_catalog":
│    1: resource "citrix_machine_catalog" "default_catalog" {
│
│ TransactionId:
│ Error message: 502 Bad Gateway
╵

kgilbertwams commented 3 months ago

FYI, I also get the same 502 error when I try to change the master image. I think this is a bigger deal than just changing the service account, so I'm updating the title of the issue to include this.

As a workaround, I was able to manually change the master image in the web interface, and when I run plan after that it doesn't detect any changes. However, that is less than ideal, of course.

zhuolun-citrix commented 3 months ago

Hi @kgilbertwams ,

Thank you for running the tests. We were able to identify the issue: it is a timeout related to using machine_profile in an Azure MCS catalog. Currently, the only workaround is to recreate the catalog without a machine profile. As long as the catalog has no machine profile, this issue will not happen.
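For illustration, the workaround could look roughly like the sketch below: the catalog block posted earlier in this thread, with only the machine_profile block removed from azure_machine_config. This assumes the provider version in use accepts azure_machine_config without a machine_profile, which is implied by the workaround but has not been checked against the provider schema here. All attribute names and variables are copied from the configuration shared above; nothing new is introduced.

```hcl
# Sketch only: same catalog as posted earlier in this thread, minus the
# machine_profile block. Assumes machine_profile is optional in the provider
# version in use (not verified against the schema).
resource "citrix_machine_catalog" "default_catalog" {
  name              = var.daas_mcs_name
  description       = var.daas_mcs_description
  allocation_type   = var.daas_mcs_allocation_type
  session_support   = var.daas_mcs_session_support
  is_power_managed  = true
  is_remote_pc      = false
  provisioning_type = "MCS"
  zone              = citrix_zone.default_resource_location.id
  provisioning_scheme = {
    hypervisor               = citrix_azure_hypervisor.azure.id
    hypervisor_resource_pool = citrix_azure_hypervisor_resource_pool.azure_rp.id
    identity_type            = "ActiveDirectory"
    machine_domain_identity = {
      domain                   = var.daas_mcs_domain
      domain_ou                = var.daas_mcs_domain_ou
      service_account          = var.daas_mcs_service_account
      service_account_password = var.daas_mcs_service_account_password
    }
    azure_machine_config = {
      service_offering  = var.daas_mcs_service_offering
      resource_group    = var.daas_mcs_resource_group
      master_image      = var.daas_mcs_master_image
      storage_type      = var.daas_mcs_storage_type
      use_managed_disks = true
      # machine_profile block intentionally omitted (workaround for the 502 timeout)
      writeback_cache = {
        wbc_disk_storage_type          = "Premium_LRS"
        persist_wbc                    = true
        persist_os_disk                = false
        persist_vm                     = false
        writeback_cache_disk_size_gb   = 127
        writeback_cache_memory_size_mb = 4096
        storage_cost_saving            = true
      }
    }
    number_of_total_machines = var.daas_mcs_number_of_total_machines
    network_mapping = {
      network_device = "0"
      network        = var.daas_mcs_network
    }
    machine_account_creation_rules = {
      naming_scheme      = var.daas_mcs_naming_scheme
      naming_scheme_type = var.daas_mcs_naming_scheme_type
    }
  }
}
```

Because the workaround is to recreate the catalog, expect the existing catalog to be destroyed and a new one created rather than updated in place.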

We will mark this as a bug and will work on a fix in the next release.

Thank you again for bringing this issue to our attention!