citrix / terraform-provider-citrix

Terraform Provider for Citrix
https://registry.terraform.io/providers/citrix/citrix/latest
Apache License 2.0

[Bug] Error during Hypervisor Resource Pool creation #126

Open Xufuru opened 1 month ago

Xufuru commented 1 month ago

In the new version of the provider (1.0.3), as in the older versions (1.0.1, 1.0.1-bugfix-1 and 1.0.1-bugfix-2), I'm having trouble creating the Hypervisor Resource Pool. The error occurs intermittently when I run the code: if I run it 5 times, the error might appear on the first run, disappear, and then return on the fourth run. This is the error:

│ Error: Error creating Hypervisor Resource Pool for Azure
│
│   with module.vdr["VDR-TEST1"].module.citrix.citrix_azure_hypervisor_resource_pool.azure-hypervisor-resource-pool,
│   on modules/citrix/main.tf line 13, in resource "citrix_azure_hypervisor_resource_pool" "azure-hypervisor-resource-pool":
│   13: resource "citrix_azure_hypervisor_resource_pool" "azure-hypervisor-resource-pool" {
│
│ TransactionId: 78a0e868-dde7-4e6c-aca7-05c858822481
│ Failed to resolve virtual network vnt-itn-tecp-001-VDR-TEST1 in region
│ Italy North, error: could not find resource

This is my Terraform code for the resource:

resource "citrix_azure_hypervisor_resource_pool" "azure-hypervisor-resource-pool" {
  name                           = upper("RP-AZ-NI-TECS-P-${var.vdr_name}")
  hypervisor                     = local.azure_hypervisor_id
  region                         = var.location
  virtual_network_resource_group = var.resource_group_vdr
  virtual_network                = var.virtual_network
  subnets                        = [var.subnet_vdi]
}

To be clear, the vnet specified in the error exists, and I can see it in my Azure portal.

zhuolun-citrix commented 1 month ago

Hi @Xufuru ,

Can you consistently reproduce this issue?

Xufuru commented 1 month ago

Yes, it happens at least every time I run my code in a new Terraform workspace, and intermittently if I delete the resources with terraform destroy and then run terraform plan and apply again.

zhuolun-citrix commented 1 month ago

OK, so when you run it for the first time in a new Terraform workspace, it consistently fails?

Xufuru commented 1 month ago

From the tests done so far, it would seem so.

zhuolun-citrix commented 1 month ago

Could you share the total time for the first run on a fresh Terraform workspace? If the first apply fails, you can check the Terraform output for the [{time} elapsed] markers to get the total time it took.

zhuolun-citrix commented 1 month ago

Hi @Xufuru ,

Any update on this?

zhuolun-citrix commented 1 month ago

Hi @Xufuru ,

Any update on this? Are you able to get the total time for the first apply that consistently fails to locate the resource?

Xufuru commented 1 month ago

On the first run and apply, when the code reaches the resource pool resource, it fails immediately.

zhuolun-citrix commented 3 weeks ago

Hi @Xufuru ,

We released v1.0.6 last week. Can you please try out the new version and let us know if the issue persists?

Unfortunately, we cannot reproduce this issue in our environment. If the issue persists, we might need to work more closely with you to get some detailed traces for further troubleshooting.

Thank you, Zhuolun

j7lloyd commented 2 weeks ago

@zhuolun-citrix, I've observed a similar pattern, and it seems to occur the first time a new resource is created in Azure. For me, this has happened with both an Azure Image Gallery image and a Disk Encryption Set. Specifically, I get an error stating it "Failed to locate Azure Image Gallery image of version in gallery ," indicating the resource could not be found. However, if I re-run terraform apply a second (or subsequent) time, it completes successfully. It seems like the initial apply triggers some sort of discovery process, but the resource isn't immediately available, which might explain why it succeeds on later attempts.
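If the failure really is a short delay before a newly created Azure resource becomes discoverable (a hypothesis in this thread, not a confirmed cause), one possible stopgap is to pause between creating the network and creating the resource pool. The sketch below assumes the hashicorp/time provider is available and that var.virtual_network is derived from the resource that creates the vnet; the name wait_for_vnet and the 60s delay are arbitrary choices for illustration, not values recommended by Citrix.

```hcl
terraform {
  required_providers {
    time = {
      source = "hashicorp/time"
    }
  }
}

# Hypothetical pause: starts once var.virtual_network is known
# (i.e. after the vnet exists) and lasts for an arbitrary 60 seconds.
resource "time_sleep" "wait_for_vnet" {
  create_duration = "60s"

  triggers = {
    virtual_network = var.virtual_network
  }
}

resource "citrix_azure_hypervisor_resource_pool" "azure-hypervisor-resource-pool" {
  name                           = upper("RP-AZ-NI-TECS-P-${var.vdr_name}")
  hypervisor                     = local.azure_hypervisor_id
  region                         = var.location
  virtual_network_resource_group = var.resource_group_vdr
  # Reading the name through the sleep's triggers makes the pool wait for the delay.
  virtual_network                = time_sleep.wait_for_vnet.triggers["virtual_network"]
  subnets                        = [var.subnet_vdi]
}
```

This only delays the first create; it does not address whatever lookup on the Citrix side returns "could not find resource", so it would be a workaround to test rather than a fix.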

Xufuru commented 2 weeks ago

Exactly, this is precisely the behavior we are observing in our case, and the issue is just as you explained. As for updating to the new version 1.0.6, I have done that, but I haven't had the chance to test the new version yet.

j7lloyd commented 2 weeks ago

@Xufuru, I actually encountered this issue with version 1.0.6 just this morning, which reminded me to follow up. After looking into it, I found your existing issue and decided to add my input here.

Xufuru commented 2 weeks ago

Thank you for the message. However, we need your support to resolve this issue as soon as possible, as it is having a significant negative impact on the solution we are using with a client.

Thank you nonetheless for the support and the great work you are doing.

aneeshk-citrix commented 2 weeks ago

Hi @j7lloyd, @Xufuru,

We hear you and we will work on a fix. Can you tell us if the names of the resources (including those that lead to the resource in question; for example - resource group, gallery name, definition) match the casing in Azure? I wonder if the search is doing a case-sensitive comparison and would like to rule out this possibility. We are trying to reproduce this issue on our end and trying to gather as much info as we can.

@j7lloyd Can you share the error message and/or the transaction ID?

Thanks, Aneesh

Xufuru commented 2 weeks ago

In our case it is the virtual network in the Italy North region. Here is the entire error:

│ Error: Error creating Hypervisor Resource Pool for Azure
│
│   with module.vdr["VDRTEST112"].module.citrix.citrix_azure_hypervisor_resource_pool.azure-hypervisor-resource-pool,
│   on modules/citrix/main.tf line 13, in resource "citrix_azure_hypervisor_resource_pool" "azure-hypervisor-resource-pool":
│   13: resource "citrix_azure_hypervisor_resource_pool" "azure-hypervisor-resource-pool" {
│
│ TransactionId: 81014ed1-d739-4e4d-becc-69b17b1b5727
│ Failed to resolve virtual network vnt-itn-tecp-001-VDRTEST112 in region
│ Italy North, error: could not find resource

j7lloyd commented 2 weeks ago

@aneeshk-citrix, while the case may be different, since the issue resolves itself on the second (or subsequent) attempt without any changes, I don't see how this could be a contributing factor. As for the exact error message and transaction ID, I'll provide them the next time the issue occurs.

aneeshk-citrix commented 2 weeks ago

@Xufuru, is the network name vnt-itn-tecp-001-VDRTEST112 the same in Azure, or is the casing different (Pascal case, all uppercase, or all lowercase)?

@j7lloyd It could be a contributing factor if the initial call populates the cache and subsequent calls fetch from cache using case-insensitive search.
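To illustrate the difference between the two kinds of comparison described above, here is a small Terraform sketch. The names are made up for illustration and are not the ones from this issue, and the logic is not the provider's actual lookup code; it only shows how a name that differs in casing fails an exact match but passes a case-insensitive one.

```hcl
locals {
  # Hypothetical name as written in the Terraform configuration.
  requested_name = "vnt-example-001"

  # Hypothetical names as they might be returned by Azure.
  azure_names = ["VNT-Example-001"]

  # Case-sensitive comparison: false for the values above.
  exact_match = contains(local.azure_names, local.requested_name)

  # Case-insensitive comparison: true for the values above.
  case_insensitive_match = contains(
    [for n in local.azure_names : lower(n)],
    lower(local.requested_name)
  )
}

output "exact_match" {
  value = local.exact_match
}

output "case_insensitive_match" {
  value = local.case_insensitive_match
}
```

If the first lookup used the exact comparison and later lookups used the case-insensitive one, a first-run failure followed by success would be the expected pattern for mismatched casing.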

j7lloyd commented 2 weeks ago

@aneeshk-citrix, fair point. I'll make sure to check that the case matches the next time I create or reference a new resource.

Xufuru commented 2 weeks ago

It's the same as I wrote it.