Open aurel333 opened 1 year ago
Thanks for raising this issue. I assume the error is expected. You only set the dependency between the vnet and the vhub connection, but you didn't explicitly set the dependency between the subnet and the vhub connection. So creating the vhub connection failed because the subnet was still being created at that moment. Hence, I suggest adding "depends_on = [azurerm_subnet.subnet1]" to the vhub connection.
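The suggested workaround could look like the sketch below. The resource addresses azurerm_virtual_network.vnet and azurerm_subnet.subnet1 come from this thread; the azurerm_virtual_hub.hub reference, the connection name, and the address range are placeholders assumed for illustration and are not taken from the original configuration.

```hcl
# Hypothetical subnet; the name and address range are placeholders.
resource "azurerm_subnet" "subnet1" {
  name                 = "subnet1"
  resource_group_name  = azurerm_virtual_network.vnet.resource_group_name
  virtual_network_name = azurerm_virtual_network.vnet.name
  address_prefixes     = ["10.0.1.0/24"]
}

resource "azurerm_virtual_hub_connection" "connection" {
  name                      = "vnet-to-hub"              # placeholder name
  virtual_hub_id            = azurerm_virtual_hub.hub.id # assumed to be defined elsewhere
  remote_virtual_network_id = azurerm_virtual_network.vnet.id

  # The reference above only orders the connection after the vnet.
  # This explicit dependency also makes it wait for the subnet.
  depends_on = [azurerm_subnet.subnet1]
}
```

With several subnets, each of them would need to appear in the depends_on list.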
Hello, thank you for your quick answer. I am raising this issue mainly because I think it should not be required to add the depends_on = [<subnets>], as it is not a custom dependency, nor does it seem hidden, as creating an azurerm_virtual_hub_connection should be done at the same time as a subnet.
If I am wrong can you please tell me which type of dependency typically can or cannot be handled by Terraform, this will allow me to avoid problems in the future.
Hello,
I have dug a bit more into how dependencies are handled, and you are right: it would not be good to make an azurerm_virtual_hub_connection automatically dependent on the azurerm_subnet, so a depends_on keyword is necessary to do this.
However, there is still a problem, as creating two subnets or attaching two vnet peerings to the same vnet works fine. I saw in the code that these situations are handled by locking the vnet for the duration of the operation, and I also see that the same locking mechanism has been implemented for the azurerm_virtual_hub_connection resource (here).
There is also issue #12998 (which I did not see before) about the same problem, so the error I am seeing should have been fixed in a previous provider version. Do you have any idea what could keep the lock from working as expected?
I have done several more tests with a modified version of the provider based on ee4d44ac0133709272b268337c0f673d999c46b5 to have more targeted logs.
It confirmed that the locking process is what is causing the issue as we have two different mutexes locking the same thing:
...
2023-01-23T16:22:44.024+0100 [DEBUG]: Locked "azurerm_virtual_network.vnet" with mutex 0xc00024cfa0: timestamp=2023-01-23T16:22:44.024+0100
...
2023-01-23T16:22:44.031+0100 [DEBUG]: Locking "azurerm_virtual_network.vnet" with mutex 0xc000496000: timestamp=2023-01-23T16:22:44.031+0100
2023-01-23T16:22:44.031+0100 [DEBUG]: Locked "azurerm_virtual_network.vnet" with mutex 0xc000496000: timestamp=2023-01-23T16:22:44.031+0100
...
2023-01-23T16:22:48.148+0100 [DEBUG]: Unlocking "azurerm_virtual_network.vnet" with mutex 0xc00024cfa0: timestamp=2023-01-23T16:22:48.148+0100
...
At first I thought it was because I put the virtual hub connection inside a module, but the issue still arises with the resources placed directly in the files too.
As I am not familiar with the inner workings of the provider, I do not think I will be able to understand how the mutex system works on my own. Can you help me figure out why two mutexes are created?
Continuing with the testing, I have done a test build with the option -parallelism=1 and again I got two different mutexes for the subnet creation and the virtual hub connection.
This does not confirm the hypothesis that there is one mutex per provider, but it points strongly in that direction. If this is true, then it is a troublesome issue, as the provider instantiation seems to be done by the Terraform core once per provider (which makes sense, by the way) and using two or more providers is required to work on multiple subscriptions.
So this means that any cross subscription peering or virtual hub connection is at risk of ending in error if another operation that locks the virtual network is done at the same time (typically a subnet creation or a peering to a different subscription).
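To make the scenario concrete, here is a minimal sketch of a cross-subscription setup with two provider configurations. The alias "hub", the two variables, the connection name, and the azurerm_virtual_hub.hub reference are all hypothetical; azurerm_virtual_network.vnet and azurerm_subnet.subnet1 are the addresses used in this thread. If the hypothesis above holds, the subnet and the connection are handled by different provider instances and the vnet lock cannot serialize them, so the explicit depends_on is what keeps them apart.

```hcl
# Hypothetical variables holding the two subscription IDs.
variable "spoke_subscription_id" {
  type = string
}

variable "hub_subscription_id" {
  type = string
}

# Default provider configuration: subscription holding the vnet and its subnets.
provider "azurerm" {
  features {}
  subscription_id = var.spoke_subscription_id
}

# Second provider configuration: subscription holding the virtual hub.
provider "azurerm" {
  alias = "hub"
  features {}
  subscription_id = var.hub_subscription_id
}

resource "azurerm_virtual_hub_connection" "connection" {
  provider = azurerm.hub # created through the second provider configuration

  name                      = "vnet-to-hub"              # placeholder name
  virtual_hub_id            = azurerm_virtual_hub.hub.id # assumed to be defined elsewhere
  remote_virtual_network_id = azurerm_virtual_network.vnet.id

  # Workaround from this thread: never run alongside the subnet creation.
  depends_on = [azurerm_subnet.subnet1]
}
```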
Such a risk is either a bug or should be documented somewhere. I can write the documentation but I need first a confirmation that this is not considered a bug.
Is there an existing issue for this?
Community Note
Terraform Version
1.1.9
AzureRM Provider Version
3.5.0
Affected Resource(s)/Data Source(s)
azurerm_virtual_hub_connection
Terraform Configuration Files
Debug Output/Panic Output
Expected Behaviour
First create the VNET, then the subnets and the Virtual Hub Connection in any order but not at the same time, with all of them added correctly to the state.
Actual Behaviour
The VNET and the subnets are created correctly but the Virtual Hub Connection is in failed state and NOT added to the state. The error output looks like a C# stacktrace and is not easily understandable.
Steps to Reproduce
I managed to reproduce the issue almost reliably by deleting the resources and immediately recreating them. So here are the commands to run.
Please note that sometimes the problem does not appear, so it may be linked to how fast the Azure backend performs the operations.
Important Factoids
A ticket was first opened with Azure Support, and adding "depends_on" to make the VirtualHubConnection resource dependent on the subnet and the vnet is a reliable workaround. However, this is not a custom dependency, so it should not be required.
References
No response