Open azdevops opened 7 years ago
Do we have an issue for this in master that we can reference? If not, could whoever picks this up please create one in master and link to it from here?
When VM provisioning ends in a FAILED state, the VM still exists on Azure but is not written to the Terraform state. You have to manually delete both the VM and the boot disk from the Azure portal.
Master issue: https://github.com/hashicorp/terraform/issues/15143
This might also be a related one: https://github.com/hashicorp/terraform/issues/14636
Haven't been able to reproduce it yet. Talking to some devs about how to reproduce the failure, specifically how to make the API return an error message.
@echuvyrov Did you get any VM config from the customer? Might be an issue that's possible to reproduce with their specific VM settings.
@abhijeetgaiha I've also seen this quite a few times and agree it's annoying. Let me find a repro.
I think what needs to happen is that we need to add support for "tainted" VMs.
This essentially happens for anything that gives an error after the VM itself has been created. I think the encryption stuff hits it, for instance. Other good possibilities would be custom data or VM extensions.
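A rough sketch of the "tainted" idea, in case it helps whoever picks this up. It is a self-contained simulation and does not call the real provider SDK: `resourceData`, `stateEntry`, `apply`, `waitForProvisioning` and the VM ID are all hypothetical stand-ins. The assumption being modelled is that a resource whose create step fails after an ID has been recorded is kept in state and marked tainted, while one that fails before any ID is recorded is never written to state at all (which would match the orphaned VM described above).

```go
package main

import (
	"errors"
	"fmt"
)

// resourceData is a hypothetical stand-in for the SDK's resource data.
// Only the ID matters for this sketch.
type resourceData struct{ id string }

func (d *resourceData) SetId(id string) { d.id = id }
func (d *resourceData) Id() string      { return d.id }

// stateEntry is a hypothetical, simplified view of one resource in state.
type stateEntry struct {
	ID      string
	Tainted bool
}

// Placeholder ID; a real Azure resource ID would be much longer.
const vmID = "example-vm-id"

var errProvisioningFailed = errors.New("provisioning state: Failed")

// waitForProvisioning simulates Azure creating the VM and then
// reporting a failed provisioning state.
func waitForProvisioning() error { return errProvisioningFailed }

// apply models the assumed rule: no ID means nothing is saved;
// an ID plus an error means the resource is saved as tainted.
func apply(create func(*resourceData) error) (*stateEntry, error) {
	d := &resourceData{}
	err := create(d)
	if d.Id() == "" {
		return nil, err
	}
	return &stateEntry{ID: d.Id(), Tainted: err != nil}, err
}

// createVMLate records the ID only after provisioning succeeds,
// so a failure leaves the VM on Azure but not in state.
func createVMLate(d *resourceData) error {
	if err := waitForProvisioning(); err != nil {
		return err
	}
	d.SetId(vmID)
	return nil
}

// createVMEarly records the ID as soon as the VM exists,
// so a later failure still leaves a (tainted) entry in state.
func createVMEarly(d *resourceData) error {
	d.SetId(vmID)
	return waitForProvisioning()
}

func main() {
	late, err := apply(createVMLate)
	fmt.Printf("late ID:  entry=%v err=%v\n", late, err)

	early, err := apply(createVMEarly)
	fmt.Printf("early ID: entry=%+v err=%v\n", *early, err)
}
```

With the early variant, the failed VM would at least be visible to a later destroy or re-apply instead of needing manual cleanup. Whether the real create path can record the ID that early is something I haven't checked.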
VM extensions didn’t cause this problem since they’re a separate resource… Haven’t tried anything else.
I don't have time to find a repro right now. If someone knows how to repro, please provide a config.
NEW LINK in new repo: https://github.com/azdevops/terraform/issues/25
@StephenWeatherford Thanks Stephen. This is a tricky one to debug.
I talked to some people here to understand how to repro this. The error occurs within the fabric layer and not the API layer, since the VM is created but the request still fails. Simulating this kind of failure in the fabric itself is difficult.
We could also try to get a test config from a customer whose deployments tend to fail this way.
https://github.com/azdevops/terraform/issues/25
When VM provisioning ends in a failed state, we found that the VM is not in the Terraform state. As a result, Terraform fails to provision again, and also fails to destroy while the NIC is still attached to the failed VM.