azdevops / terraform

Terraform is a tool for building, changing, and combining infrastructure safely and efficiently.
https://www.terraform.io/
Mozilla Public License 2.0

When VM provisioning fails, the VM can neither be destroyed nor recreated. #25

Open azdevops opened 7 years ago

azdevops commented 7 years ago

https://github.com/azdevops/terraform/issues/25 When VM provisioning ends in a failed state, we found that the VM is not in the Terraform state. As a result, Terraform fails both to provision again and to destroy while the NIC is still attached to the failed VM.

azdevops commented 7 years ago

Do we have an issue for this in master that we can reference? If not, could whoever picks this up please create one in master and link to it from here?

azdevops commented 7 years ago

When VM provisioning ends in a FAILED state, the VM still exists on Azure but is not written to the Terraform state. You have to manually delete both the VM and the boot disk from the Azure portal.
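To find what needs manual cleanup, one option is to compare the resource IDs that exist in the resource group against the IDs recorded in the Terraform state. The following is only a rough sketch: it assumes the JSON layout of newer state files (a top-level `resources` list whose `instances` carry an `attributes.id`), and the function names and the IDs in the example are made up for illustration. How the list of IDs is fetched from Azure is left out.

```python
import json


def state_resource_ids(state_json):
    """Collect the resource IDs recorded in a Terraform state document.

    Assumes the newer state layout: resources -> instances -> attributes.id.
    IDs are lower-cased because Azure resource IDs are case-insensitive.
    """
    state = json.loads(state_json)
    ids = set()
    for resource in state.get("resources", []):
        for instance in resource.get("instances", []):
            rid = (instance.get("attributes") or {}).get("id")
            if rid:
                ids.add(rid.lower())
    return ids


def find_orphans(cloud_ids, state_json):
    """Return the IDs that exist in the cloud but are missing from the state."""
    tracked = state_resource_ids(state_json)
    return sorted(rid for rid in cloud_ids if rid.lower() not in tracked)


if __name__ == "__main__":
    # Hypothetical data: only the NIC made it into the state.
    prefix = "/subscriptions/000/resourceGroups/rg/providers/"
    nic = prefix + "Microsoft.Network/networkInterfaces/nic1"
    vm = prefix + "Microsoft.Compute/virtualMachines/vm1"
    example_state = json.dumps({
        "version": 4,
        "resources": [
            {"type": "azurerm_network_interface", "name": "nic",
             "instances": [{"attributes": {"id": nic}}]},
        ],
    })
    for orphan in find_orphans([nic, vm], example_state):
        print("not in state:", orphan)
```

Anything reported this way would be a candidate for manual deletion, or for bringing under management with `terraform import`.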

abhijeetgaiha commented 7 years ago

Master issue: https://github.com/hashicorp/terraform/issues/15143

whiskeyjay commented 7 years ago

This might also be a related one: https://github.com/hashicorp/terraform/issues/14636

abhijeetgaiha commented 7 years ago

Haven't been able to reproduce it yet. Talking to some devs about how to reproduce the failure, specifically how to make the API return an error message.

@echuvyrov Did you get any VM config from the customer? Might be an issue that's possible to reproduce with their specific VM settings.

StephenWeatherford commented 7 years ago

@abhijeetgaiha I've also seen this quite a few times and agree it's annoying. Let me find a repro.

I think we need to add support for "tainted" VMs.
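To make the idea concrete, here is a toy model of the behavior, not the provider's actual code. `FakeCloud`, `State`, `apply`, and the `record_failures` flag are all hypothetical names. The fake cloud creates the VM and then reports a provisioning failure, which is the situation described in this issue. With `record_failures=False` the VM is left out of the state and the next apply collides with the existing VM; with `record_failures=True` it is recorded as tainted, so the next apply can destroy and recreate it.

```python
from dataclasses import dataclass, field


class ProvisioningError(Exception):
    """Raised when the VM was created but ended in a FAILED state."""


@dataclass
class FakeCloud:
    """Stand-in for Azure: the VM is created, then provisioning fails."""
    vms: set = field(default_factory=set)
    fail_provisioning: bool = True

    def create_vm(self, name):
        if name in self.vms:
            raise ValueError(f"{name} already exists")
        self.vms.add(name)  # the VM exists from this point on
        if self.fail_provisioning:
            raise ProvisioningError(f"{name} ended in a FAILED state")

    def delete_vm(self, name):
        self.vms.discard(name)


@dataclass
class State:
    """Toy state file: resource name -> "ok" or "tainted"."""
    resources: dict = field(default_factory=dict)


def apply(cloud, state, name, record_failures):
    # A tainted resource is destroyed first so that it can be recreated.
    if state.resources.get(name) == "tainted":
        cloud.delete_vm(name)
        del state.resources[name]
    try:
        cloud.create_vm(name)
    except ProvisioningError:
        if record_failures:
            # Remember the half-created VM instead of losing track of it.
            state.resources[name] = "tainted"
        raise
    state.resources[name] = "ok"
```

In this model, calling `apply` twice with `record_failures=False` (and the failure cleared before the second call) raises `ValueError` on the second call because the orphaned VM is still there. With `record_failures=True` the second call succeeds.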

StephenWeatherford commented 7 years ago

This essentially happens for anything that gives an error after the VM itself has been created. I think the encryption stuff hits it, for instance. Other good possibilities would be custom data or VM extensions.

(Sent by email in reply to Abhijeet Gaiha's GitHub notification of Tuesday, June 13, 2017, 1:55 AM, which quoted his comment above.)

StephenWeatherford commented 7 years ago

VM extensions didn’t cause this problem since they’re a separate resource… Haven’t tried anything else.

StephenWeatherford commented 7 years ago

I don't have time to find a repro right now. If someone knows how to repro it, please provide a config.

StephenWeatherford commented 7 years ago

NEW LINK in new repo: https://github.com/azdevops/terraform/issues/25

abhijeetgaiha commented 7 years ago

@StephenWeatherford Thanks Stephen. This is a tricky one to debug.

I talked to some people here to understand how to repro this. The error occurs within the fabric layer and not the API layer, since the VM is created but the request still fails. Simulating this kind of failure in the fabric itself is difficult.

We could also try to get a test config from a customer whose deployments tend to fail this way.