hashicorp / terraform-provider-vsphere

Terraform Provider for VMware vSphere
https://registry.terraform.io/providers/hashicorp/vsphere/
Mozilla Public License 2.0

vsphere terraform 0.10.6 issue #183

Closed rismoney closed 6 years ago

rismoney commented 7 years ago

I am using the vsphere module to create a simple VM. I am on v0.10.6.

This looks very similar to an issue that was reported a few hours ago, although that one concerned the digitalocean provider (I'm on Windows): https://github.com/hashicorp/terraform/issues/16220

Error applying plan:

1 error(s) occurred:

* module.igen-nj-wks01.vsphere_virtual_machine.wks: 1 error(s) occurred:

* vsphere_virtual_machine.wks: Post https://tp-nj-vc-01.mydom.corp/sdk: EOF

Trace log shows:

2017/09/29 17:35:05 [TRACE] dag/walk: vertex "root", waiting for: "meta.count-boundary (count boundary fixup)"
2017/09/29 17:35:05 [TRACE] dag/walk: vertex "provider.vsphere (close)", waiting for: "module.igen-nj-wks01.vsphere_virtual_machine.wks"
2017/09/29 17:35:07 [TRACE] root.igen-nj-wks01: eval: *terraform.EvalWriteState
2017/09/29 17:35:07 [TRACE] root.igen-nj-wks01: eval: *terraform.EvalApplyProvisioners
2017/09/29 17:35:07 [TRACE] root.igen-nj-wks01: eval: *terraform.EvalIf
2017/09/29 17:35:07 [TRACE] root.igen-nj-wks01: eval: *terraform.EvalWriteState
2017/09/29 17:35:07 [TRACE] root.igen-nj-wks01: eval: *terraform.EvalWriteDiff
2017/09/29 17:35:07 [TRACE] root.igen-nj-wks01: eval: *terraform.EvalApplyPost
2017/09/29 17:35:07 [ERROR] root.igen-nj-wks01: eval: *terraform.EvalApplyPost, err: 1 error(s) occurred:

* vsphere_virtual_machine.wks: Post https://tp-nj-vc-01.mydom.corp/sdk: EOF
2017/09/29 17:35:07 [ERROR] root.igen-nj-wks01: eval: *terraform.EvalSequence, err: 1 error(s) occurred:

* vsphere_virtual_machine.wks: Post https://tp-nj-vc-01.mydom.corp/sdk: EOF
2017/09/29 17:35:07 [TRACE] [walkApply] Exiting eval tree: module.igen-nj-wks01.vsphere_virtual_machine.wks
2017/09/29 17:35:07 [TRACE] dag/walk: upstream errored, not walking "meta.count-boundary (count boundary fixup)"
2017/09/29 17:35:07 [TRACE] dag/walk: upstream errored, not walking "provider.vsphere (close)"
2017/09/29 17:35:07 [TRACE] dag/walk: upstream errored, not walking "root"
2017/09/29 17:35:07 [TRACE] Preserving existing state lineage "f6c9f055-1640-4d42-a6fe-ccc483faeeb9"
2017/09/29 17:35:07 [TRACE] Preserving existing state lineage "f6c9f055-1640-4d42-a6fe-ccc483faeeb9"
2017/09/29 17:35:07 [TRACE] Preserving existing state lineage "f6c9f055-1640-4d42-a6fe-ccc483faeeb9"
2017/09/29 17:35:07 [TRACE] Preserving existing state lineage "f6c9f055-1640-4d42-a6fe-ccc483faeeb9"
2017/09/29 17:35:07 [DEBUG] Uploading remote state to S3: {

rismoney commented 7 years ago

Additional information: This error pops up around 4.5 to 5 minutes into the run. By then the box is fully created and guest customization is underway (Win10 OS).

It does not happen on 0.1 of the provider, but does on every version thereafter.

vancluever commented 7 years ago

Hey @rismoney, thanks for the report.

Do you mind running your debug run with VSPHERE_CLIENT_DEBUG=1? After this there should be a .govmomi directory with a debug directory in it with all of the various client sessions - can you see if you can locate the last 1-000n.req.xml file and send me the result? You may need to sanitize it.

This should at least give us an idea of which POST call this is coming from.
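For anyone repeating this step, here is a rough helper for picking out the last request dump. It is only a sketch: the `~/.govmomi/debug` location and the `<number>-<sequence>.req.xml` naming are assumptions drawn from the description above, and `last_request_file` is a made-up name, not part of any tool.

```python
import re
from pathlib import Path
from typing import Optional

# Assumed filename shape, based on the "1-000n.req.xml" pattern mentioned above.
REQ_NAME = re.compile(r"^(\d+)-(\d+)\.req\.xml$")


def last_request_file(debug_dir: Path) -> Optional[Path]:
    """Return the request dump with the highest (leading number, sequence) pair."""
    if not debug_dir.is_dir():
        return None
    best_key = None
    best_path = None
    for path in debug_dir.iterdir():
        match = REQ_NAME.match(path.name)
        if match is None:
            continue  # skip responses and anything else in the directory
        key = (int(match.group(1)), int(match.group(2)))
        if best_key is None or key > best_key:
            best_key, best_path = key, path
    return best_path


if __name__ == "__main__":
    # Assumption: the dumps live under ~/.govmomi/debug; adjust if yours differ.
    found = last_request_file(Path.home() / ".govmomi" / "debug")
    print(found if found else "no request dumps found")
```

Comparing the numbers as integers rather than sorting the names as strings keeps the ordering right even if the sequence outgrows its zero padding.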

The DO provider issue is actually an issue with core, as @radeksimko has mentioned here, and is probably unrelated to what is going on here (unless you are using the split interpolation function). This looks more like a POST call getting an unexpected EOF from vSphere. With that said, I have to make the obligatory request that you check that everything is OK with your network and that there is no reason for the API connection to drop.

Let me know when you've located that API call, and thanks!

rismoney commented 7 years ago

I don't believe there is any problem with the network or permissions (the user currently has every right in vSphere). I am not using the split function at this time. This is vSphere 6.0.0 3634793.

Below are the contents of the last 3 XML request files from the .govmomi directory. Let me know if you need anything else or if there are steps I should take.

<?xml version="1.0" encoding="UTF-8"?>
<Envelope xmlns="http://schemas.xmlsoap.org/soap/envelope/"><Body><WaitForUpdatesEx xmlns="urn:vim25"><_this type="PropertyCollector">session[52ee23e8-7774-3b8d-3da6-dcbbb2051f1f]52fa8390-375f-f8a9-2145-0042acd0756b</_this><version>7</version></WaitForUpdatesEx></Body></Envelope>
<?xml version="1.0" encoding="UTF-8"?>
<Envelope xmlns="http://schemas.xmlsoap.org/soap/envelope/"><Body><WaitForUpdatesEx xmlns="urn:vim25"><_this type="PropertyCollector">session[52ee23e8-7774-3b8d-3da6-dcbbb2051f1f]52fa8390-375f-f8a9-2145-0042acd0756b</_this><version>8</version></WaitForUpdatesEx></Body></Envelope>

This is the very last one:


<?xml version="1.0" encoding="UTF-8"?>
<Envelope xmlns="http://schemas.xmlsoap.org/soap/envelope/"><Body><DestroyPropertyCollector xmlns="urn:vim25"><_this type="PropertyCollector">session[52ee23e8-7774-3b8d-3da6-dcbbb2051f1f]52fa8390-375f-f8a9-2145-0042acd0756b</_this></DestroyPropertyCollector></Body></Envelope>

rismoney commented 6 years ago

Let me know what the next steps are to identify where the problem is.

rismoney commented 6 years ago

https://gist.github.com/rismoney/bbc35c0df0205c3b8eba48d33c791641

vancluever commented 6 years ago

Hey @rismoney, I think we will just make the customization waiter timeout tunable (with the option to skip the waiter altogether). I have a commit almost ready, just smoke testing it before I put in a PR.

vancluever commented 6 years ago

Hey @rismoney, there is now a (possible) fix for this in #199. If you are able to, would you please try your issue again with a custom-built provider binary from the branch in the PR, to see whether that resolves the issue for you? This can help us confirm that the issue will be fixed in the next release.

You can disable the waiter by setting wait_for_customization_timeout to a zero or negative value - this will skip the event waiter completely, which should eliminate it as a source of problems. You can also set wait_for_guest_net to false to disable the networking waiter.
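To make the "zero or negative skips the waiter" behaviour concrete, here is an illustrative sketch in Python. It is not the provider's code (the provider is written in Go); `wait_for_customization`, `is_done`, and the use of seconds as the unit are all assumptions made for the example.

```python
import time
from typing import Callable


def wait_for_customization(is_done: Callable[[], bool],
                           timeout_seconds: float,
                           poll_seconds: float = 1.0) -> bool:
    """Hypothetical waiter with a tunable timeout.

    Returns True if it waited and saw completion, False if the wait was
    skipped. Raises TimeoutError if the deadline passes first.
    """
    if timeout_seconds <= 0:
        # Zero or negative: skip the waiter completely and never poll.
        return False
    deadline = time.monotonic() + timeout_seconds
    while time.monotonic() < deadline:
        if is_done():
            return True
        time.sleep(poll_seconds)
    raise TimeoutError("customization did not complete in time")
```

With a non-positive timeout the function returns without ever calling `is_done`, so nothing in the waiter can fail. That is the case being suggested for this issue.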

Thanks!

rismoney commented 6 years ago

I think I will need a Linux node to build the binary. I cannot get go install or make to work properly on Windows. I'm guessing the GOPATH path handling is the problem.

rismoney commented 6 years ago

Alright, so I added both of those wait-disabling settings and it exited cleanly. Since I use my own waiter mechanisms, this will work. I can certainly test other values/combinations, but as I mentioned before, on Windows the vSphere events for guest customization completion do not take into account a proper sysprep completion on the host, despite what vSphere reports. This is what led me to guest OS inspection based waiting.

I am stoked, because this now lets me take advantage of all the new changes since the provider split! Thank you, and excellent job managing the provider.

vancluever commented 6 years ago

Glad this worked out for you @rismoney!

Going to close this now as everything seems to be addressed :+1: enjoy the new features!