kubernetes-retired / kubernetes-anywhere

[EOL] {concise,reliable,cross-platform} turnup of Kubernetes clusters
Apache License 2.0

make deploy fails on "* data.tls_cert_request.kubernetes-master: unexpected EOF" #312

Closed tphakala closed 6 years ago

tphakala commented 7 years ago

I have had one successful Kubernetes deployment on vSphere 6. I removed it with "make destroy" and tried to redeploy, but the deployment now fails with a TLS error. The configuration is still set to allow self-signed certificates.

container:/opt/kubernetes-anywhere> make clean
rm -rf .tmp
rm -rf phase3/.tmp
rm -rf phase1/gce/.tmp
rm -rf phase1/azure/.tmp
rm -rf phase1/vsphere/.tmp

container:/opt/kubernetes-anywhere> make deploy
util/config_to_json .config > .config.json
make do WHAT=deploy-cluster
make[1]: Entering directory '/opt/kubernetes-anywhere'
( cd "phase1/$(jq -r '.phase1.cloud_provider' .config.json)"; ./do deploy-cluster )
.tmp/vSphere-kubernetes.tf
data.template_file.cloudprovider: Refreshing state...
tls_private_key.kubernetes-root: Creating...
  algorithm: "" => "RSA"
  ecdsa_curve: "" => "P224"
  private_key_pem: "" => ""
  public_key_openssh: "" => ""
  public_key_pem: "" => ""
  rsa_bits: "" => "2048"
tls_private_key.kubernetes-node: Creating...
  algorithm: "" => "RSA"
  ecdsa_curve: "" => "P224"
  private_key_pem: "" => ""
  public_key_openssh: "" => ""
  public_key_pem: "" => ""
  rsa_bits: "" => "2048"
tls_private_key.kubernetes-master: Creating...
  algorithm: "" => "RSA"
  ecdsa_curve: "" => "P224"
  private_key_pem: "" => ""
  public_key_openssh: "" => ""
  public_key_pem: "" => ""
  rsa_bits: "" => "2048"
tls_private_key.kubernetes-admin: Creating...
  algorithm: "" => "RSA"
  ecdsa_curve: "" => "P224"
  private_key_pem: "" => ""
  public_key_openssh: "" => ""
  public_key_pem: "" => ""
  rsa_bits: "" => "2048"
vsphere_folder.cluster_folder: Creating...
  datacenter: "" => "Derby"
  existing_path: "" => ""
  path: "" => "kubernetes"
vsphere_folder.cluster_folder: Creation complete
vsphere_virtual_machine.kubevm1: Creating...
..
..
vsphere_virtual_machine.kubevm5: Still creating... (2m10s elapsed)
vsphere_virtual_machine.kubevm4: Still creating... (2m10s elapsed)
vsphere_virtual_machine.kubevm2: Still creating... (2m10s elapsed)
vsphere_virtual_machine.kubevm5: Still creating... (2m20s elapsed)
vsphere_virtual_machine.kubevm4: Still creating... (2m20s elapsed)
vsphere_virtual_machine.kubevm2: Still creating... (2m20s elapsed)
vsphere_virtual_machine.kubevm2: Creation complete
vsphere_virtual_machine.kubevm4: Creation complete
vsphere_virtual_machine.kubevm5: Creation complete
Error applying plan:

1 error(s) occurred:

Terraform does not automatically rollback in the face of errors. Instead, your Terraform state file has been partially updated with any resources that successfully completed. Please address the error above and apply again to incrementally change your infrastructure.
Makefile:63: recipe for target 'do' failed
make[1]: [do] Error 1
make[1]: Leaving directory '/opt/kubernetes-anywhere'
Makefile:41: recipe for target 'deploy-cluster' failed
make: [deploy-cluster] Error 2

adtrsa commented 7 years ago

I encountered the exact same error (data.tls_cert_request... : unexpected EOF) with the self-signed host certificate option enabled.

The VMs are created, but cluster setup is incomplete.

I am using vSphere 6.0 and the latest version of kubernetes-anywhere.

adtrsa commented 7 years ago

vsphere_virtual_machine.kubevm2: Creation complete
Error applying plan:

1 error(s) occurred:

Terraform does not automatically rollback in the face of errors. Instead, your Terraform state file has been partially updated with any resources that successfully completed. Please address the error above and apply again to incrementally change your infrastructure.
panic: interface conversion: interface is nil, not string
2017/01/17 13:54:48 [DEBUG] plugin: terraform:
2017/01/17 13:54:48 [DEBUG] plugin: terraform: goroutine 103 [running]:
2017/01/17 13:54:48 [DEBUG] plugin: terraform: panic(0x2876840, 0xc42045b2c0)
2017/01/17 13:54:48 [DEBUG] plugin: terraform: /opt/go/src/runtime/panic.go:500 +0x1a1
2017/01/17 13:54:48 [DEBUG] plugin: terraform: github.com/hashicorp/terraform/builtin/providers/tls.ReadCertRequest(0xc420428420, 0x0, 0x0, 0x28, 0xc4204308f1)
2017/01/17 13:54:48 [DEBUG] plugin: terraform: /opt/gopath/src/github.com/hashicorp/terraform/builtin/providers/tls/data_source_cert_request.go:94 +0x448
2017/01/17 13:54:48 [DEBUG] plugin: terraform: github.com/hashicorp/terraform/helper/schema.(Resource).ReadDataApply(0xc420316ba0, 0xc420614100, 0x0, 0x0, 0xc42031acb8, 0x1, 0x18)
2017/01/17 13:54:48 [DEBUG] plugin: terraform: /opt/gopath/src/github.com/hashicorp/terraform/helper/schema/resource.go:207 +0xda
2017/01/17 13:54:48 [DEBUG] plugin: terraform: github.com/hashicorp/terraform/helper/schema.(Provider).ReadDataApply(0xc4203beb70, 0xc42045a100, 0xc420614100, 0x0, 0x0, 0x0)
2017/01/17 13:54:48 [DEBUG] plugin: terraform: /opt/gopath/src/github.com/hashicorp/terraform/helper/schema/provider.go:315 +0x91
2017/01/17 13:54:48 [DEBUG] plugin: terraform: github.com/hashicorp/terraform/plugin.(ResourceProviderServer).ReadDataApply(0xc4203197c0, 0xc42049c0b0, 0xc42049c320, 0x0, 0x0)
2017/01/17 13:54:48 [DEBUG] plugin: terraform: /opt/gopath/src/github.com/hashicorp/terraform/plugin/resource_provider.go:537 +0x4e
2017/01/17 13:54:48 [DEBUG] plugin: terraform: reflect.Value.call(0xc4203b45a0, 0xc4203d8088, 0x13, 0x2da4800, 0x4, 0xc4204a1ed0, 0x3, 0x3, 0xc, 0xc, ...)
2017/01/17 13:54:48 [DEBUG] plugin: terraform: /opt/go/src/reflect/value.go:434 +0x5c8
2017/01/17 13:54:48 [DEBUG] plugin: terraform: reflect.Value.Call(0xc4203b45a0, 0xc4203d8088, 0x13, 0xc4204a1ed0, 0x3, 0x3, 0x180002, 0x0, 0x0)
2017/01/17 13:54:48 [DEBUG] plugin: terraform: /opt/go/src/reflect/value.go:302 +0xa4
2017/01/17 13:54:48 [DEBUG] plugin: terraform: net/rpc.(service).call(0xc420458b00, 0xc420458ac0, 0xc420430d38, 0xc4203ce400, 0xc4201fab40, 0x255f880, 0xc42049c0b0, 0x16, 0x255f8c0, 0xc42049c320, ...)
2017/01/17 13:54:48 [DEBUG] plugin: terraform: /opt/go/src/net/rpc/server.go:383 +0x148
2017/01/17 13:54:48 [DEBUG] plugin: terraform: created by net/rpc.(Server).ServeCodec
2017/01/17 13:54:48 [DEBUG] plugin: terraform: /opt/go/src/net/rpc/server.go:477 +0x421
2017/01/17 13:54:48 [DEBUG] plugin: /bin/terraform: plugin process exited
2017/01/17 13:54:48 [ERROR] root: eval: terraform.EvalReadDataApply, err: data.tls_cert_request.cluster-master: unexpected EOF
2017/01/17 13:54:48 [ERROR] root: eval: terraform.EvalSequence, err: data.tls_cert_request.cluster-master: unexpected EOF
2017/01/17 13:54:48 [ERROR] root: eval: terraform.EvalOpFilter, err: data.tls_cert_request.cluster-master: unexpected EOF
2017/01/17 13:54:48 [ERROR] root: eval: *terraform.EvalSequence, err: data.tls_cert_request
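
For readers unfamiliar with Go: the "panic: interface conversion: interface is nil, not string" line in the trace above is the message the Go runtime produces when an unchecked type assertion to string is applied to a nil interface value. The sketch below only illustrates that failure mode and the "comma-ok" form that avoids it. It is not the Terraform TLS provider's code; the helper names and the map keys are hypothetical, chosen for illustration.

```go
package main

import "fmt"

// uncheckedString mimics the failure mode: v.(string) panics when the
// interface value is nil or holds a non-string. The deferred recover turns
// the panic into an error so this example can keep running.
// (Hypothetical helper, not taken from Terraform.)
func uncheckedString(attrs map[string]interface{}, key string) (s string, err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("recovered panic: %v", r)
		}
	}()
	return attrs[key].(string), nil
}

// checkedString uses the comma-ok form of the type assertion, which never
// panics: ok is false when the value is nil or not a string.
// (Hypothetical helper, not taken from Terraform.)
func checkedString(attrs map[string]interface{}, key string) (string, error) {
	s, ok := attrs[key].(string)
	if !ok {
		return "", fmt.Errorf("attribute %q is missing or not a string", key)
	}
	return s, nil
}

func main() {
	// Illustrative attribute map; one key is set, the other is absent (nil).
	attrs := map[string]interface{}{"present": "RSA"}

	_, err := uncheckedString(attrs, "absent")
	fmt.Println("unchecked:", err)

	_, err = checkedString(attrs, "absent")
	fmt.Println("checked:", err)

	s, _ := checkedString(attrs, "present")
	fmt.Println("present:", s)
}
```

In a plugin process without a recover, the unchecked form terminates the process, which would be consistent with the "plugin process exited" and "unexpected EOF" lines that follow the panic in the log.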

adtrsa commented 7 years ago

I am still getting this crash after updating to the latest master and using Kubernetes v1.4.8 (as recommended for the vSphere provider). If the self-signed option is selected for the host, why is data.tls_cert_request.kubernetes-master necessary? Or does this happen when an attempt is made to retrieve the master cert? I am currently resorting to running Terraform 0.7.2 under dlv to try to debug the problem.

stefanhdao commented 7 years ago

I solved this by deleting the VM from the OVA template and re-adding it to vSphere. I am not sure why that made a difference.

abrarshivani commented 7 years ago

@adtrsa @stefanhdao @tphakala Were you able to destroy the cluster successfully?

stefanhdao commented 7 years ago

@abrarshivani Here and there. Sometimes running make destroy destroyed the cluster successfully. Other times I had to intervene and remove the VMs from vSphere myself before make destroy would succeed.

abrarshivani commented 7 years ago

@stefanhdao Yes, that is a known issue. Did you hit this issue after manually deleting the VMs, or after make destroy had worked fine?

stefanhdao commented 7 years ago

@abrarshivani I don't remember the exact buildup to this specific issue. It happened after several iterations of make destroy, some of which required manually deleting the VMs and some of which completed all the way through.

wkrapohl commented 7 years ago

Any updates on this issue? I can't seem to get past this problem using kubernetes-anywhere.

alejdg commented 7 years ago

I'm in the same situation as @wkrapohl. I tried all the alternatives mentioned here, but none of them made the deploy work.

aaron-comyn commented 7 years ago

Same issue here.

This error message came up due to improper DHCP configuration.

fejta-bot commented 6 years ago

Issues go stale after 90d of inactivity. Mark the issue as fresh with /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close.

Prevent issues from auto-closing with an /lifecycle frozen comment.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or @fejta.
/lifecycle stale

fejta-bot commented 6 years ago

Stale issues rot after 30d of inactivity. Mark the issue as fresh with /remove-lifecycle rotten. Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten
/remove-lifecycle stale

phimic commented 6 years ago

Any updates on this issue? I have never been able to build a Kubernetes cluster with kubernetes-anywhere. My error is:

tls_cert_request.kubernetes-master: 1 error(s) occurred:
tls_cert_request.kubernetes-master: unexpected EOF

VMware vSphere 6U3 with latest VCSA build (7462485)

fejta-bot commented 6 years ago

Rotten issues close after 30d of inactivity. Reopen the issue with /reopen. Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close