hashicorp / terraform

Terraform enables you to safely and predictably create, change, and improve infrastructure. It is a source-available tool that codifies APIs into declarative configuration files that can be shared amongst team members, treated as code, edited, reviewed, and versioned.
https://www.terraform.io/
Other
42.53k stars 9.52k forks source link

Habitat Provisioner crashes when using object with boolean on accept_license var #23784

Closed kmott closed 4 years ago

kmott commented 4 years ago

Terraform Version

Currently running terraform v0.12.18.

Terraform Configuration Files

//
// This is my main.tf with my module consumer
//
module "dummy" {
  source = "../../modules/vsphere-machine"

  vsphere = { ... }

  machine = { ... }

  habitat = {
    accept_license     = true
    version            = "0.79.1"
    peers              = []
    hab_ssh_username   = "root"
    hab_ssh_password   = "bacon"
    use_sudo           = false
    permanent_peer     = false
    [ ... ]
  }
}

//
// Module definition (variable)
//
variable "habitat" {
  type = object({
    accept_license = bool
    version = string
    peers = list(string)
    hab_ssh_username = string
    hab_ssh_password = string
    use_sudo = bool
    permanent_peer = bool
    [ ... ]
  })
}

//
// Module definition (machine objects)
//
resource "vsphere_virtual_machine" "vm" {
  count = var.machine.count
  [ ... ]

  provisioner "habitat" {
    accept_license = var.habitat.accept_license
    version = var.habitat.version
    peers = var.habitat.peers
    use_sudo = var.habitat.use_sudo
    permanent_peer = var.habitat.permanent_peer
    [ ... ]
  }
}

Debug Output

I can provide trace debug output via direct email.

Crash Output

I can provide crash log output via direct email.

Expected Behavior

Terraform should not crash.

Actual Behavior

Terraform crashed and was unable to plan or apply.

Note that I can work around the issue by hard-coding accept_license to true in the provisioner block.

Change this:

  provisioner "habitat" {
    accept_license = var.habitat.accept_license
    version = var.habitat.version
    peers = var.habitat.peers
    [ ... ]
  }

To this:

provisioner "habitat" {
    accept_license = true
    version = var.habitat.version
    peers = var.habitat.peers
    [ ... ]
  }

And terraform plan / apply works just fine.

Steps to Reproduce

tf plan with an object containing a bool for a Habitat Provisioner accept_license attribute set to true.

Additional Context

Error: rpc error: code = Unavailable desc = transport is closing

panic: interface conversion: interface {} is string, not bool
2020-01-04T16:31:41.712-0800 [DEBUG] plugin.terraform-0.12.18: 
2020-01-04T16:31:41.712-0800 [DEBUG] plugin.terraform-0.12.18: goroutine 14 [running]:
2020-01-04T16:31:41.712-0800 [DEBUG] plugin.terraform-0.12.18: github.com/hashicorp/terraform/builtin/provisioners/habitat.validateFn(0xc000463500, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0)
2020-01-04T16:31:41.712-0800 [DEBUG] plugin.terraform-0.12.18:  /opt/teamcity-agent/work/9e329aa031982669/src/github.com/hashicorp/terraform/builtin/provisioners/habitat/resource_provisioner.go:372 +0xae2
2020-01-04T16:31:41.712-0800 [DEBUG] plugin.terraform-0.12.18: github.com/hashicorp/terraform/helper/schema.(*Provisioner).Validate(0xc0000b40a0, 0xc000463500, 0x1b51b00, 0xc0004634a0, 0xc000142250, 0xc000463500, 0xc0005d40b8, 0x1b51b00)
2020-01-04T16:31:41.712-0800 [DEBUG] plugin.terraform-0.12.18:  /opt/teamcity-agent/work/9e329aa031982669/src/github.com/hashicorp/terraform/helper/schema/provisioner.go:199 +0x25b
2020-01-04T16:31:41.712-0800 [DEBUG] plugin.terraform-0.12.18: github.com/hashicorp/terraform/helper/plugin.(*GRPCProvisionerServer).ValidateProvisionerConfig(0xc000466010, 0x2364220, 0xc000462210, 0xc000462240, 0xc000466010, 0xc000462210, 0xc0005c0ba8)
2020-01-04T16:31:41.712-0800 [DEBUG] plugin.terraform-0.12.18:  /opt/teamcity-agent/work/9e329aa031982669/src/github.com/hashicorp/terraform/helper/plugin/grpc_provisioner.go:58 +0x188
2020-01-04T16:31:41.712-0800 [DEBUG] plugin.terraform-0.12.18: github.com/hashicorp/terraform/internal/tfplugin5._Provisioner_ValidateProvisionerConfig_Handler(0x1c23060, 0xc000466010, 0x2364220, 0xc000462210, 0xc0005da190, 0x0, 0x2364220, 0xc000462210, 0xc00013e3c0, 0x1c4)
2020-01-04T16:31:41.712-0800 [DEBUG] plugin.terraform-0.12.18:  /opt/teamcity-agent/work/9e329aa031982669/src/github.com/hashicorp/terraform/internal/tfplugin5/tfplugin5.pb.go:3442 +0x23e
2020-01-04T16:31:41.712-0800 [DEBUG] plugin.terraform-0.12.18: google.golang.org/grpc.(*Server).processUnaryRPC(0xc00056c300, 0x2385540, 0xc000546600, 0xc00013c000, 0xc00048c450, 0x34bc178, 0x0, 0x0, 0x0)
2020-01-04T16:31:41.712-0800 [DEBUG] plugin.terraform-0.12.18:  /opt/teamcity-agent/work/9e329aa031982669/pkg/mod/google.golang.org/grpc@v1.21.1/server.go:998 +0x470
2020-01-04T16:31:41.712-0800 [DEBUG] plugin.terraform-0.12.18: google.golang.org/grpc.(*Server).handleStream(0xc00056c300, 0x2385540, 0xc000546600, 0xc00013c000, 0x0)
2020-01-04T16:31:41.712-0800 [DEBUG] plugin.terraform-0.12.18:  /opt/teamcity-agent/work/9e329aa031982669/pkg/mod/google.golang.org/grpc@v1.21.1/server.go:1278 +0xda6
2020-01-04T16:31:41.712-0800 [DEBUG] plugin.terraform-0.12.18: google.golang.org/grpc.(*Server).serveStreams.func1.1(0xc0005d6350, 0xc00056c300, 0x2385540, 0xc000546600, 0xc00013c000)
2020-01-04T16:31:41.712-0800 [DEBUG] plugin.terraform-0.12.18:  /opt/teamcity-agent/work/9e329aa031982669/pkg/mod/google.golang.org/grpc@v1.21.1/server.go:717 +0x9f
2020-01-04T16:31:41.712-0800 [DEBUG] plugin.terraform-0.12.18: created by google.golang.org/grpc.(*Server).serveStreams.func1
2020-01-04T16:31:41.712-0800 [DEBUG] plugin.terraform-0.12.18:  /opt/teamcity-agent/work/9e329aa031982669/pkg/mod/google.golang.org/grpc@v1.21.1/server.go:715 +0xa1
2020-01-04T16:31:41.712-0800 [DEBUG] plugin.terraform-0.12.18: 
2020/01/04 16:31:41 [WARN] module.dummy: eval: *terraform.EvalValidateProvisioner, non-fatal err: rpc error: code = Unavailable desc = transport is closing
2020/01/04 16:31:41 [ERROR] module.dummy: eval: *terraform.EvalSequence, err: rpc error: code = Unavailable desc = transport is closing
2020/01/04 16:31:41 [TRACE] [walkValidate] Exiting eval tree: module.dummy.vsphere_virtual_machine.vm
2020/01/04 16:31:41 [TRACE] vertex "module.dummy.vsphere_virtual_machine.vm": visit complete
2020/01/04 16:31:41 [TRACE] dag/walk: upstream of "provider.vsphere (close)" errored, so skipping
2020/01/04 16:31:41 [TRACE] dag/walk: upstream of "meta.count-boundary (EachMode fixup)" errored, so skipping
2020/01/04 16:31:41 [TRACE] dag/walk: upstream of "provisioner.habitat (close)" errored, so skipping
2020/01/04 16:31:41 [TRACE] dag/walk: upstream of "root" errored, so skipping
2020/01/04 16:31:41 [TRACE] statemgr.Filesystem: removing lock metadata file .terraform.tfstate.lock.info
2020/01/04 16:31:41 [TRACE] statemgr.Filesystem: unlocking terraform.tfstate using fcntl flock

References

None that I'm aware of.

kmott commented 4 years ago

I worked around the issue by hard-coding the value to true, and ran a few more tests, but it fails to re-converge if the Habitat service is already running on the machine. I am working on a quick patch for that, and should have it ready shortly.

kmott commented 4 years ago

I have a partial fix in place (at least for the failed run part) here: https://github.com/hashicorp/terraform/pull/23805

I am at a bit of a loss on the other parts (crash in resource_provisioner.go:372) though.

kmott commented 4 years ago

@pselle, would you mind helping me out with this issue? I cleaned up the previous set of commits (sorry it was a bit of a mess, it should be okay now though).

I don't know how to address the crasher above though in resource_provisioner.go:372--do I need to do a different cast of some sort at that line?

if ok && !acceptLicense.(bool) {

pselle commented 4 years ago

[caveat, going by what I'm reading here, but have not run this]

If we're getting a crash in the type casting, a Go-y option would be to do type checking a la:

if accept, ok := acceptLicense.(bool); ok { // use accept here if you need to, or assign it to _ if you don't need it

Hope this might help? I'll try and keep an eye on this issue/PR. Side context is that we're now working on 0.13 on master, but can cherry-pick critical fixes (and since this is a crash, I think I can advocate for it) to go into 0.12 patch releases. But if we're slow to respond, it's because further 0.12 work is lower in priority than 0.13 work. I'm sharing this in case it helps you prioritize what you're considering contributing in the next month or two :)

kmott commented 4 years ago

@pselle I made a slight adjustment, checking for the hcl2shim value (which is what was causing the crash), but now, when I go to deploy a newer version of Habitat but I do not set accept_license = true, the Terraform run pauses forever, waiting for user input (which I think was what the original author was trying to avoid by adding the acceptLicense check in validateFn, and failing early if it's not set to true).

@mudash, do you have any suggestions on how to workaround/fix this?

danieldreier commented 4 years ago

I'm closing this issue because we announced tool-specific (vendor or 3rd-party) provisioner deprecation in mid-September 2020. Additionally, we added a deprecation notice for tool-specific provisioners in 0.13.4. On a practical level this means we will no longer be reviewing or merging PRs for these built-in plugins.

The discuss post linked above explains this in more depth, but the basic reason we're making this change is that these vendor provisioners have been extremely challenging for us to maintain, and are a weak spot in the terraform user experience. People reach for them not realizing the bugs and UX limitations, and they're areas that are difficult for us to maintain because of the huge surface area of integrating with a bunch of different tools (Puppet, Chef, Salt, Habitat, etc) that each require deep domain knowledge to do right. For example, testing each of these against all the versions of those tools, on multiple platforms, is prohibitive, and so we don't - but users have a reasonable expectation that everything in the Terraform Core codebase is well tested.

For the time being, the best option if you want to see this built, is to build a standalone habitat provisioner, fix this in it, and distribute it as a plugin binary, similar to how the ansible provisioner is distributed.

I'm aware of the limitations of this approach, but it's the best option compared to coupling provisioner development to the Terraform Core release lifecycle. We believe the benefit to users of having provisioner development decoupled from core, exceeds the convenience of having these provisioners built in to core. We want to provide a better user experience in the future, and our hope here is that the ability to improve, fix and repair provisioners without us blocking their development, much like providers, will help make a strong case for what's next.

I think it’s also important to highlight that we have no plans to remove the generic provisioners or the pluggable functionality during Terraform's 1.0 lifecycle.

I appreciate your input here to improve Terraform, and am always happy to talk. Please feel free to reach out to me or Petros Kolyvas if you would like to talk more about this change.

ghost commented 3 years ago

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.