zclconf / go-cty

A type system for dynamic values in Go applications
MIT License
339 stars 70 forks source link

Converting cty.NilVal to cty.String panics #43

Closed azr closed 4 years ago

azr commented 4 years ago

Hello there ! We had this issue in packer: https://github.com/hashicorp/packer/issues/8730

Basically, the default value of a variable is not set but its type is and when a user tries to access that variable without setting it, a panic occurs.

This was actually tested in the Packer codebase and successfully errored with the following error:

Invalid template interpolation value; The expression result is null. Cannot include a null value in a string template.

But this panics 'in real life' because packer uses "github.com/zclconf/go-cty/cty/json".Marshal(value, cty.DynamicPseudoType)

I will try to open an PR to fix this, but I might fix something in the wrong place.

Here's a stack trace: ``` github.com/hashicorp/packer/vendor/github.com/zclconf/go-cty/cty.Type.Equals(0x0, 0x0, 0x84b1000, 0xc0004cd8f8, 0x41ccfd0) /Users/azr/go/src/github.com/hashicorp/packer/vendor/github.com/zclconf/go-cty/cty/type.go:42 +0xf5 github.com/hashicorp/packer/vendor/github.com/zclconf/go-cty/cty/convert.Convert(0x0, 0x0, 0x0, 0x0, 0x84b1000, 0xc0004cd8f8, 0x0, 0x0, 0x0, 0xcc74800, ...) /Users/azr/go/src/github.com/hashicorp/packer/vendor/github.com/zclconf/go-cty/cty/convert/public.go:43 +0x53 github.com/hashicorp/packer/vendor/github.com/hashicorp/hcl/v2/hcldec.(*AttrSpec).decode(0xc0005c88a0, 0xc0003fcfc0, 0x0, 0x0, 0x0, 0xc00000f7a0, 0x0, 0x0, 0x0, 0x0, ...) /Users/azr/go/src/github.com/hashicorp/packer/vendor/github.com/hashicorp/hcl/v2/hcldec/spec.go:209 +0x508 github.com/hashicorp/packer/vendor/github.com/hashicorp/hcl/v2/hcldec.ObjectSpec.decode(0xc0001870b0, 0xc0003fcfc0, 0x0, 0x0, 0x0, 0xc00000f7a0, 0xc000618df0, 0x402fd21, 0x7eafba8, 0xc000618df0, ...) /Users/azr/go/src/github.com/hashicorp/packer/vendor/github.com/hashicorp/hcl/v2/hcldec/spec.go:80 +0x238 github.com/hashicorp/packer/vendor/github.com/hashicorp/hcl/v2/hcldec.decode(0x84b0240, 0xc0001b46e0, 0x0, 0x0, 0x0, 0xc00000f7a0, 0x84b0f00, 0xc0001870b0, 0x49d8e00, 0x8482000, ...) /Users/azr/go/src/github.com/hashicorp/packer/vendor/github.com/hashicorp/hcl/v2/hcldec/decode.go:21 +0x118 github.com/hashicorp/packer/vendor/github.com/hashicorp/hcl/v2/hcldec.Decode(...) /Users/azr/go/src/github.com/hashicorp/packer/vendor/github.com/hashicorp/hcl/v2/hcldec/public.go:15 github.com/hashicorp/packer/hcl2template.decodeHCL2Spec(0x84b0240, 0xc0001b46e0, 0xc00000f7a0, 0xce6f290, 0xc0002a6780, 0x0, 0x0, 0x0, 0xc000563800, 0xcc01b28, ...) /Users/azr/go/src/github.com/hashicorp/packer/hcl2template/decode.go:17 +0xa0 github.com/hashicorp/packer/hcl2template.(*Parser).startBuilder(0xc000619370, 0xc00020a9f0, 0xc00000f7a0, 0xc000228108, 0x1, 0x0, 0x0, 0x0, 0x0, 0x0, ...) /Users/azr/go/src/github.com/hashicorp/packer/hcl2template/types.source.go:54 +0x2e4 github.com/hashicorp/packer/hcl2template.(*Parser).getBuilds(0xc000619370, 0xc0000beb00, 0x0, 0x0, 0xc0000beb00, 0x0, 0x0, 0x0) /Users/azr/go/src/github.com/hashicorp/packer/hcl2template/types.packer_config.go:102 +0x1d5 github.com/hashicorp/packer/hcl2template.(*Parser).Parse(0xc000619370, 0x7ffeefbff6f1, 0x2e, 0x0, 0xc0002e62a8, 0xc0002e6310, 0x38, 0x40, 0xc0000bea00, 0xc00013bda0) /Users/azr/go/src/github.com/hashicorp/packer/hcl2template/types.packer_config.go:150 +0xd1 github.com/hashicorp/packer/command.(*BuildCommand).GetBuildsFromHCL(0xc000186e70, 0x7ffeefbff6f1, 0x2e, 0x0, 0x0, 0xc000563800, 0x1) /Users/azr/go/src/github.com/hashicorp/packer/command/build.go:107 +0x123 github.com/hashicorp/packer/command.(*BuildCommand).GetBuilds(0xc000186e70, 0x7ffeefbff6f1, 0x2e, 0x1, 0x1, 0x7fffffffffffffff, 0x0) /Users/azr/go/src/github.com/hashicorp/packer/command/build.go:136 +0x543 ```

Edit: I tried introducing Marshaling there and there and it did not trigger the panic.

apparentlymart commented 4 years ago

Hi @azr,

It's intentional that cty.NilVal and cty.NilType are not valid and will panic on any use; they represent the total absense of a value/type in the Go space, rather than a null in cty space, and so this is like calling a method on a nil interface in Go, or working with a zero reflect.Value in Go's own reflect package.

I think the bug here is in the HCL layer instead, but I'm not entirely sure where. The hcldec attribute spec seems to be checking for an unset attribute and returning a proper cty.NullVal (rather than the zero value), but the stack trace shows it managing to reach the convert.Convert call below, so it seems like this is a weird edge case where the attr.Expr.Value method is returning cty.NilVal.

The contract for Expression.Value is that it should always return a valid value, even on error, to allow for this sort of error recovery behavior. I'm honestly not sure why we'd even get to that when the attribute isn't set at all, so the bug might be further upstream, but at the very least it's incorrect for that Expression.Value call to be returning cty.NilVal... if it's totally failing and unable to produce any sort of result then it should at least return cty.DynamicVal to represent "I don't know".

Hopefully the above is a useful starting point! I'm not sure where exactly the issue is but I'm happy to give some more pointers if needed. Might be worth opening in issue for this over in the HCL repository so we can discuss it more over there.

azr commented 4 years ago

Hey Martin ! Thanks for the explanation.

You are right, this variable code that will return cty.NilVal, &hcl.Diagnostic{... when a variable is not set by a user. This totally felt like a return nil, error and felt correct. I wanted it to cause an error if a user used it.

This was handy on the unit testing side because it would be silent when a variable was unused and using the variable would error: The expression result is null. Cannot include a null value in a string template. with a link to variable definition usage.

But a Packer run, tries to get as far as possible, which caused this panic.

I changed it to return cty.DynamicVal, but now if a user uses the variable the error is weird and doesn't pinpoints to exactly where the problem happens:

Error: failed to read dynamic type descriptor value: invalid character ',' looking for beginning of value
[00] 
[00]   on ../.vscode/hcl2/crash/var_with_no_default.pkr.hcl line 18, in source "null" "null-builder":
[00]   18: source "null" "null-builder" {

I thought this was really missleading for users, so I tried something else


I added a piece of code that checks that all variables are set, but then this is also not the best because a user can define an unused variable but must set it.

Trying to look for a better option. For now my thinking is that it would be super nice if the hcl.EvalContext allowed to be more dynamic, for example:

type EvalContext interface {
    VariableValue(name string) (cty.Value, hcl.Diagnostics)
    Function(name string) (function.Function, hcl.Diagnostics)
    Parent()    *EvalContext
} 

But this is more of an hcl topic :D

apparentlymart commented 4 years ago

Hi @azr,

If you want to represent the absence of a value in "userspace" (that is, representing something absent in the space of the configuration language rather than in the space of the Go program implementing it) then you could use cty.NullVal(cty.DynamicPseudoType) or cty.NullVal(cty.String) (depending on how much type information you have) instead.

If you do that, then the convert.Convert call will succeed and convert the null to the requested type (null is a valid value of any type, so it will always succeed) and then either the calling Go code could handle it, or it could let that null value propagate into the HCL EvalContext where HCL should be able to handle it and produce a suitable user-facing error saying that a null value isn't permitted.

The main thing here is that an hcl.EvalContext should never have a cty.NilVal in it. To represent something that exists but has no value, use cty.NullVal instead. cty.DynamicVal instead represents that something exists and its value/type are not yet known, which activates a different set of behaviors which I think are unnecessary in Packer because the final configuration values do not depend on external side-effects as they do in Terraform.


As some additional context, here's how Terraform deals with this problem:

By default, a Terraform variable is required and so Terraform would emit an error if it isn't set. An author can make a variable optional by providing a default value:

variable "example" {
  type    = string
  default = "Hello"
}

In some cases there isn't any reasonable default value, and the module simply wants to detect whether or not the value is set. In that case, the author can specify the default as null to say that the variable is optional but it has no default:

variable "example" {
  type    = string
  default = null
}

When preparing the object var to go in the EvalContext, Terraform looks to see if the caller of the module explicitly provided a value for the variable, and if not it uses the default instead. In the second example above where the default is null, evaluating that default expression produces cty.NullVal(cty.DynamicPseudoType) and then, due to the type constraint string, terraform calls convert.Convert(cty.NullVal(cty.DynamicPseudoType), cty.String) and ends up with cty.NullVal(cty.String).

If the module elsewhere contains an expression like "${var.example}-moo" then HCL will automatically detect that var.example is a cty null (as opposed to the zero value cty.NilVal) and produce a user-friendly error saying that null isn't valid for string concatenation, rather than crashing.

azr commented 4 years ago

Howdy @apparentlymart many thanks for explaining this; I superconfused myself reading the docs over and now reading your comment it's super clear ๐Ÿ‘ ๐Ÿ‘ ๐Ÿ‘ ๐Ÿ‘ , I think this can be closed.