hashicorp / terraform-plugin-sdk

Terraform Plugin SDK enables building plugins (providers) to manage any service providers or custom in-house solutions
https://developer.hashicorp.com/terraform/plugin
Mozilla Public License 2.0

v0.12 regressions: StateFunc is ignored, resource always refreshes #164

Open invidian opened 5 years ago

invidian commented 5 years ago

Terraform Version

Terraform v0.12.2
+ provider.statefunc (unversioned)

Terraform Configuration Files

provider "statefunc" {}

resource "statefunc_test" "foo" {
  input = "foo"
}

output "foo" {
  value = "${statefunc_test.foo.result}"
}

Expected Behavior

Terraform resource is created once and never refreshes.

Actual Behavior

Resource refreshes on every Terraform run.

Steps to Reproduce

  1. Build the test provider from the code below.
  2. Use the Terraform configuration from the snippet above.
  3. Run terraform apply twice.

Reproducible provider code:

package main

import (
    "crypto/sha256"
    "fmt"

    "github.com/hashicorp/terraform/helper/schema"
    "github.com/hashicorp/terraform/plugin"
    "github.com/hashicorp/terraform/terraform"
)

func main() {
    plugin.Serve(&plugin.ServeOpts{
        ProviderFunc: Provider})
}

func Provider() terraform.ResourceProvider {
    return &schema.Provider{
        ResourcesMap: map[string]*schema.Resource{
            "statefunc_test": resourceTest(),
        },
    }
}

func resourceTest() *schema.Resource {
    return &schema.Resource{
        Create: resourceTestCreate,
        Read:   resourceTestRead,
        Delete: resourceTestDelete,

        Schema: map[string]*schema.Schema{
            "input": &schema.Schema{
                Type:      schema.TypeString,
                Required:  true,
                ForceNew:  true,
                StateFunc: sha256sum,
            },
            "result": &schema.Schema{
                Type:      schema.TypeString,
                Computed:  true,
                ForceNew:  true,
                StateFunc: sha256sum,
            },
        },
    }
}

func resourceTestCreate(d *schema.ResourceData, m interface{}) error {
    result := sha256sum(d.Get("input"))
    d.Set("result", d.Get("input"))
    d.SetId(result)

    return resourceTestRead(d, m)
}

func resourceTestRead(d *schema.ResourceData, m interface{}) error {
    return nil
}

func resourceTestDelete(d *schema.ResourceData, m interface{}) error {
    return nil
}

func sha256sum(data interface{}) string {
    return fmt.Sprintf("%x", sha256.Sum256([]byte(data.(string))))
}

Additional Context

It seems that there is a regression in v0.12.x in how SchemaStateFunc is applied. The examples in this report are compatible with Terraform v0.11.14; with that version the problem does not occur and StateFunc works as expected.

The actual problem is that, despite StateFunc being defined, the computed value ends up in the state file without StateFunc being applied, as described in hashicorp/terraform-plugin-sdk#163. This issue, however, is specifically about resources always being refreshed, which is very annoying.

If there is a workaround that can be implemented on the provider side, that would be great.

Here are two Terraform state files for additional context, the first written by v0.12.2 and the second by v0.11.14:

{
  "version": 4,
  "terraform_version": "0.12.2",
  "serial": 13,
  "lineage": "0aecbf07-f3b0-1d56-dae8-c65165388c0c",
  "outputs": {
    "foo": {
      "value": "foo",
      "type": "string"
    }
  },
  "resources": [
    {
      "mode": "managed",
      "type": "statefunc_test",
      "name": "foo",
      "provider": "provider.statefunc",
      "instances": [
        {
          "schema_version": 0,
          "attributes": {
            "id": "2c26b46b68ffc68ff99b453c1d30413413422d706483bfa0f98a5e886266e7ae",
            "input": "2c26b46b68ffc68ff99b453c1d30413413422d706483bfa0f98a5e886266e7ae",
            "result": "foo"
          },
          "private": "bnVsbA=="
        }
      ]
    }
  ]
}

{
    "version": 3,
    "terraform_version": "0.11.14",
    "serial": 1,
    "lineage": "0b90fab3-40c3-88ff-abbe-3d145632a59b",
    "modules": [
        {
            "path": [
                "root"
            ],
            "outputs": {
                "foo": {
                    "sensitive": false,
                    "type": "string",
                    "value": "foo"
                }
            },
            "resources": {
                "statefunc_test.foo": {
                    "type": "statefunc_test",
                    "depends_on": [],
                    "primary": {
                        "id": "2c26b46b68ffc68ff99b453c1d30413413422d706483bfa0f98a5e886266e7ae",
                        "attributes": {
                            "id": "2c26b46b68ffc68ff99b453c1d30413413422d706483bfa0f98a5e886266e7ae",
                            "input": "2c26b46b68ffc68ff99b453c1d30413413422d706483bfa0f98a5e886266e7ae",
                            "result": "foo"
                        },
                        "meta": {},
                        "tainted": false
                    },
                    "deposed": [],
                    "provider": "provider.statefunc"
                }
            },
            "depends_on": []
        }
    ]
}

References

Might be a duplicate of hashicorp/terraform-plugin-sdk#163, though that one addresses a slightly different issue.

apparentlymart commented 5 years ago

Hi @invidian!

Can you clarify what you mean by "refreshes" here? We expect all existing resources in the state to be refreshed for each new Terraform run, but I suspect you may mean something different by that word than I do.

invidian commented 5 years ago

Hi @apparentlymart

I'm sorry I didn't make it clear in the first place. Here is the terraform apply output on a subsequent run, first with v0.11.14 and then with v0.12.2, which shows the issue I'm having:

$ terraform version
Terraform v0.11.14
+ provider.statefunc (unversioned)

Your version of Terraform is out of date! The latest version
is 0.12.2. You can update by downloading from www.terraform.io/downloads.html

$ terraform apply
statefunc_test.foo: Refreshing state... (ID: 2c26b46b68ffc68ff99b453c1d30413413422d706483bfa0f98a5e886266e7ae)

Apply complete! Resources: 0 added, 0 changed, 0 destroyed.

Outputs:

foo = foo

$ /usr/bin/terraform version
Terraform v0.12.2
+ provider.statefunc (unversioned)

$ /usr/bin/terraform apply
statefunc_test.foo: Refreshing state... [id=2c26b46b68ffc68ff99b453c1d30413413422d706483bfa0f98a5e886266e7ae]

An execution plan has been generated and is shown below.
Resource actions are indicated with the following symbols:
-/+ destroy and then create replacement

Terraform will perform the following actions:

  # statefunc_test.foo must be replaced
-/+ resource "statefunc_test" "foo" {
      ~ id     = "2c26b46b68ffc68ff99b453c1d30413413422d706483bfa0f98a5e886266e7ae" -> (known after apply)
        input  = "2c26b46b68ffc68ff99b453c1d30413413422d706483bfa0f98a5e886266e7ae"
      ~ result = "foo" -> (known after apply) # forces replacement
    }

Plan: 1 to add, 0 to change, 1 to destroy.

Do you want to perform these actions?
  Terraform will perform the actions described above.
  Only 'yes' will be accepted to approve.

  Enter a value: yes

statefunc_test.foo: Destroying... [id=2c26b46b68ffc68ff99b453c1d30413413422d706483bfa0f98a5e886266e7ae]
statefunc_test.foo: Destruction complete after 0s
statefunc_test.foo: Creating...
statefunc_test.foo: Creation complete after 0s [id=2c26b46b68ffc68ff99b453c1d30413413422d706483bfa0f98a5e886266e7ae]

Apply complete! Resources: 1 added, 0 changed, 1 destroyed.

Outputs:

foo = foo

Basically, on every terraform apply run, the resource is recreated.

apparentlymart commented 5 years ago

Thanks for that extra context, @invidian!

It looks like something strange is happening in the SDK code when Terraform CLI asks the plugin to produce a plan: for some reason the plan has the result attribute set to an unknown value rather than retaining the prior value of "foo".

StateFunc is usually used for attributes that come from configuration rather than those that are computed by the provider; if the provider is generating it anyway then it might as well just write the SHA256 hash directly into the attribute rather than using a StateFunc to do it. We didn't intentionally stop that working in 0.12.0, but I think this is a case where we didn't know to even test it because this wasn't a usage pattern we anticipated. Would it be possible for you to just write the transformed result directly into the attribute value and not use a StateFunc at all, if you plan to fill a Computed-only attribute here?
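The suggestion above, writing the transformed value into the computed attribute directly, might look roughly like the following. This is a minimal sketch that avoids the SDK so it runs on its own: the `attrs` map and the `create` function are hypothetical stand-ins for `*schema.ResourceData` and `resourceTestCreate`, not SDK types.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// sha256sum returns the hex-encoded SHA-256 digest of s.
func sha256sum(s string) string {
	return fmt.Sprintf("%x", sha256.Sum256([]byte(s)))
}

// attrs is a hypothetical stand-in for *schema.ResourceData,
// used only so this sketch runs without the SDK.
type attrs map[string]string

// create mirrors the shape of resourceTestCreate from the report,
// but stores the digest in "result" itself instead of relying on
// a StateFunc to transform the value later.
func create(d attrs) {
	digest := sha256sum(d["input"])
	d["result"] = digest
	d["id"] = digest
}

func main() {
	d := attrs{"input": "foo"}
	create(d)
	// Prints the same digest that appears as the resource ID in the report.
	fmt.Println(d["result"])
}
```

In the provider from the report, the equivalent change would presumably be to call `d.Set("result", ...)` with the already-hashed string and drop `StateFunc` from the `result` schema.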

I'm not sure if it will be feasible to restore the prior behavior here for Computed attributes in particular, because I suspect it may be an unfortunate interaction with the Terraform 0.12 SDK compatibility layer (which translates the new protocol to the old SDK) where it doesn't have enough information to understand how to handle this. However, I will label it so that the team that maintains the SDK can find it and consider it.

Also worth noting that although we've preserved the StateFunc behavior for now for compatibility (though apparently not 100% compatibility :confounded:), I expect it will be phased out in a future SDK release, possibly in conjunction with ending support for Terraform 0.11.

Its original purpose was to avoid including multi-line strings in the terse Terraform plan output in prior versions, but the Terraform 0.12 plan output is designed to render multi-line strings intelligibly and so it's often better to just include the value as-is and let the user see it properly in the rendered diff.

When used with an attribute that is set in configuration, it also causes surprising behavior for users because the value they typed literally in configuration is not what they see in plan or when referring to the attribute in expressions elsewhere.

There are no specific settled plans to remove it or any specific deadline for doing so, but if you're not already using it I would suggest not adding new uses of it unless it's totally unavoidable, lest they need to be backed out again later.

invidian commented 5 years ago

Thanks for your explanation @apparentlymart, I really like reading your Terraform insights.

To give more context on my use of StateFunc: my provider, where I have the actual issue, generates an ASCII-armored GPG message, which is stored in a computed result field. The result can then be passed to another resource as content, such as an object store or a local file. This means that I need the actual value at runtime, but since the result value is reproducible, I'm able to calculate it on every Terraform run. Since those messages can be lengthy, if a user has many of them defined, they may significantly increase the size of the state file, hence the hash function, since I only care about when the value changes. I don't know if the size of the state file matters, but since there is an option to reduce it, why not use it.

I could actually even discard it completely from the state with a StateFunc like return "". Maybe for my use case, a data source would be a better choice?

apparentlymart commented 5 years ago

Hi @invidian! Thanks for sharing that additional context.

A problem with that scenario is that the value returned from a reference to an attribute is the value stored in the state, and so it too would see the hashed value and the clear GPG message would not be available for evaluation at all. There isn't currently any means to distinguish between the value used for expression evaluation and the expression stored in the state, and Terraform consistently uses only the state for evaluation because that ensures consistency between different circumstances where evaluation can happen.

For example, consider terraform console where the user can type in any arbitrary expression and expects to see the same result that would be returned during a normal Terraform operation. In that case Terraform is just reading the values directly from the state and the provider logic does not run at all, aside from an early call into the provider to fetch the resource type schemas.
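The point about references reading only from state can be illustrated with a toy model. This is not how Terraform is implemented; the `state` type and its `store` and `reference` methods are hypothetical names, and the sketch assumes only what is described above: whatever is persisted is also what expressions see.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// state is a toy stand-in for the attributes persisted for one
// resource instance. It is not a Terraform type.
type state map[string]string

// hashValue returns the hex-encoded SHA-256 digest of v.
func hashValue(v string) string {
	return fmt.Sprintf("%x", sha256.Sum256([]byte(v)))
}

// store persists an attribute. If a transform is given (playing the
// role of a StateFunc), only the transformed value is kept.
func (s state) store(key, value string, transform func(string) string) {
	if transform != nil {
		value = transform(value)
	}
	s[key] = value
}

// reference resolves an attribute the way an expression would in
// this model: from the stored state only, with no provider logic.
func (s state) reference(key string) string {
	return s[key]
}

func main() {
	msg := "-----BEGIN PGP MESSAGE-----"
	s := state{}
	s.store("result", msg, hashValue)

	// The original message is no longer recoverable from the state,
	// so a reference yields the hash instead.
	fmt.Println(s.reference("result") == msg) // false
}
```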

xanderflood commented 4 years ago

I'm also running into this issue. I'm building a proprietary provider and one of our resources has a large text blob field that is typically going to be loaded from a file. To conserve resources, I added a StateFunc to the field so that we would store an MD5 hash instead, but whenever I run a plan, Terraform detects that the massive text blob differs from the stored MD5 hash, and identifies a diff.

Is there currently any workaround for this? In my case the field in question needs to be ForceNew, so this is a pretty significant problem for us.

xanderflood commented 4 years ago

@apparentlymart for what it's worth, I just want to express my interest in y'all continuing support for StateFunc. Even if multi-line rendering is fixed, it's really useful for situations like ours where we're trying to use Terraform with an API intended for managing binary assets.

xanderflood commented 4 years ago

Just an update while I dig for a workaround here.

  1. I didn't notice previously that the StateFunc doesn't seem to be affecting the statefile at all: my statefile actually contains my entire base64-encoded binary even though I specified a StateFunc for that field to compute an MD5.
  2. When I define a CustomizeDiff callback on the resource, it receives both old and new values as raw unhashed data.
  3. When I define a DiffSuppressFunc callback on the field, it receives an unhashed old value (straight out of the statefile) but a hashed new value (which apparently was passed successfully through the StateFunc).

Here's my workaround for anybody running into the same issue:

  1. Add your StateFunc to the field.
  2. Add a DiffSuppressFunc to the same field. In it, apply the StateFunc to the old value and return whether the result equals the new value, so that the diff is accurate.

One issue with this workaround is that, when the underlying bug is eventually fixed, this fix will break. Make sure you have a test around it so that you'll catch that and can remove the patch when it's no longer needed.
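The workaround above could be sketched as follows. This is an illustration rather than the code from the provider in question: `hashState` and `suppressHashedDiff` are hypothetical names, and the comparison is written as a plain function so it runs without the SDK (a real DiffSuppressFunc also receives a `*schema.ResourceData` argument). The extra `old == new` check is an assumption added here so the function keeps working if the state ever starts holding the hashed value.

```go
package main

import (
	"crypto/md5"
	"fmt"
)

// hashState has the shape of a StateFunc: it maps the raw
// attribute value to the MD5 digest meant to be stored.
func hashState(v interface{}) string {
	return fmt.Sprintf("%x", md5.Sum([]byte(v.(string))))
}

// suppressHashedDiff holds the comparison that would go inside a
// DiffSuppressFunc. Per the observations above, old arrives
// unhashed from the state while new has already been hashed.
func suppressHashedDiff(k, old, new string) bool {
	// old == new covers the case where the state already holds the
	// hash (an assumption about behavior after a future fix).
	return old == new || hashState(old) == new
}

func main() {
	raw := "large text blob"

	// Same content: the diff is suppressed.
	fmt.Println(suppressHashedDiff("blob", raw, hashState(raw))) // true

	// Different content: the diff is reported.
	fmt.Println(suppressHashedDiff("blob", raw, hashState("other"))) // false
}
```

A test around the real callback is still worth having, as noted above, since the values passed to it may change once the underlying bug is fixed.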

xanderflood commented 4 years ago

Although I guess a better workaround might be to not use StateFunc at all, since it doesn't currently have any impact on what's stored in your statefile 🤷‍♀

ghost commented 3 years ago

I too seem to be seeing that the StateFunc has no effect at all. I'm trying to use it to prevent secrets from being stored in state. To compound the issue, the value I'm working with is a map(string).

ATTR_DATA: {
    Type:        schema.TypeMap,
    Elem: &schema.Schema{
        Type: schema.TypeString,
        StateFunc: func(i interface{}) string {
            return ""
        },
    },
    Required: true,
    ForceNew: true,
    Sensitive:   true,
},

"attributes": {
    "annotations": null,
    "data": {
        "foo": "bar"
    },

I'm not sure why this doesn't work.