hashicorp / terraform-plugin-testing

Module for testing Terraform providers
Mozilla Public License 2.0
45 stars 11 forks source link

Allow easier testing of resource's disappearance #73

Open radeksimko opened 4 years ago

radeksimko commented 4 years ago

Problem statement

It is a convention for Terraform to not error out when a resource was deleted out of bound (e.g. by hand or another tooling outside of Terraform) and instead wipe it out of the state and automatically plan its recreation.

For this to work though each provider has to implement such behaviour correctly for each resource and this is not trivial nor easy to implement today.

The common pattern to achieve this is:

  1. Add a condition to Read which calls d.SetId("") (i.e. remove resource from state) if resource is not found via API (e.g. when API returns 404 error) and issue WARN log message in such case. This could/should be a user-visible warning, but that requires https://github.com/hashicorp/terraform-plugin-sdk/issues/230
  2. Add acceptance test which deletes the resource via API - without calling Delete explicitly, to simulate the failure mode and has ExpectNonEmptyPlan: true. Any acceptance tests generally expects all operations to not error.

Example

https://github.com/terraform-providers/terraform-provider-aws/blob/master/aws/resource_aws_cloudwatch_log_stream.go#L73-L77

    if !exists {
        log.Printf("[DEBUG] CloudWatch Stream %q Not Found. Removing from state", d.Id())
        d.SetId("")
        return nil
    }

https://github.com/terraform-providers/terraform-provider-aws/blob/f39af7fff691603f5dc5b553c618d1dbdb6ea0df/aws/resource_aws_cloudwatch_log_stream_test.go#L33-L52

func TestAccAWSCloudWatchLogStream_disappears(t *testing.T) {
    var ls cloudwatchlogs.LogStream
    rName := acctest.RandString(15)

    resource.ParallelTest(t, resource.TestCase{
        PreCheck:     func() { testAccPreCheck(t) },
        Providers:    testAccProviders,
        CheckDestroy: testAccCheckAWSCloudWatchLogStreamDestroy,
        Steps: []resource.TestStep{
            {
                Config: testAccAWSCloudWatchLogStreamConfig(rName),
                Check: resource.ComposeTestCheckFunc(
                    testAccCheckCloudWatchLogStreamExists("aws_cloudwatch_log_stream.foobar", &ls),
                    testAccCheckCloudWatchLogStreamDisappears(&ls, rName),
                ),
                ExpectNonEmptyPlan: true,
            },
        },
    })
}

Proposal

Exact implementation is TBD, but it would be great to come up with something as simple as a flag for resource.TestStep which automatically tests these scenarios, so that more provider developers get into the habit of testing this more regularly and we really make this a widely accepted convention in providers.

kmoe commented 4 years ago

Additional context by @radeksimko (from duplicate issue hashicorp/terraform-plugin-sdk#9):

Testing whether Read or Exists implementations are correctly removing resource from state, when it is gone currently requires significant number of LOC and leads to lack of such tests, which in turn then leads to bugs like this https://github.com/hashicorp/terraform/issues/18931

The goal is to explore options for making such tests easier to write and generally making it easier to test such scenarios.


From @bflad:

This would save quite a few bug reports!

Would it make sense if this was a new boolean on TestStep? e.g. ExpectsRecreationIfDeleted (fixing my poor naming choice). It would automatically call the resource Delete function then ideally something similar to ExpectsNonEmptyPlan, but expecting a new resource in the difference rather than just any plan difference.

There are actually reasons to go beyond just a boolean here and instead accept a resource name from the Terraform state. Here's a great example: https://github.com/terraform-providers/terraform-provider-aws/issues/5914

Basically, ELBs can have children load balancer policies, which are wholly dependent on the ELB parent existing. In this case, for this acceptance testing I will actually need two separate TestStep's to verify the resource correctly catches both of the errors returned by the API:

  • One for removal of the load balancer policy (PolicyNotFound)
  • One for removal of the ELB (LoadBalancerNotFound)
bflad commented 4 years ago

Just to provide a current status update from the AWS Provider on this one, we implemented a helper function awhile back to at least make this generic:

func testAccCheckResourceDisappears(provider *schema.Provider, resource *schema.Resource, resourceName string) resource.TestCheckFunc {
    return func(s *terraform.State) error {
        resourceState, ok := s.RootModule().Resources[resourceName]

        if !ok {
            return fmt.Errorf("resource not found: %s", resourceName)
        }

        if resourceState.Primary.ID == "" {
            return fmt.Errorf("resource ID missing: %s", resourceName)
        }

        return resource.Delete(resource.Data(resourceState.Primary), provider.Meta())
    }
}

This allows us to implement tests like the following:

func TestAccAWSAppsyncGraphqlApi_disappears(t *testing.T) {
    var api1 appsync.GraphqlApi
    rName := acctest.RandomWithPrefix("tf-acc-test")
    resourceName := "aws_appsync_graphql_api.test"

    resource.ParallelTest(t, resource.TestCase{
        PreCheck:     func() { testAccPreCheck(t); testAccPreCheckAWSAppSync(t) },
        Providers:    testAccProviders,
        CheckDestroy: testAccCheckAwsAppsyncGraphqlApiDestroy,
        Steps: []resource.TestStep{
            {
                Config: testAccAppsyncGraphqlApiConfig_AuthenticationType(rName, "API_KEY"),
                Check: resource.ComposeTestCheckFunc(
                    testAccCheckAwsAppsyncGraphqlApiExists(resourceName, &api1),
                    testAccCheckResourceDisappears(testAccProvider, resourceAwsAppsyncGraphqlApi(), resourceName),
                ),
                ExpectNonEmptyPlan: true,
            },
        },
    })
}

Not the most ideal, but it saves us a lot of extra work for new resources.

bflad commented 1 year ago

Since its being discussed over in https://github.com/hashicorp/terraform-plugin-framework/issues/617 --

To throw a proposal out there involving no coupling to a specific SDK, one could imagine having a test step which handles the equivalent of calling terraform destroy -target=examplecloud_thing.test, but also offering the option to preserve the prior state. This means Terraform will call the provider code's "Delete" functionality without needing the details of the underlying provider implementation/SDK. From a usage perspective, it could look something like:

resource.TestCase{
  // ... other fields
  Steps: []resource.TestStep{
    // Create our resource and state for disappearance testing.
    {
      Config: `resource "examplecloud_thing" "test" {}`,
    },
    // Design sketch -- the example Targets and PreservePriorState fields do not exist today.
    // Targets would allow a complex configuration to test only the desired resource.
    // PreservePriorState would keep the existing Terraform state from the prior step.
    {
      Config: `resource "examplecloud_thing" "test" {}`,
      Destroy: true,
      Targets: []string{"examplecloud_thing.test"},
      PreservePriorState: true,
    },
    // Verify refresh behavior for now-missing resource.
    {
      Config: `resource "examplecloud_thing" "test" {}`,
      PlanOnly: true, // or RefreshState: true
    },
  },
}
bflad commented 1 year ago

With the introduction of the terraform-plugin-testing module (migration guide), we are transferring feature requests relating to the helper/resource "testing framework" in the terraform-plugin-sdk repository over to the new terraform-plugin-testing repository.