hashicorp / terraform-provider-google

Terraform Provider for Google Cloud Platform
https://registry.terraform.io/providers/hashicorp/google/latest/docs
Mozilla Public License 2.0
2.28k stars 1.72k forks source link

Failing test(s): TestAccDataLossPreventionDiscoveryConfig_Update #18004

Closed SarahFrench closed 4 weeks ago

SarahFrench commented 4 months ago

Impacted tests

Affected Resource(s)

Failure rates

Message(s)

    vcr_utils.go:152: Step 1/4 error: Error running apply: exit status 1
        Error: Error creating DiscoveryConfig: googleapi: Error 400: All discovery configs must be in the same location, but one already exists in [us]
          with google_data_loss_prevention_discovery_config.basic,
          on terraform_plugin_test.tf line 14, in resource "google_data_loss_prevention_discovery_config" "basic":
          14: resource "google_data_loss_prevention_discovery_config" "basic" {

Nightly build test history

b/346850977

SarahFrench commented 4 months ago

Looks like it could be an issue with sweeping up dangling resources, but would need to double-check the test's config and debug logs from the test

patrickmoy commented 4 months ago

This could also possibly be due to tests running in parallel, though the update tests in resource_data_loss_prevention_discovery_config_test.go are set to run serially to prevent this from happening...

melinath commented 3 months ago

There are a number of other DLP tests - do they also need to be forced to run in serial maybe?

patrickmoy commented 3 months ago

No, our service restriction of one config per region per storage type only applies to DiscoveryConfig resources, so this is new.

I believe it's the example tests running simultaneously with the update tests, I found some similar merged PRs that required serial tests, but some opt to use "skip_test" on their examples. I could take a stab at that and see if it resolves.

patrickmoy commented 2 months ago

Can we confirm that there is still flakiness? https://github.com/GoogleCloudPlatform/magic-modules/pull/10798 should have addressed this to some degree

melinath commented 2 months ago

It's still definitely flakey - hard to say yet if it's better than it was previously or not.

patrickmoy commented 2 months ago

Darn; thought that would be enough. I'll revisit with more extensive fixes

melinath commented 2 months ago

hmm... it looks like the failure mode has changed, though. The tests are now failing with diffs on errors.details subfields. for example:

          map[string]string{
        +   "errors.#":                   "1",
        +   "errors.0.%":                 "2",
        +   "errors.0.details.#":         "1",
        +   "errors.0.details.0.%":       "3",
        +   "errors.0.details.0.code":    "7",
        +   "errors.0.details.0.message": `None of the driver projects (ci-test-project-nightly-ga) have "cloudasset.assets.exportResource" permission for organizations/12345678. See https://cloud.google.com/dlp/docs/grant-data-profiling-access.`,
        +   "errors.0.timestamp":         "",
          }

This is just an example; different failures have different errors. I'm not sure whether these should be resolved or ignored.

patrickmoy commented 2 months ago

Ah, this is similar to https://github.com/hashicorp/terraform-provider-google/issues/18102.

The cause here is that once the resource is created, the DLP service immediately begins to run. Usually the test finishes before it can finish a scan and deletes the resource, but if DLP scans quickly enough, we'll save errors/timestamps in the resource for the customer to view.

Either create the resource in PAUSED mode, or ignore verification on all errors/timestamp attributes.

I'll have a PR fix up shortly.

patrickmoy commented 2 months ago

Hopefully fixed by https://github.com/GoogleCloudPlatform/magic-modules/pull/10959 — let me know if there are still any errors

SarahFrench commented 2 months ago

The test has passed over the past 3 days but I'd like to wait another day or two to check the flakiness is definitely gone:

Screenshot 2024-06-17 at 12 19 46

melinath commented 4 weeks ago

This did resolve but has started failing @ 100% again - I'll open as a new ticket.