Open mogul opened 3 years ago
I think you're talking about this github action: https://github.com/GSA/datagov-ssb/runs/2224255533?check_suite_focus=true
It doesn't look like this Terraform plan contains the changes (we don't even see the switch between true and false for fail_when_catalog_not_accessible). Are you sure you don't have another issue here?
The GitHub Action audit sequence is hard to understand since I later had to nuke the tfstate to get out of the problem situation. Then I rebased and force-pushed the branch. These are three commits kept during the rebase.
If you look at the set of commit runs, you get a better sense of the commits prior to the rebase. The series of plan runs gives you a sense of where we ran into trouble.
The first plan to error out was right after we pushed an empty commit, just to get GitHub Actions to repopulate some things that had been removed manually.
We then set fail_when_catalog_not_accessible to false. Even after that we were still encountering the problem during plan. At that point we pored over your PR on the provider to try to figure out how failNotAccessible was still ending up true on line 246, and couldn't figure it out. We finally gave up and filed this issue.
After that I started working with the environment from my local machine. I went through a bunch of different attempts to try to fix the problem: tainting the broker in question, running plan with -target, then applying the plan output, etc. I just couldn't get the apply to work because no matter what I did the provider still ended up trying to query the catalog. If I told the apply not to refresh, then it complained about a mismatch with the plan for the obsolete ssb-new-weevil.app.cloud.gov route/catalog, which I just couldn't get it to forget about.
Then I ended up trying to edit the .tfstate to manually remove the Solr broker (no easy feat when the path for finding the tfstate for that workspace in S3 was obscure). Pretty soon I'd trashed the state and finally just deleted that workspace state, manually deleted the resources, and ran apply again. This was OK to do because the workspace referred to a staging environment, but I'm nervous about the potential to have to do this for production too if we end up in this state in the future...! 😓
After a bunch of other wrangling, I'm at the point where both the change of fail_when_catalog_not_accessible to false and the change to non-random routes are hitting our production space.
After most of the apply was done, including the replacement of the routes, we saw the apply fail before completion:
Error: Provider produced inconsistent final plan
When expanding the plan for
module.broker_eks.cloudfoundry_service_broker.space_scoped_broker["gsa-datagov/prod"]
to include new values learned so far during apply, provider
"registry.terraform.io/cloudfoundry-community/cloudfoundry" produced an
invalid new value for .url: was
cty.StringVal("https://ssb-smart-garfish.app.cloud.gov"), but now
cty.StringVal("https://ssb-eks-gsa-datagov-management.app.cloud.gov").
This is a bug in the provider, which should be reported in the provider's own
issue tracker.
Error: Provider produced inconsistent final plan
When expanding the plan for
module.broker_eks.cloudfoundry_service_broker.space_scoped_broker["gsa-datagov/development"]
to include new values learned so far during apply, provider
"registry.terraform.io/cloudfoundry-community/cloudfoundry" produced an
invalid new value for .url: was
cty.StringVal("https://ssb-smart-garfish.app.cloud.gov"), but now
cty.StringVal("https://ssb-eks-gsa-datagov-management.app.cloud.gov").
This is a bug in the provider, which should be reported in the provider's own
issue tracker.
Error: Provider produced inconsistent final plan
When expanding the plan for
module.broker_eks.cloudfoundry_service_broker.space_scoped_broker["gsa-datagov/management"]
to include new values learned so far during apply, provider
"registry.terraform.io/cloudfoundry-community/cloudfoundry" produced an
invalid new value for .url: was
cty.StringVal("https://ssb-smart-garfish.app.cloud.gov"), but now
cty.StringVal("https://ssb-eks-gsa-datagov-management.app.cloud.gov").
This is a bug in the provider, which should be reported in the provider's own
issue tracker.
Error: Provider produced inconsistent final plan
When expanding the plan for
module.broker_aws.cloudfoundry_service_broker.space_scoped_broker["gsa-datagov/prod"]
to include new values learned so far during apply, provider
"registry.terraform.io/cloudfoundry-community/cloudfoundry" produced an
invalid new value for .url: was
cty.StringVal("https://ssb-intimate-mink.app.cloud.gov"), but now
cty.StringVal("https://ssb-aws-gsa-datagov-management.app.cloud.gov").
This is a bug in the provider, which should be reported in the provider's own
issue tracker.
Error: Provider produced inconsistent final plan
When expanding the plan for
module.broker_eks.cloudfoundry_service_broker.space_scoped_broker["gsa-datagov/staging"]
to include new values learned so far during apply, provider
"registry.terraform.io/cloudfoundry-community/cloudfoundry" produced an
invalid new value for .url: was
cty.StringVal("https://ssb-smart-garfish.app.cloud.gov"), but now
cty.StringVal("https://ssb-eks-gsa-datagov-management.app.cloud.gov").
This is a bug in the provider, which should be reported in the provider's own
issue tracker.
Error: Provider produced inconsistent final plan
When expanding the plan for
module.broker_aws.cloudfoundry_service_broker.space_scoped_broker["gsa-datagov/staging"]
to include new values learned so far during apply, provider
"registry.terraform.io/cloudfoundry-community/cloudfoundry" produced an
invalid new value for .url: was
cty.StringVal("https://ssb-intimate-mink.app.cloud.gov"), but now
cty.StringVal("https://ssb-aws-gsa-datagov-management.app.cloud.gov").
This is a bug in the provider, which should be reported in the provider's own
issue tracker.
Error: Provider produced inconsistent final plan
When expanding the plan for
module.broker_aws.cloudfoundry_service_broker.space_scoped_broker["gsa-datagov/management"]
to include new values learned so far during apply, provider
"registry.terraform.io/cloudfoundry-community/cloudfoundry" produced an
invalid new value for .url: was
cty.StringVal("https://ssb-intimate-mink.app.cloud.gov"), but now
cty.StringVal("https://ssb-aws-gsa-datagov-management.app.cloud.gov").
This is a bug in the provider, which should be reported in the provider's own
issue tracker.
Error: Provider produced inconsistent final plan
When expanding the plan for
module.broker_aws.cloudfoundry_service_broker.space_scoped_broker["gsa-datagov/development"]
to include new values learned so far during apply, provider
"registry.terraform.io/cloudfoundry-community/cloudfoundry" produced an
invalid new value for .url: was
cty.StringVal("https://ssb-intimate-mink.app.cloud.gov"), but now
cty.StringVal("https://ssb-aws-gsa-datagov-management.app.cloud.gov").
This is a bug in the provider, which should be reported in the provider's own
issue tracker.
Then the very next plan failed, again with catalog 404s causing the failure:
Error: Error when getting catalog signature: Status code: 404 Not Found, Body: 404 Not Found: Requested route ('ssb-intimate-mink.app.cloud.gov') does not exist.
Error: Error when getting catalog signature: Status code: 404 Not Found, Body: 404 Not Found: Requested route ('ssb-intimate-mink.app.cloud.gov') does not exist.
Error: Error when getting catalog signature: Status code: 404 Not Found, Body: 404 Not Found: Requested route ('ssb-intimate-mink.app.cloud.gov') does not exist.
Error: Error when getting catalog signature: Status code: 404 Not Found, Body: 404 Not Found: Requested route ('ssb-intimate-mink.app.cloud.gov') does not exist.
Error: Error when getting catalog signature: Status code: 404 Not Found, Body: 404 Not Found: Requested route ('ssb-smart-garfish.app.cloud.gov') does not exist.
Error: Error when getting catalog signature: Status code: 404 Not Found, Body: 404 Not Found: Requested route ('ssb-smart-garfish.app.cloud.gov') does not exist.
Error: Error when getting catalog signature: Status code: 404 Not Found, Body: 404 Not Found: Requested route ('ssb-smart-garfish.app.cloud.gov') does not exist.
Error: Error when getting catalog signature: Status code: 404 Not Found, Body: 404 Not Found: Requested route ('ssb-smart-garfish.app.cloud.gov') does not exist.
It seems that the first problem (of using the old routes rather than the new in the apply) is resulting in the TF state continuing to refer to the old routes for the brokers. The old routes were definitely removed during the apply, yet definitely still appear in the service-broker registrations:
% cf routes
Getting routes for org gsa-datagov / space management as [redacted]...
space host domain port path protocol apps
management ssb-aws-gsa-datagov-management app.cloud.gov http ssb-aws
management ssb-solr-gsa-datagov-management app.cloud.gov http ssb-solr
management ssb-eks-gsa-datagov-management app.cloud.gov http ssb-eks
% cf service-brokers
Getting service brokers as [redacted]...
name url
ssb-ssb-aws-gsa-datagov-staging https://ssb-intimate-mink.app.cloud.gov
ssb-ssb-aws-gsa-datagov-prod https://ssb-intimate-mink.app.cloud.gov
ssb-ssb-aws-gsa-datagov-development https://ssb-intimate-mink.app.cloud.gov
ssb-ssb-aws-gsa-datagov-management https://ssb-intimate-mink.app.cloud.gov
ssb-ssb-eks-gsa-datagov-staging https://ssb-smart-garfish.app.cloud.gov
ssb-ssb-eks-gsa-datagov-prod https://ssb-smart-garfish.app.cloud.gov
ssb-ssb-eks-gsa-datagov-management https://ssb-smart-garfish.app.cloud.gov
ssb-ssb-eks-gsa-datagov-development https://ssb-smart-garfish.app.cloud.gov
ssb-solr-gsa-datagov-development https://ssb-improved-bunny.app.cloud.gov
ssb-solr-gsa-datagov-staging https://ssb-improved-bunny.app.cloud.gov
ssb-solr-gsa-datagov-management https://ssb-improved-bunny.app.cloud.gov
ssb-solr-gsa-datagov-prod https://ssb-improved-bunny.app.cloud.gov
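The mismatch between the two listings above can also be checked mechanically. This is only a minimal sketch: the sample data is copied from the cf routes and cf service-brokers output above (one broker per family, for brevity), and the function name stale_brokers is hypothetical, not part of any CF or Terraform tooling.

```python
from urllib.parse import urlparse

# (host, domain) pairs copied from the `cf routes` output above.
ROUTES = [
    ("ssb-aws-gsa-datagov-management", "app.cloud.gov"),
    ("ssb-solr-gsa-datagov-management", "app.cloud.gov"),
    ("ssb-eks-gsa-datagov-management", "app.cloud.gov"),
]

# A subset of the `cf service-brokers` output above (one broker per family).
BROKERS = {
    "ssb-ssb-aws-gsa-datagov-prod": "https://ssb-intimate-mink.app.cloud.gov",
    "ssb-ssb-eks-gsa-datagov-prod": "https://ssb-smart-garfish.app.cloud.gov",
    "ssb-solr-gsa-datagov-prod": "https://ssb-improved-bunny.app.cloud.gov",
}


def stale_brokers(routes, brokers):
    """Return {broker name: url} for brokers whose URL host matches no existing route."""
    live_hosts = {f"{host}.{domain}" for host, domain in routes}
    return {
        name: url
        for name, url in brokers.items()
        if urlparse(url).hostname not in live_hosts
    }


if __name__ == "__main__":
    for name, url in sorted(stale_brokers(ROUTES, BROKERS).items()):
        print(f"{name} -> {url} (no matching route)")
```

With the data above, every listed broker is reported, which matches the observation that the registrations still point at the removed random routes.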
And again: I expected that setting fail_when_catalog_not_accessible to false would prevent the failure upon encountering the 404s (in which case everything could probably still recover at the next apply), but that's clearly not happening.
I'm leaving it in this state and am very open to debugging it interactively with you via a Slack call when our timezones overlap! (I'm @mogul in the Cloud Foundry Slack, and I'm in UTC-7.)
For the state, you could also simply set fail_when_catalog_not_accessible to false inside it.
I understand your frustration, so I dug deeper. It looks like I can't access the changes during read. When I made the change I clearly could, so maybe my version of Terraform was allowing that.
I've tried many ways to get this information during read, and it's impossible from what the CLI gives to the provider.
So I have another proposal: I would like to add this value to the provider config: force_broker_not_fail_when_catalog_not_accessible, associated with the env var CF_FORCE_BROKER_NOT_FAIL_CATALOG. When set to true, it will force fail_when_catalog_not_accessible to be false.
I've tried this configuration and it works. What do you think about it?
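To make the proposed semantics concrete, here is a rough sketch of how the override could combine with the per-broker flag. It is not the provider's actual code: the function names are hypothetical, and both the precedence of explicit config over the env var and the accepted env values ("true"/"1") are assumptions. Only the setting name and the env var name come from the proposal.

```python
import os

ENV_VAR = "CF_FORCE_BROKER_NOT_FAIL_CATALOG"  # env var name from the proposal


def force_not_fail(provider_value=None, environ=None):
    """Resolve the provider-level override.

    Assumption: an explicit provider config value wins over the env var.
    """
    environ = os.environ if environ is None else environ
    if provider_value is not None:
        return bool(provider_value)
    # Assumption: "true" or "1" enables the override; real parsing may differ.
    return environ.get(ENV_VAR, "").strip().lower() in {"1", "true"}


def effective_fail_when_catalog_not_accessible(resource_value, override):
    """When the override is on, the broker-level flag is treated as false."""
    return False if override else bool(resource_value)


if __name__ == "__main__":
    override = force_not_fail(environ={ENV_VAR: "true"})
    # Even a broker with fail_when_catalog_not_accessible = true no longer fails.
    print(effective_fail_when_catalog_not_accessible(True, override))
```

The point of a provider-level switch is that it does not depend on the per-resource value being visible during read, which is the limitation described above.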
Please see the pull request and give it a try.
Confirmed working in the PR.
When an app previously registered as a service broker is deleted outside of Terraform, it's possible to get into an unrecoverable state where running terraform plan fails with an error.
This error is encountered even when you set fail_when_catalog_not_accessible to false. This was surprising, since this issue was supposed to be resolved by the PR referenced here: https://github.com/cloudfoundry-community/terraform-provider-cloudfoundry/pull/300
Confirmed as still happening with the latest provider version, 0.14.0.