better handling of sub-resources

n-oden commented 3 years ago

Community Note

Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
If you are interested in working on this issue or have submitted a pull request, please leave a comment. If the issue is assigned to the "modular-magician" user, it is either in the process of being autogenerated, or is planned to be autogenerated soon. If the issue is assigned to a user, that user is claiming responsibility for the issue. If the issue is assigned to "hashibot", a community member has claimed the issue already.

Description

In Google Cloud's resource data model, some resources are effectively sub-resources of others and will be automatically deleted if the apex resource is deleted.

For example, a CloudSQL DB Instance (google_sql_database_instance) can have multiple CloudSQL Database (google_sql_database) and CloudSQL User (google_sql_user) resources associated with it, and terraform can be used to create all of them. But if a configuration change requires destroying, or destroying and recreating, the DB Instance, terraform will attempt to delete each associated Database and User resource individually before deleting the instance itself. In addition to making what is already a somewhat slow API operation much slower, this approach makes it much more fragile: deleting a google_sql_database may fail if a SQL client is currently executing a transaction against that database, but if our intention is to remove the DB instance itself... we don't care!

As another example, Google Kubernetes Engine clusters (google_container_cluster) may have multiple node pools (google_container_node_pool) associated with them. A change that deletes the original cluster will see terraform attempt to delete each node pool individually (an operation that can take north of 10 minutes) before destroying the cluster resource. All of that is wasted time: the projects.locations.clusters.delete API will delete the cluster and all of its node pools in a single shot.

In an ideal world, a destructive change to an apex resource in this sort of scenario would not spend time manually deleting the sub-resources. If, for example, I were to delete a google_container_cluster, the provider should call projects.locations.clusters.delete and, upon success of that call, remove any associated google_container_node_pool resources from terraform state.
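
To make the difference concrete, here is a rough toy model of the two behaviors described above. It is only a sketch: `State`, `destroy_today`, and `destroy_cascading` are hypothetical names invented for illustration, and nothing here reflects how terraform or the Google provider is actually implemented.

```python
# Toy model only: none of these names exist in terraform or the Google provider.
from dataclasses import dataclass, field


@dataclass
class State:
    # Maps a resource address to its parent's address (None for an apex resource).
    parents: dict = field(default_factory=dict)

    def children_of(self, address):
        return [a for a, p in self.parents.items() if p == address]


def destroy_today(state, apex, api_calls):
    """Current behavior: one delete call per sub-resource, then the apex."""
    for child in state.children_of(apex):
        api_calls.append(f"delete {child}")
        del state.parents[child]
    api_calls.append(f"delete {apex}")
    del state.parents[apex]


def destroy_cascading(state, apex, api_calls):
    """Requested behavior: one delete call, then prune the children from state."""
    api_calls.append(f"delete {apex}")
    for child in state.children_of(apex):
        del state.parents[child]  # no API call: the parent delete already removed it
    del state.parents[apex]


def example_state():
    return State(parents={
        "google_container_cluster.main": None,
        "google_container_node_pool.a": "google_container_cluster.main",
        "google_container_node_pool.b": "google_container_cluster.main",
    })


today, cascading = [], []
destroy_today(example_state(), "google_container_cluster.main", today)
destroy_cascading(example_state(), "google_container_cluster.main", cascading)
print(len(today), "calls today;", len(cascading), "call with cascading delete")
```

Both functions leave the state empty; the only difference is how many (potentially slow or failing) API calls are made along the way.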

New or Affected Resource(s)

google_container_cluster
google_container_node_pool
google_sql_database_instance
google_sql_database
google_sql_user

(And probably many more; these are just two pertinent examples.)

Potential Terraform Configuration

n/a: this should happen automatically with the present configuration

References

See discussion in the hashicorp forums here -- it would be good if we could in some way manually specify this sort of relationship across providers (e.g. between google_container_cluster and helm_release or kubernetes_secret resources) as well, but obviously that's a higher-order question for terraform itself.

danawillow commented 3 years ago

Right now the way the terraform execution graph works is that when we're deleting the subresource, the provider has no way to know that the parent resource is also being deleted. Added the upstream-terraform label because this would require something from Terraform core or the SDK to actually expose the graph to the individual resources during execution. As things are today, there isn't anything we can do to solve this in the provider.
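
A simplified way to picture the limitation: destroy operations are ordered so that dependents go first, and each delete callback is handed only its own resource. The sketch below is a toy model with hypothetical names (`plan_destroy_order`, `delete_resource`); it is not the Terraform SDK's API.

```python
# Toy model with hypothetical names; this is not the Terraform SDK's API.

def plan_destroy_order(depends_on):
    """Order addresses so dependents are destroyed before what they depend on."""
    order, seen = [], set()

    def visit(address):
        if address in seen:
            return
        seen.add(address)
        # Anything that depends on `address` has to be destroyed first.
        for other, deps in depends_on.items():
            if address in deps:
                visit(other)
        order.append(address)

    for address in depends_on:
        visit(address)
    return order


def delete_resource(address):
    # The callback is handed a single resource. Nothing here says whether
    # the resource it depends on is also scheduled for deletion.
    return f"DELETE {address}"


depends_on = {
    "google_sql_database_instance.main": [],
    "google_sql_database.app": ["google_sql_database_instance.main"],
    "google_sql_user.default": ["google_sql_database_instance.main"],
}
calls = [delete_resource(a) for a in plan_destroy_order(depends_on)]
print("\n".join(calls))
```

In this model the instance is always deleted last, and `delete_resource` has no argument through which it could learn that the instance is going away too, which is the missing piece described above.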

n-oden commented 3 years ago

@danawillow thanks for the update and explanation. I'll continue hassling hashicorp about this then. :)

bouk commented 3 years ago

Something kind of related to this: these subresources go into a broken state if the parent resource has been deleted (manually for example).

If I have a google_sql_database_instance with two databases and a user, and the instance is deleted and I want to recreate it, then running terraform refresh gets me these errors:

Error: Error when reading or editing Database: googleapi: Error 400: Invalid request: Invalid request since instance is not running., invalid

Error: Error when reading or editing Database: googleapi: Error 400: Invalid request: Invalid request since instance is not running., invalid

Error: Error, failed to deleteuser default in instance sandbox-db-46dafe9f: googleapi: Error 400: Invalid request: Invalid request since instance is not running., invalid

It would be nice if the provider could understand that it will never be able to get the status of these resources, since the instance has been deleted.
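
One way to picture that behavior is a refresh step that checks the parent first and drops the child from state when the parent no longer exists. This is a toy sketch only: `ApiError` and `refresh_child` are hypothetical names, and it assumes the API reports a missing parent with a 404, which may not match how the Cloud SQL API actually responds.

```python
# Toy model: hypothetical names, not the Google provider's actual read logic.
# Assumption: a deleted parent is reported by the API as a 404.

class ApiError(Exception):
    def __init__(self, code, message):
        super().__init__(message)
        self.code = code


def refresh_child(child, parent, fetch_parent, fetch_child):
    """Return the child's refreshed data, or None to drop it from state."""
    try:
        fetch_parent(parent)
    except ApiError as err:
        if err.code == 404:
            # The parent is gone, so the child cannot exist either.
            return None
        raise
    return fetch_child(child)


def missing_parent(_name):
    raise ApiError(404, "instance not found")


def unreachable_child(_name):
    raise ApiError(400, "Invalid request since instance is not running.")


result = refresh_child(
    "google_sql_database.app", "sandbox-db-46dafe9f",
    missing_parent, unreachable_child,
)
print(result)
```

Because the parent lookup fails first, the child lookup that produces the 400 error is never attempted, and the child would simply be planned for re-creation alongside its parent.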

hashicorp / terraform-provider-google