hashicorp / terraform-provider-google

Terraform Provider for Google Cloud Platform
https://registry.terraform.io/providers/hashicorp/google/latest/docs
Mozilla Public License 2.0
2.36k stars 1.75k forks

Allow Flow Logs activation on private_cluster_config subnet #19804

Open ffrevol opened 1 month ago

ffrevol commented 1 month ago

Community Note

Description

Our organization policy requires flow logs (audit logs) to be enabled for all subnets.

There is no such option for the subnet created when enabling private_cluster_config, and there is no mechanism to retrieve the ID of the created subnet in order to enable those Flow Logs afterwards.

New or Affected Resource(s)

Potential Terraform Configuration

resource "google_container_cluster" "prototype" {
  ...
  private_cluster_config {
    master_ipv4_cidr_block  = "10.0.16.0/28"
    enable_private_endpoint = true
    enable_private_nodes    = true
    master_global_access_config {
      enabled = true
    }
  }
  ...
}
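One possible shape for such an option (purely hypothetical — a log_config block inside private_cluster_config does not exist in the provider today) would mirror the log_config block of google_compute_subnetwork:

```hcl
private_cluster_config {
  master_ipv4_cidr_block  = "10.0.16.0/28"
  enable_private_endpoint = true
  enable_private_nodes    = true

  # Hypothetical block, modeled on google_compute_subnetwork.log_config,
  # applied to the auto-created control plane subnetwork:
  log_config {
    aggregation_interval = "INTERVAL_10_MIN"
    flow_sampling        = 0.5
    metadata             = "INCLUDE_ALL_METADATA"
  }
}
```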

References

No response

b/373407382

juangascon commented 3 weeks ago

Hello. I have the same security constraint in my organization and I am facing the same issue. To have VPC Flow Logs activated with Terraform, we need to do it at subnetwork creation; modifying a subnetwork outside the Terraform scope is very complicated. So we have to create the control plane subnetwork at cluster creation. I have not yet found out how to resolve this, because of this error:

β”‚ Error: Provider produced inconsistent final plan
β”‚ 
β”‚ When expanding the plan for module.gke.google_container_cluster.prototype to include new values learned so far during apply, provider "registry.terraform.io/hashicorp/google"
β”‚ produced an invalid new value for .private_cluster_config[0].enable_private_endpoint: was null, but now cty.False.
β”‚ 
β”‚ This is a bug in the provider, which should be reported in the provider's own issue tracker.

This happens with both the 4.55 and 5.44.2 versions of the provider. Maybe I should open a new issue?

We have two ways of giving the CIDR range for the control plane endpoint:

  1. master_ipv4_cidr_block
  2. private_endpoint_subnetwork
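For reference, the two mutually exclusive shapes look roughly like this (the subnetwork reference in option 2 is illustrative):

```hcl
# Option 1: let GKE create the control plane subnetwork itself
# (its flow logs cannot be managed from Terraform)
private_cluster_config {
  enable_private_nodes   = true
  master_ipv4_cidr_block = "10.0.16.0/28"
}

# Option 2: point the control plane at a subnetwork you manage yourself,
# where log_config (flow logs) can be set
private_cluster_config {
  enable_private_nodes        = true
  private_endpoint_subnetwork = google_compute_subnetwork.cluster_control_plane.name
}
```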

The GCP documentation "Create a cluster and select the control plane IP address range" says:

So, the only way to get the VPC Flow Logs is to use private_endpoint_subnetwork, because with master_ipv4_cidr_block we let GCP create a subnetwork outside our Terraform configuration. I have not tried it, but I am pretty sure that importing that resource, modifying it, and pushing the config would be really harsh and complicated, with an enormous risk of messing everything up.

However, if I do not use master_ipv4_cidr_block but instead create a subnet and pass its name as the value of the private_endpoint_subnetwork parameter, I get the following error:

β”‚ Error: Provider produced inconsistent final plan
β”‚ 
β”‚ When expanding the plan for module.gke.google_container_cluster.prototype to include new values learned so far during apply, provider "registry.terraform.io/hashicorp/google"
β”‚ produced an invalid new value for .private_cluster_config[0].enable_private_endpoint: was null, but now cty.False.
β”‚ 
β”‚ This is a bug in the provider, which should be reported in the provider's own issue tracker.

The deployment goes perfectly if I declare the CIDR in master_ipv4_cidr_block instead of in private_endpoint_subnetwork. But then I do not get the VPC Flow Logs, because GCP creates the control plane subnetwork outside the Terraform scope.

My configuration is as follows:

resource "google_container_cluster" "prototype" {
...
  private_cluster_config {
    enable_private_nodes        = true
    enable_private_endpoint     = false
    private_endpoint_subnetwork = google_compute_subnetwork.cluster_control_plane.name
    master_global_access_config {
      enabled = false
    }
  }
...
}

resource "google_compute_subnetwork" "cluster_control_plane" {
  name                     = local.control_plane_private_endpoint_subnet_name
  region                   = var.region
  network                  = google_compute_network.prototype.name
  private_ip_google_access = true
  ip_cidr_range            = var.private_control_plane_subnetwork_ip_cidr_range

  stack_type                 = "IPV4_IPV6"
  private_ipv6_google_access = "ENABLE_OUTBOUND_VM_ACCESS_TO_GOOGLE"
  ipv6_access_type           = "INTERNAL"

  log_config {
    aggregation_interval = "INTERVAL_10_MIN"
    flow_sampling        = 0.5
    metadata             = "INCLUDE_ALL_METADATA"
  }
}

juangascon commented 2 weeks ago

Hello @ffrevol and @rileykarson

I have found a workaround to:

  1. activate the VPC Logs in subnetworks
  2. avoid the deployment from crashing

The solution that works for me (thanks Gemini Pro :+1: :clap: ) is:

  1. create a dedicated subnetwork with the CIDR that you want, one that does not overlap any other CIDR in your VPC. This allows you to activate the VPC Flow Logs on the cluster control plane subnetwork, provided you configure the log_config block as shown in my previous comment.

  2. use private_endpoint_subnetwork = <name_of_the_subnetwork> instead of master_ipv4_cidr_block, as shown in my previous comment. However, doing only this, the deployment will crash with the error shown above.

  3. then, the IMPORTANT change: enable_private_endpoint = null. Explicitly setting it to null avoids a provider bug that unexpectedly changes the attribute from null to false during the apply phase.

This works, but it is not the final solution, because the value of a boolean parameter should be either true or false, not null. There is still a bug in the provider that should be addressed; I shall open an issue for that.
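Putting the three steps together, a minimal sketch (resource, variable, and subnet names are illustrative) would be:

```hcl
# 1. Dedicated subnetwork with flow logs enabled; the CIDR must not
#    overlap any other range in the VPC
resource "google_compute_subnetwork" "cluster_control_plane" {
  name          = "gke-control-plane"
  region        = var.region
  network       = google_compute_network.prototype.name
  ip_cidr_range = "10.0.16.0/28"

  log_config {
    aggregation_interval = "INTERVAL_10_MIN"
    flow_sampling        = 0.5
    metadata             = "INCLUDE_ALL_METADATA"
  }
}

resource "google_container_cluster" "prototype" {
  # ...
  private_cluster_config {
    enable_private_nodes = true
    # 2. Reference the managed subnetwork instead of master_ipv4_cidr_block
    private_endpoint_subnetwork = google_compute_subnetwork.cluster_control_plane.name
    # 3. The IMPORTANT part: explicitly null (not false) to work around the
    #    "Provider produced inconsistent final plan" bug
    enable_private_endpoint = null
  }
}
```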

This is the explanation by Gemini Pro:

The error message indicates that the google_container_cluster resource's
private_cluster_config.enable_private_endpoint attribute is unexpectedly changing 
from null to false during the apply phase, even though it's not explicitly defined in your configuration.

This suggests that the provider might be implicitly setting enable_private_endpoint to false
when you provide a value for private_endpoint_subnetwork, even if enable_private_endpoint is not explicitly set.

You can try to explicitly set enable_private_endpoint to null in your configuration.

Hope this helps with your configurations.

Take care.

juangascon commented 1 week ago

Hello. I have opened the issue #20429 to raise the bug in the provider that causes the error mentioned above.

Take care.