GoogleCloudPlatform / terraform-google-enterprise-application

Deploy an enterprise developer platform on Google Cloud
https://registry.terraform.io/modules/GoogleCloudPlatform/enterprise-application/google
Apache License 2.0

feat(module): ensure there are no external ips in the cluster nodes #212

Closed · caetano-colin closed this 1 month ago

caetano-colin commented 1 month ago
apeabody commented 1 month ago

/gcbrun

apeabody commented 1 month ago

Hey @caetano-colin! - You will also need to include master_authorized_networks; I'm wondering if it will accept an empty list, i.e. master_authorized_networks = []? Otherwise we'll need to select a network.

caetano-colin commented 1 month ago

> Hey @caetano-colin! - You will also need to include master_authorized_networks; I'm wondering if it will accept an empty list, i.e. master_authorized_networks = []? Otherwise we'll need to select a network.

The default value is an empty list; this is the transformation the module applies to the variable:

master_authorized_networks_config = length(var.master_authorized_networks) == 0 ? [] : [{
    cidr_blocks : var.master_authorized_networks
  }]

https://github.com/terraform-google-modules/terraform-google-kubernetes-engine/blob/master/modules/beta-private-cluster/cluster.tf#L213

I think that because the module uses a Terraform dynamic block over an empty list, it is not specifying an empty master_authorized_networks_config block like this:

master_authorized_networks_config = {}

Instead, it omits the block entirely.
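
For reference, a minimal standalone sketch of that pattern (not the module's exact code; the variable and local names mirror the upstream module, everything else is illustrative): a dynamic block iterating over an empty list emits nothing, so the resource ends up with no master_authorized_networks_config block at all.

variable "master_authorized_networks" {
  type    = list(object({ cidr_block = string, display_name = string }))
  default = []
}

locals {
  # Same transformation as quoted above: an empty input list stays an empty list.
  master_authorized_networks_config = length(var.master_authorized_networks) == 0 ? [] : [{
    cidr_blocks : var.master_authorized_networks
  }]
}

resource "google_container_cluster" "example" {
  name               = "example"
  location           = "us-central1"
  initial_node_count = 1

  # for_each over an empty list produces zero copies of this block,
  # which is not the same as declaring master_authorized_networks_config {}.
  dynamic "master_authorized_networks_config" {
    for_each = local.master_authorized_networks_config
    content {
      dynamic "cidr_blocks" {
        for_each = master_authorized_networks_config.value.cidr_blocks
        content {
          cidr_block   = cidr_blocks.value.cidr_block
          display_name = cidr_blocks.value.display_name
        }
      }
    }
  }
}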

caetano-colin commented 1 month ago

As of commit 6e4b73dc272ce7309f2d4af2441f4ea214d02ab6, we removed the private cluster endpoint flag and its tests.

This means the PR now only contains the tests and flags that ensure private cluster nodes (i.e. nodes without external IPs).
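
For context, a hedged sketch of the flags in question, assuming the upstream beta-private-cluster module's inputs (the project, network, range names, and version constraint below are placeholders, not values from this repo):

module "gke" {
  source  = "terraform-google-modules/kubernetes-engine/google//modules/beta-private-cluster"
  version = "~> 30.0" # illustrative version constraint

  project_id        = "my-project"     # placeholder
  name              = "example-cluster"
  region            = "us-central1"
  network           = "my-vpc"         # placeholder
  subnetwork        = "my-subnet"      # placeholder
  ip_range_pods     = "pods-range"     # placeholder
  ip_range_services = "services-range" # placeholder

  enable_private_nodes    = true  # node VMs get no external IPs (what this PR tests)
  enable_private_endpoint = false # control-plane endpoint stays public (the flag removed from this PR)

  master_authorized_networks = [] # empty list, per the discussion above
}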

apeabody commented 1 month ago

/gcbrun

apeabody commented 1 month ago

/gcbrun

apeabody commented 1 month ago

/gcbrun

apeabody commented 1 month ago
            Error:          Received unexpected error:
                            FatalError{Underlying: error while running command: exit status 1; 
                            Error: Error waiting for creating GKE cluster: Conflicting IP cidr range: Invalid IPCidrRange: 10.0.0.0/28 conflicts with existing subnetwork 'gke-cluster-us-central1-production-35d4d8fa-pe-subnet' in region 'us-central1'.

                              with module.env.module.gke-standard["eab-production-region02"].google_container_cluster.primary,
                              on .terraform/modules/env.gke-standard/modules/beta-private-cluster/cluster.tf line 22, in resource "google_container_cluster" "primary":
                              22: resource "google_container_cluster" "primary" {
                            }
            Test:           TestMultitenant/production
caetano-colin commented 1 month ago
> Error: Received unexpected error:
> FatalError{Underlying: error while running command: exit status 1;
> Error: Error waiting for creating GKE cluster: Conflicting IP cidr range: Invalid IPCidrRange: 10.0.0.0/28 conflicts with existing subnetwork 'gke-cluster-us-central1-production-35d4d8fa-pe-subnet' in region 'us-central1'.
>
>   with module.env.module.gke-standard["eab-production-region02"].google_container_cluster.primary,
>   on .terraform/modules/env.gke-standard/modules/beta-private-cluster/cluster.tf line 22, in resource "google_container_cluster" "primary":
>   22: resource "google_container_cluster" "primary" {
> }
> Test: TestMultitenant/production

For the record, this is happening because more than one cluster is being created and each of them uses the same hardcoded default CIDR (https://github.com/terraform-google-modules/terraform-google-kubernetes-engine/blob/master/modules/beta-private-cluster/variables.tf#L472), so any cluster created after the first causes an overlap.

Seeing if commit d03efa4 will fix it.
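
For illustration, a sketch of one way the overlap could be avoided, assuming the same upstream module inputs (cluster names, regions, and CIDR ranges below are illustrative, and this is not necessarily what commit d03efa4 does): give each cluster its own non-overlapping /28 instead of relying on the upstream default of 10.0.0.0/28.

locals {
  # One distinct control-plane peering range per cluster.
  master_cidrs = {
    "eab-production-region01" = "10.0.0.0/28"
    "eab-production-region02" = "10.0.0.16/28"
  }
}

module "gke-standard" {
  source   = "terraform-google-modules/kubernetes-engine/google//modules/beta-private-cluster"
  version  = "~> 30.0" # illustrative
  for_each = local.master_cidrs

  project_id        = "my-project"     # placeholder
  name              = each.key
  region            = "us-central1"    # placeholder; in practice each cluster gets its own region
  network           = "my-vpc"         # placeholder
  subnetwork        = "my-subnet"      # placeholder
  ip_range_pods     = "pods-range"     # placeholder
  ip_range_services = "services-range" # placeholder

  enable_private_nodes   = true
  master_ipv4_cidr_block = each.value # overrides the hardcoded 10.0.0.0/28 default
}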

caetano-colin commented 1 month ago

Node pools are being created in an error state. At first I thought it might be a firewall issue, as it was in my local tests.

Is it possible to get export TF_LOG=DEBUG output, or logs from the project, to get more insight into why the node pools are in an error state?

apeabody commented 1 month ago

> Node pools are being created in an error state. At first I thought it might be a firewall issue, as it was in my local tests.
>
> Is it possible to get export TF_LOG=DEBUG output, or logs from the project, to get more insight into why the node pools are in an error state?

Hi @caetano-colin - I triggered a fresh run so hopefully I can check the actual project.

apeabody commented 1 month ago

> Node pools are being created in an error state. At first I thought it might be a firewall issue, as it was in my local tests. Is it possible to get export TF_LOG=DEBUG output, or logs from the project, to get more insight into why the node pools are in an error state?

Hi @caetano-colin - I triggered a fresh run so hopefully I can check the actual project.

Looks like a networking problem: the node VMs appear to be unreachable.
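
For context, a hedged sketch of the kind of firewall rule that typically has to exist for the control plane to reach private nodes (every name, tag, and range below is illustrative, not taken from this repo):

resource "google_compute_firewall" "allow_master_to_nodes" {
  name    = "allow-gke-master-to-nodes" # placeholder
  project = "my-project"                # placeholder
  network = "my-vpc"                    # placeholder

  direction     = "INGRESS"
  source_ranges = ["10.0.0.0/28"] # the cluster's master_ipv4_cidr_block
  target_tags   = ["gke-node"]    # placeholder node network tag

  allow {
    protocol = "tcp"
    # kubelet plus common admission-webhook ports
    ports = ["443", "8443", "9443", "10250"]
  }
}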

caetano-colin commented 1 month ago

/gcbrun

caetano-colin commented 1 month ago

Thanks! I'll implement these improvements.