terraform-google-modules / terraform-google-kubernetes-engine

Configures opinionated GKE clusters
https://registry.terraform.io/modules/terraform-google-modules/kubernetes-engine/google
Apache License 2.0
1.15k stars 1.18k forks source link

feat(main): take `node_config_defaults` out of beta #2098

Closed wyardley closed 1 month ago

wyardley commented 2 months ago

node_config_defaults is GA in all our supported versions

Ran into some issues building it (see #2097), now resolved.

Edit: gcfs still is not (https://github.com/GoogleCloudPlatform/magic-modules/pull/11542), though, so I left that gated still for now, but put it around a smaller block so that it's possible to add other things inside this block (see #2082)

I know there was a bug with the empty node_config_defaults block somewhere, though, @apeabody

apeabody commented 2 months ago

/gcbrun

apeabody commented 2 months ago

Interesting

Step #98 - "verify private-zonal-with-networking":                          Diff:
Step #98 - "verify private-zonal-with-networking":                          --- Expected
Step #98 - "verify private-zonal-with-networking":                          +++ Actual
Step #98 - "verify private-zonal-with-networking":                          @@ -1,2 +1,2 @@
Step #98 - "verify private-zonal-with-networking":                          -(map[string]interface {}) (len=14) {
Step #98 - "verify private-zonal-with-networking":                          +(map[string]interface {}) (len=15) {
Step #98 - "verify private-zonal-with-networking":                            (string) (len=10) "diskSizeGb": (float64) 100,
Step #98 - "verify private-zonal-with-networking":                          @@ -6,2 +6,5 @@
Step #98 - "verify private-zonal-with-networking":                            (string) (len=9) "imageType": (string) (len=14) "COS_CONTAINERD",
Step #98 - "verify private-zonal-with-networking":                          + (string) (len=13) "kubeletConfig": (map[string]interface {}) (len=1) {
Step #98 - "verify private-zonal-with-networking":                          +  (string) (len=34) "insecureKubeletReadonlyPortEnabled": (bool) false
Step #98 - "verify private-zonal-with-networking":                          + },
Step #98 - "verify private-zonal-with-networking":                            (string) (len=6) "labels": (map[string]interface {}) (len=2) {
Step #98 - "verify private-zonal-with-networking":          Test:           TestPrivateZonalWithNetworking
wyardley commented 2 months ago

Interesting Step #98 - "verify private-zonal-with-networking": + (string) (len=34) "insecureKubeletReadonlyPortEnabled": (bool) false

@apeabody may be related: https://github.com/GoogleCloudPlatform/magic-modules/pull/11688 (except that is for node_config vs node_config_defaults, I think).

(which is potentially my fault originally 😅, and which also maybe ties into https://github.com/GoogleCloudPlatform/magic-modules/pull/11717 as well). I can take a look to see if node_config_defaults is also somehow affected by a similar issue.

If it can safely be made optional, that may help with this issue.

apeabody commented 2 months ago

Interesting Step #98 - "verify private-zonal-with-networking": + (string) (len=34) "insecureKubeletReadonlyPortEnabled": (bool) false

@apeabody may be related: GoogleCloudPlatform/magic-modules#11688 (except that is for node_config vs node_config_defaults, I think).

(which is potentially my fault originally 😅, and which also maybe ties into GoogleCloudPlatform/magic-modules#11717 as well). I can take a look to see if node_config_defaults is also somehow affected by a similar issue.

If it can safely be made optional, that may help with this issue.

As a potential workaround, making node_config_defaults a dynamic block (only created if a property will be configured), might resolve this diff. Alternatively we could (not ideal) consider always setting insecureKubeletReadonlyPortEnabled to true or false as a breaking change (basically https://github.com/terraform-google-modules/terraform-google-kubernetes-engine/pull/2082).

wyardley commented 2 months ago

Alternatively we could (not ideal) consider always setting insecureKubeletReadonlyPortEnabled to true or false as a breaking change

I've thought about this. Maybe you can find out more internally within Google, but I think (just based on the announcement that came out) there is a plan to flop the default here, so there are a few things that may be at odds here

I kind of feel like at the point where the default becomes to disable this by default in new cluster or cluster versions, it may just create clutter in the module to have the config set explicitly, and in that case, making the cluster more easily take the computed default vs. setting it explicitly might be best / simplest?

In my tests, with >= 6.2, if I'd already set the parameter out of band, the provider did the right thing and pulled the existing value (for insecure_kubelet_readonly_port_enabled) already set on the cluster without me having to set it explicitly.

wyardley commented 2 months ago

I'd be Ok with just closing this PR for now, and then revisiting this and / or #2082 once the various upstream PRs mentioned above are dispositioned one way or another. (esp. since the one bumping GCFS to GA will also change things anyway, assuming we update to a version that includes that change)

apeabody commented 2 months ago

/gcbrun

wyardley commented 1 month ago

Not sure how you want to proceed with this one... I think we'll sort of need it as part of #2013? Maybe could just close this and you can call out the other change in an extra message in the commit body when merging #2013 if needed?

semi-related side note: the GCFS change is also now merged, so that should come out I'd think in 6.5.0.

apeabody commented 1 month ago

Not sure how you want to proceed with this one... I think we'll sort of need it as part of #2013? Maybe could just close this and you can call out the other change in an extra message in the commit body when merging #2013 if needed?

semi-related side note: the GCFS change is also now merged, so that should come out I'd think in 6.5.0.

Sure - That's fine if you want to combine into a single PR. It will be a breaking change, so I want to hold on merging those anyway for another week or so.

wyardley commented 1 month ago

Just to make sure we're talking about the same thing: the GCFS one, I wouldn't do as part of this or that (since it'll likely require 6.5.0, 6.6.0 or something), but I guess what I meant is, the change that's in this PR is, I think already more or less builtin to #2013 even as it is, since I had to move the GA / beta gate around there already. We were originally trying to make this a separate change

As long as the versions we have there don't have the bug with an empty node_config_defaults {} block, I think we are good?

apeabody commented 1 month ago

Just to make sure we're talking about the same thing: the GCFS one, I wouldn't do as part of this or that (since it'll likely require 6.5.0, 6.6.0 or something), but I guess what I meant is, the change that's in this PR is, I think already more or less builtin to #2013 even as it is, since I had to move the GA / beta gate around there already. We were originally trying to make this a separate change

As long as the versions we have there don't have the bug with an empty node_config_defaults {} block, I think we are good?

Yeah, I recall the intention of splitting was to try and merge sooner. But then we discovered it isn't quite out of beta yet, so I think we are good for now!