kubernetes / kops

Kubernetes Operations (kOps) - Production Grade k8s Installation, Upgrades and Management
https://kops.sigs.k8s.io/
Apache License 2.0
15.94k stars 4.65k forks source link

panic: Terraform resource names cannot start with a digit. This is a bug in Kops, please report this in a GitHub Issue. Name: 208.*.*.net #12199

Closed vladimirkus closed 3 years ago

vladimirkus commented 3 years ago

/kind bug

1. What kops version are you running? The command kops version, will display this information. Version 1.21.0 (git-a5bdc3359e544b314d5695b3ed596829313fc6e3)

2. What Kubernetes version are you running? kubectl version will print the version if a cluster is running or provide the Kubernetes version specified as a kops flag. k8s: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.3", GitCommit:"ca643a4d1f7bfe34773c74f79527be4afd95bf39", GitTreeState:"clean", BuildDate:"2021-07-15T20:59:07Z", GoVersion:"go1.16.6", Compiler:"gc", Platform:"linux/amd64"}

3. What cloud provider are you using? AWS

4. What commands did you run? What is the simplest way to reproduce this issue? kops update cluster 208.-.-.net --yes --out output/208/ --target=terraform

5. What happened after the commands executed? Kops exited with panic on cluster update.

6. What did you expect to happen? I expected kops to update cluster config

7. Please provide your cluster manifest. Execute kops get --name my.example.com -o yaml to display your cluster manifest. You may want to remove your cluster name and other sensitive information. I'm trying to update the existing cluster to enable IRSA Cluster spec was updated with the next:

spec:
    serviceAccountIssuerDiscovery:
        discoveryStore: s3://public-bucket
        enableAWSOIDCProvider: true

And

iam:
    serviceAccountExternalPermissions:
      - name: accountname
        namespace: namespace
        aws:
          inlinePolicy: |-
            [
              {
                "Effect": "Allow",
                "Action": "sts:AssumeRole",
                "Resource": "arn:aws:iam::000000000:role/somerole"
              }
            ]

8. Please run the commands with most verbose logging by adding the -v 10 flag. Paste the logs into this report, or in a gist and provide the gist link here.

panic: Terraform resource names cannot start with a digit. This is a bug in Kops, please report this in a GitHub Issue. Name: 208.*.*.net

goroutine 1 [running]:
k8s.io/kops/upup/pkg/fi/cloudup/terraformWriter.tfSanitize(0xc0001339b0, 0x13, 0x57f4434, 0x1f)
    upup/pkg/fi/cloudup/terraformWriter/writer.go:59 +0x30c
k8s.io/kops/upup/pkg/fi/cloudup/terraformWriter.(*TerraformWriter).GetResourcesByType(0xc00057a300, 0xc001b74300, 0x1, 0xc000966810)
    upup/pkg/fi/cloudup/terraformWriter/writer.go:144 +0xee
k8s.io/kops/upup/pkg/fi/cloudup/terraform.(*TerraformTarget).finishHCL2(0xc00057a300, 0xc0010854a0, 0x0, 0x4e697c0)
    upup/pkg/fi/cloudup/terraform/target_hcl2.go:57 +0x592
k8s.io/kops/upup/pkg/fi/cloudup/terraform.(*TerraformTarget).Finish(0xc00057a300, 0xc0010854a0, 0xa, 0xc000133900)
    upup/pkg/fi/cloudup/terraform/target.go:88 +0x72
k8s.io/kops/upup/pkg/fi/cloudup.(*ApplyClusterCmd).Run(0xc00161bc38, 0x60a0670, 0xc0000521b8, 0x0, 0x0)
    upup/pkg/fi/cloudup/apply_cluster.go:806 +0x21c6
main.RunUpdateCluster(0x60a0670, 0xc0000521b8, 0xc000b9e000, 0x7ffeefbff0a4, 0x13, 0x604b3a0, 0xc00000e018, 0xc0002ed0e0, 0x5, 0x57a5c1f, ...)
    cmd/kops/update_cluster.go:286 +0x695
main.NewCmdUpdateCluster.func1(0xc0000dcc80, 0xc000067300, 0x1, 0x8)
    cmd/kops/update_cluster.go:113 +0x115
k8s.io/kops/vendor/github.com/spf13/cobra.(*Command).execute(0xc0000dcc80, 0xc000067280, 0x8, 0x8, 0xc0000dcc80, 0xc000067280)
    vendor/github.com/spf13/cobra/command.go:856 +0x2c2
k8s.io/kops/vendor/github.com/spf13/cobra.(*Command).ExecuteC(0x845fac0, 0x84af550, 0x0, 0x0)
    vendor/github.com/spf13/cobra/command.go:960 +0x375
k8s.io/kops/vendor/github.com/spf13/cobra.(*Command).Execute(...)
    vendor/github.com/spf13/cobra/command.go:897
main.Execute()
    cmd/kops/root.go:97 +0x8f
main.main()
    cmd/kops/main.go:24 +0x25

9. Anything else do we need to know? These are the only changes that have been made to cluster prior to the panic. "Name" is the name of the cluster.

rifelpet commented 3 years ago

Just to confirm, your cluster name begins with 208 ?

Was this working in kOps 1.20 or is this a brand new cluster in kOps 1.21? As the error suggests, terraform no longer allows resource names to begin with digits. Looking at some test outputs, the AWS resource types that use names that are just the cluster name are:

vladimirkus commented 3 years ago

@rifelpet, yes, cluster name begins with 208. Cluster was working on kOps 1.20 before, it's a pretty old cluster that is up from 1.17 Without adding IRSA configuration to cluster config kops update runs well, because of that I got the impression that this issue may be connected to the resources that participate in IRSA support.

vladimirkus commented 3 years ago

I see, so the solution may be to add prefixes to the resource names for the resource types that use names that are just the cluster name. In this exact case aws_iam_openid_connect_provider.

rifelpet commented 3 years ago

Yes, one option is we add a prefix to each terraform resource name that would otherwise begin with the cluster name. I opened a PR linked above that demonstrates the problem and will hopefully catch this in the future.

rifelpet commented 3 years ago

https://github.com/kubernetes/kops/pull/12202 is my attempt at fixing this. The third commit in that PR shows the exact terraform changes, and the necessary terraform state mvs are in the PR description