kubernetes / kops

Kubernetes Operations (kOps) - Production Grade k8s Installation, Upgrades and Management
https://kops.sigs.k8s.io/
Apache License 2.0

master, 1.19.0-alpha.1 terraform output launch_templates user-data not base64 encoded #9518

Closed fred-vogt closed 4 years ago

fred-vogt commented 4 years ago

KOPS ... update --target=terraform launch templates (using TF v0.12.28)

"InvalidUserData.Malformed: Invalid BASE64 encoding of user data"

TL;DR - the TF aws_launch_template user-data files (data/aws_launch_template_master-<zone>.{masters,nodes}.<cluster-name>_user_data) aren't base64 encoded anymore.

TF error:

Error: InvalidUserData.Malformed: Invalid BASE64 encoding of user data.
    status code: 400, request id: <uuid>

  on kubernetes.tf line NNN, in resource "aws_launch_template" "master-<zone>-masters-<cluster-name>":
 NNN: resource "aws_launch_template" "master-<zone>-masters-<cluster-name>" {

1. What kops version are you running? The command kops version, will display this information. v1.19.0-alpha.1

2. What Kubernetes version are you running? kubectl version will print the version if a cluster is running or provide the Kubernetes version specified as a kops flag. N/A.

3. What cloud provider are you using? AWS.

4. What commands did you run? What is the simplest way to reproduce this issue?

kops update cluster --name=... --state=... --out=. --target=terraform
terraform apply ...

5. What happened after the commands executed?

Error: InvalidUserData.Malformed: Invalid BASE64 encoding of user data.
    status code: 400, request id: ad50838e-bee3-4b5a-a032-a537bd84ba70

  on kubernetes.tf line ..., in resource "aws_launch_template" "master-<zone>-masters-<cluster-name>":
 479: resource "aws_launch_template" "master-<zone>-masters-<cluster-name>" {

6. What did you expect to happen? Previous kops versions (1.18-b2 ?) didn't have this issue when using KOPS_FEATURE_FLAGS=EnableLaunchTemplates.

7. Please provide your cluster manifest. Execute kops get --name my.example.com -o yaml to display your cluster manifest. You may want to remove your cluster name and other sensitive information. N/A.

8. Please run the commands with most verbose logging by adding the -v 10 flag. Paste the logs into this report, or in a gist and provide the gist link here. N/A.

9. Anything else we need to know? With previous working versions of kops (1.18-b1/b2 ?) the files in:

./data/aws_launch_template_master-<zone>.{masters,nodes}.<cluster-name>_user_data

were base64 encoded. With v1.19.0-alpha.1 the files are plain text.

fred-vogt commented 4 years ago

@rifelpet - Thanks for your efforts, my friend. This is the ONLY issue I've found so far with kops v1.19.0-alpha.1.

This is with kubernetes - v1.19.0-beta.2.

rifelpet commented 4 years ago

Hi @fred-vogt, thanks for the report. Can you paste the aws_launch_template resource definition? It should be using filebase64(), which base64 encodes the contents of the file, so the file on disk stays plaintext while the string passed to the user_data argument is base64 encoded.
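
For illustration, here is a minimal sketch of the difference between the two functions, assuming Terraform 0.12 or later. The resource name and file path are hypothetical placeholders, not actual kops output, and the resource is trimmed to the one argument that matters here.

```hcl
# Hypothetical, trimmed-down launch template; "example" and the
# data/example_user_data path are placeholders for illustration only.
resource "aws_launch_template" "example" {
  name_prefix = "example-"

  # file() returns the raw file contents. For aws_launch_template this is
  # what produces "InvalidUserData.Malformed: Invalid BASE64 encoding":
  # user_data = file("${path.module}/data/example_user_data")

  # filebase64() reads the same plain-text file and returns its contents
  # base64 encoded, which is what the user_data argument expects:
  user_data = filebase64("${path.module}/data/example_user_data")
}
```

With this form the file under data/ can stay human-readable, and only the value handed to the AWS API is encoded.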

fred-vogt commented 4 years ago

@rifelpet - Will do. Spoiler alert: kops update is not using filebase64. I love the approach of keeping the on-disk files "plain" and using filebase64(...) in the terraform output.

resource "aws_launch_template" "master-<zone>-masters-<cluster-fqdn>" {
  block_device_mappings {
    device_name = "/dev/sda1"
    ebs {
      delete_on_termination = true
      volume_size           = 100
      volume_type           = "gp2"
    }
  }
  block_device_mappings {
    device_name  = "/dev/sdc"
    virtual_name = "ephemeral0"
  }
  iam_instance_profile {
    name = "kops-master-<cluster>"
  }
  image_id      = "ami-09b23911f40ae1250"
  instance_type = "c5d.large"
  key_name      = aws_key_pair.kubernetes-<cluster-fqdn>-....id
  lifecycle {
    create_before_destroy = true
  }
  name_prefix = "master-<zone>.masters.<cluster-fqdn>.cicdenv.com-"
  network_interfaces {
    associate_public_ip_address = false
    delete_on_termination       = true
    security_groups             = [aws_security_group.masters-<cluster-fqdn>.id, "..."]
  }
  tag_specifications {
    resource_type = "instance"
    tags = {
      "KubernetesCluster"                                                       = "<cluster-fqdn>.cicdenv.com"
      "Name"                                                                    = "master-<zone>.masters.<cluster-fqdn>.cicdenv.com"
      "k8s.io/cluster-autoscaler/node-template/label/kops.k8s.io/instancegroup" = "master-<zone>"
      "k8s.io/role/master"                                                      = "1"
      "kops.k8s.io/instancegroup"                                               = "master-<zone>"
      "kubernetes.io/cluster/<cluster-fqdn>.cicdenv.com"                       = "owned"
    }
  }
  tag_specifications {
    resource_type = "volume"
    tags = {
      "KubernetesCluster"                                                       = "<cluster-fqdn>.cicdenv.com"
      "Name"                                                                    = "master-<zone>.masters.<cluster-fqdn>.cicdenv.com"
      "k8s.io/cluster-autoscaler/node-template/label/kops.k8s.io/instancegroup" = "master-<zone>"
      "k8s.io/role/master"                                                      = "1"
      "kops.k8s.io/instancegroup"                                               = "master-<zone>"
      "kubernetes.io/cluster/<cluster-fqdn>.cicdenv.com"                       = "owned"
    }
  }
  user_data = file("${path.module}/data/aws_launch_template_master-<zone>.masters.<cluster-fqdn>.cicdenv.com_user_data")
}

johngmyers commented 4 years ago

It appears that writeLiteral() in pkg/fi/cloudup/terraform/hcl2.go knows nothing of filebase64. The code in literal.LiteralFileExpression() only puts filebase64 in Value, which is only used for terraform 0.11.

rifelpet commented 4 years ago

Ah you're right, I'll try to get a fix in place soon.

The integration tests clearly use file() rather than filebase64(), which is incorrect:

https://github.com/kubernetes/kops/blob/c0a2b2d1e9be51cec6e923aba437ade252d65d05/tests/integration/update_cluster/complex/kubernetes.tf#L349

/assign