kubernetes / kops

Kubernetes Operations (kOps) - Production Grade k8s Installation, Upgrades and Management
https://kops.sigs.k8s.io/
Apache License 2.0

KopsControllerConfig: NoSuchEntity Instance Profile #11072

Closed dgamanenko closed 3 years ago

dgamanenko commented 3 years ago
$ kops version
Version 1.19.1 (git-8589b4d157a9cb05c54e320c77b0724c4dd094b2)
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"19", GitVersion:"v1.19.1", GitCommit:"206bcadf021e76c27513500ca24182692aabd17e", GitTreeState:"clean", BuildDate:"2020-09-09T11:26:42Z", GoVersion:"go1.15", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"19", GitVersion:"v1.19.1", GitCommit:"206bcadf021e76c27513500ca24182692aabd17e", GitTreeState:"clean", BuildDate:"2020-09-09T11:18:22Z", GoVersion:"go1.15", Compiler:"gc", Platform:"linux/amd64"}

cloud provider: AWS

command: kops update cluster

When trying to create an instance group with its own custom IAM instance profile, the following problem occurs:

kops create ig  --state "${KOPS_STATE_STORE}" --name "${KOPS_CLUSTER_NAME}" test001
apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  labels:
    kops.k8s.io/cluster: myscluster.com
  name: test001
spec:
  cloudLabels:
    k8s.io/cluster-autoscaler/enabled: ""
    k8s.io/cluster-autoscaler/myscluster.com: ""
    k8s.io/cluster-autoscaler/node-template/label: ""
    kubernetes.io/cluster/myscluster.com: owned
  iam:
    profile: arn:aws:iam::<aws-account-id>:instance-profile/k8s-test001
  image: <aws-account-id>/ubuntu/images/hvm-ssd/ubuntu-focal-20.04-amd64-server-20210119.1
  machineType: m5.xlarge
  maxSize: 1
  minSize: 1
  nodeLabels:
    kops.k8s.io/instancegroup: test001
    lifecycle: normal
  role: Node
  rootVolumeSize: 50
  rootVolumeType: gp2
  subnets:
  - us-west-2a-2
  - us-west-2b-2
  - us-west-2c-2
  taints:
  - dedicated=test001:NoSchedule
$ kops update cluster \
    --out=. \
    --target=terraform \
    --name "${KOPS_CLUSTER_NAME}" \
    --state "${KOPS_STATE_STORE}"

*********************************************************************************

A new kubernetes version is available: 1.19.8
Upgrading is recommended (try kops upgrade cluster)

More information: https://github.com/kubernetes/kops/blob/master/permalinks/upgrade_k8s.md#1.19.8

*********************************************************************************

error building tasks: error reading manifest addons/kops-controller.addons.k8s.io/k8s-1.16.yaml: error opening resource: error executing resource template "addons/kops-controller.addons.k8s.io/k8s-1.16.yaml": error executing template "addons/kops-controller.addons.k8s.io/k8s-1.16.yaml": template: addons/kops-controller.addons.k8s.io/k8s-1.16.yaml:10:7: executing "addons/kops-controller.addons.k8s.io/k8s-1.16.yaml" at <KopsControllerConfig>: error calling KopsControllerConfig: getting role from profile k8s-test001: NoSuchEntity: Instance Profile k8s-test001 cannot be found.
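One way to avoid this error is to create the role and instance profile in AWS before running kops update, for example with the AWS CLI. A minimal sketch, assuming the names from this issue and a local file ec2-trust.json (hypothetical name) containing the standard EC2 trust policy:

$ aws iam create-role \
    --role-name k8s-test001 \
    --assume-role-policy-document file://ec2-trust.json
$ aws iam create-instance-profile \
    --instance-profile-name k8s-test001
$ aws iam add-role-to-instance-profile \
    --instance-profile-name k8s-test001 \
    --role-name k8s-test001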

After adding a stub instance profile manually (directly in kubernetes.tf) and applying it, the update step succeeded:

resource "aws_iam_instance_profile" "k8s-test001" {
  name = "k8s-test001"
}
terraform apply --target=aws_iam_instance_profile.k8s-test001

kops then successfully generated the required resources:

resource "aws_iam_instance_profile" "k8s-test001" {
  name = "k8s-test001"
  role = aws_iam_role.k8s-test001.name
}

resource "aws_iam_role_policy" "k8s-test001" {
  name   = "k8s-test001"
  policy = file("${path.module}/data/aws_iam_role_policy_k8s-test001_policy")
  role   = aws_iam_role.k8s-test001.name
}

resource "aws_iam_role" "k8s-test001" {
  assume_role_policy = file("${path.module}/data/aws_iam_role_k8s-test001_policy")
  name               = "k8s-test001"
}
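For reference, the assume_role_policy file referenced above (data/aws_iam_role_k8s-test001_policy) would normally contain the standard EC2 trust policy, along these lines:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "Service": "ec2.amazonaws.com" },
      "Action": "sts:AssumeRole"
    }
  ]
}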

After that, nodes in the new instance group could not join the cluster:

kops validate cluster  --state "${KOPS_STATE_STORE}" --name "${KOPS_CLUSTER_NAME}"
VALIDATION ERRORS
KIND    NAME            MESSAGE
Machine i-12345678901234567 machine "i-12345678901234567" has not yet joined cluster
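At this point the kops-controller logs usually show why the node is being refused, since kops-controller validates the instance's IAM role during bootstrap. One way to check, assuming kops-controller runs as a DaemonSet in kube-system (as the pod listing further down suggests):

kubectl logs -n kube-system daemonset/kops-controller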

The workaround was to manually edit the kops-controller configuration (adding k8s-test001 to the config and its annotations) and to recreate the kops-controller pods:

kubectl edit configmaps -n kube-system kops-controller
apiVersion: v1
data:
  config.yaml: |
    {"cloud":"aws","configBase":"s3://myscluster/myscluster.com","server":{"Listen":":3988","provider":{"aws":{"nodesRoles":["k8s-test001",...
kind: ConfigMap
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"v1","data":{"config.yaml":"{\"cloud\":\"aws\",\"configBase\":\"s3://myscluster/myscluster.com\",\"server\":{\"Listen\":\":3988\",\"provider\":{\"aws\":{\"nodesRoles\":[\"k8s-test001\",...
  creationTimestamp: "2020-04-13T11:55:59Z"
  labels:
    k8s-addon: kops-controller.addons.k8s.io
  name: kops-controller
  namespace: kube-system
...
$ kubectl get pods -n kube-system | grep kops
kops-controller-bbgpr                                                 1/1     Running   0          26h
kops-controller-j5mkm                                                 1/1     Running   0          26h
kops-controller-qhnzz                                                 1/1     Running   0          26h
kubectl delete pods -n kube-system kops-controller-bbgpr kops-controller-j5mkm kops-controller-qhnzz
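Equivalently, assuming kops-controller is a DaemonSet (as above), a rolling restart avoids listing the pods by name:

kubectl rollout restart -n kube-system daemonset/kops-controller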

All of these steps are required as of kOps v1.19.

h3poteto commented 3 years ago

Refs: https://github.com/kubernetes/kops/pull/10728

The IAM instance profile must now exist in AWS before running the kops update command whenever a user specifies a custom profile, because kops looks up the associated IAM role from the specified profile during the update tasks.
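The failing lookup appears to correspond to the IAM GetInstanceProfile API call, which can also be used to verify the profile exists before running kops update:

$ aws iam get-instance-profile --instance-profile-name k8s-test001

If the profile is missing, this returns the same NoSuchEntity error that kops surfaces.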

With the Terraform target in mind, should the process of resolving the IAM role be executed in kops-controller instead? (refs: https://github.com/kubernetes/kops/pull/10728#issuecomment-773585055) What do you think about this? @rifelpet @johngmyers

johngmyers commented 3 years ago

If a custom profile is specified in the IG spec, don't that profile and role need to have been previously created external to kops? What is the ownership model here?

Does this work with direct render? Perhaps this is a case of kops having to cache information rather than expecting it to have been rendered?

h3poteto commented 3 years ago

If a custom profile is specified in the IG spec, don't that profile and role need to have been previously created external to kops?

Correct: at the moment, the profile and role must be created beforehand.

Does this work with direct render?

I think it has the same problem even with direct render. However, when using direct render we usually create the IAM role and instance profile before running the kops command, so it was not a problem in practice.

k8s-triage-robot commented 3 years ago

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

- Mark this issue or PR as fresh with /remove-lifecycle stale
- Mark this issue or PR as rotten with /lifecycle rotten
- Close this issue or PR with /close
- Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot commented 3 years ago

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

- Mark this issue or PR as fresh with /remove-lifecycle rotten
- Close this issue or PR with /close
- Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

k8s-triage-robot commented 3 years ago

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

- Reopen this issue or PR with /reopen
- Mark this issue or PR as fresh with /remove-lifecycle rotten
- Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close

k8s-ci-robot commented 3 years ago

@k8s-triage-robot: Closing this issue.

In response to [this](https://github.com/kubernetes/kops/issues/11072#issuecomment-946877847):

> The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
>
> This bot triages issues and PRs according to the following rules:
>
> - After 90d of inactivity, `lifecycle/stale` is applied
> - After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
> - After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed
>
> You can:
>
> - Reopen this issue or PR with `/reopen`
> - Mark this issue or PR as fresh with `/remove-lifecycle rotten`
> - Offer to help out with [Issue Triage][1]
>
> Please send feedback to sig-contributor-experience at [kubernetes/community](https://github.com/kubernetes/community).
>
> /close
>
> [1]: https://www.kubernetes.dev/docs/guide/issue-triage/

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.