kubernetes / kops

Kubernetes Operations (kOps) - Production Grade k8s Installation, Upgrades and Management
https://kops.sigs.k8s.io/
Apache License 2.0

Unable to complete "Required Actions" on 1.18.3 with "error reading cluster configuration" #11699

Closed ironmike-au closed 2 years ago

ironmike-au commented 3 years ago

/kind bug

1. What kops version are you running? The command kops version will display this information. Version 1.18.3 (git-11ec695516)

2. What Kubernetes version are you running? kubectl version will print the version if a cluster is running or provide the Kubernetes version specified as a kops flag.

Client Version: version.Info{Major:"1", Minor:"19", GitVersion:"v1.19.3", GitCommit:"1e11e4a2108024935ecfcb2912226cedeafd99df", GitTreeState:"clean", BuildDate:"2020-10-14T12:50:19Z", GoVersion:"go1.15.2", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"17", GitVersion:"v1.17.12", GitCommit:"5ec472285121eb6c451e515bc0a7201413872fa3", GitTreeState:"clean", BuildDate:"2020-09-16T13:32:12Z", GoVersion:"go1.13.15", Compiler:"gc", Platform:"linux/amd64"}

3. What cloud provider are you using? AWS

4. What commands did you run? What is the simplest way to reproduce this issue?

export KOPS_STATE_STORE="s3://my-bucket-name"
KOPS_FEATURE_FLAGS=-Terraform-0.12 kops-1.18.3 update cluster --target terraform ...
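
One thing worth ruling out before running these commands is a malformed state store value. The Python sketch below checks only that the value parses as an `s3://` URL with a bucket name. `is_valid_s3_state_store` is a hypothetical helper, not part of kOps, and it does not check that the bucket exists or is reachable.

```python
from urllib.parse import urlparse


def is_valid_s3_state_store(value: str) -> bool:
    """Return True if value looks like s3://<bucket>[/<prefix>].

    Hypothetical helper for illustration; kOps performs its own parsing.
    """
    parsed = urlparse(value)
    # A well-formed S3 URL has scheme "s3" and the bucket name as the netloc.
    return parsed.scheme == "s3" and bool(parsed.netloc)


print(is_valid_s3_state_store("s3://my-bucket-name"))  # True
print(is_valid_s3_state_store("s3:/my-bucket-name"))   # False: single slash, no bucket parsed
```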

5. What happened after the commands executed? We receive an error:

❯ KOPS_FEATURE_FLAGS=-Terraform-0.12 kops-1.18.3 update cluster --target terraform ...

I0606 06:25:01.334187   33665 featureflag.go:154] FeatureFlag "Terraform-0.12"=false

error reading cluster configuration: error reading cluster configuration "../..": error reading s3://my-bucket-name/../../config: error fetching s3://my-bucket-name/../../config: BucketRegionError: incorrect region, the bucket is not in 'ap-southeast-2' region at endpoint ''
        status code: 301, request id: , host id:

6. What did you expect to happen? To be able to complete the required actions

7. Please provide your cluster manifest. Execute kops get --name my.example.com -o yaml to display your cluster manifest. You may want to remove your cluster name and other sensitive information.

apiVersion: kops.k8s.io/v1alpha2
kind: Cluster
metadata:
  creationTimestamp: "2019-11-07T03:19:32Z"
  generation: 25
  name: my-cluster-name
spec:
  additionalPolicies:
    master: |
      <policy omitted>
    node: |
      <policy omitted>
  api:
    loadBalancer:
      additionalSecurityGroups:
      - omitted
      idleTimeoutSeconds: 3600
      type: Internal
  authorization:
    rbac: {}
  channel: stable
  cloudProvider: aws
  configBase: s3://my-bucket-name/my-cluster-name
  etcdClusters:
  - cpuRequest: 200m
    etcdMembers:
    - encryptedVolume: true
      instanceGroup: master-ap-southeast-2a
      name: a
    - encryptedVolume: true
      instanceGroup: master-ap-southeast-2b
      name: b
    - encryptedVolume: true
      instanceGroup: master-ap-southeast-2c
      name: c
    memoryRequest: 100Mi
    name: main
  - cpuRequest: 100m
    etcdMembers:
    - encryptedVolume: true
      instanceGroup: master-ap-southeast-2a
      name: a
    - encryptedVolume: true
      instanceGroup: master-ap-southeast-2b
      name: b
    - encryptedVolume: true
      instanceGroup: master-ap-southeast-2c
      name: c
    memoryRequest: 100Mi
    name: events
  fileAssets:
  - <audit policy omitted>
  iam:
    allowContainerRegistry: true
    legacy: false
  kubeAPIServer:
    auditLogMaxAge: 1
    auditLogMaxBackups: 14
    auditLogMaxSize: 100
    auditLogPath: /var/log/kube-apiserver-audit.log
    auditPolicyFile: /srv/kubernetes/audit.yaml
    featureGates:
      TTLAfterFinished: "true"
  kubeControllerManager:
    featureGates:
      TTLAfterFinished: "true"
  kubelet:
    anonymousAuth: false
    featureGates:
      ExpandInUsePersistentVolumes: "true"
      RunAsGroup: "true"
      TTLAfterFinished: "true"
  kubernetesApiAccess:
  - omitted
  kubernetesVersion: 1.17.12
  masterInternalName: api.internal.my-cluster-name
  masterPublicName: api.my-cluster-name
  networkCIDR: 10.102.0.0/16
  networkID: omitted
  networking:
    amazonvpc: {}
  nonMasqueradeCIDR: 100.64.0.0/10
  sshAccess:
  - omitted
  subnets:
  - cidr: 10.102.48.0/20
    id: omitted
    name: omitted
    type: Private
    zone: ap-southeast-2a
  - cidr: 10.102.64.0/20
    id: omitted
    name: omitted
    type: Private
    zone: ap-southeast-2b
  - cidr: 10.102.32.0/22
    id: omitted
    name: ap-southeast-2a
    type: Private
    zone: ap-southeast-2a
  - cidr: 10.102.36.0/22
    id: omitted
    name: ap-southeast-2b
    type: Private
    zone: ap-southeast-2b
  - cidr: 10.102.40.0/22
    id: omitted
    name: ap-southeast-2c
    type: Private
    zone: ap-southeast-2c
  - cidr: 10.102.92.0/22
    id: omitted
    name: utility-ap-southeast-2a
    type: Utility
    zone: ap-southeast-2a
  - cidr: 10.102.96.0/22
    id: omitted
    name: utility-ap-southeast-2b
    type: Utility
    zone: ap-southeast-2b
  - cidr: 10.102.100.0/22
    id: omitted
    name: utility-ap-southeast-2c
    type: Utility
    zone: ap-southeast-2c
  topology:
    dns:
      type: Public
    masters: private
    nodes: private

---

apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  creationTimestamp: "2019-11-07T03:19:32Z"
  labels:
    kops.k8s.io/cluster: my-cluster-name
  name: master-ap-southeast-2a
spec:
  associatePublicIp: false
  image: kope.io/k8s-1.14-debian-stretch-amd64-hvm-ebs-2019-08-16
  machineType: t3.medium
  maxSize: 1
  minSize: 1
  nodeLabels:
    kops.k8s.io/instancegroup: master-ap-southeast-2a
  role: Master
  rootVolumeSize: 50
  subnets:
  - ap-southeast-2a

---

apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  creationTimestamp: "2019-11-07T03:19:32Z"
  labels:
    kops.k8s.io/cluster: my-cluster-name
  name: master-ap-southeast-2b
spec:
  associatePublicIp: false
  image: kope.io/k8s-1.14-debian-stretch-amd64-hvm-ebs-2019-08-16
  machineType: t3.medium
  maxSize: 1
  minSize: 1
  nodeLabels:
    kops.k8s.io/instancegroup: master-ap-southeast-2b
  role: Master
  rootVolumeSize: 50
  subnets:
  - ap-southeast-2b

---

apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  creationTimestamp: "2019-11-07T03:19:32Z"
  labels:
    kops.k8s.io/cluster: my-cluster-name
  name: master-ap-southeast-2c
spec:
  associatePublicIp: false
  image: kope.io/k8s-1.14-debian-stretch-amd64-hvm-ebs-2019-08-16
  machineType: t3.medium
  maxSize: 1
  minSize: 1
  nodeLabels:
    kops.k8s.io/instancegroup: master-ap-southeast-2c
  role: Master
  rootVolumeSize: 50
  subnets:
  - ap-southeast-2c

---

apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  creationTimestamp: "2019-11-07T03:19:32Z"
  generation: 13
  labels:
    kops.k8s.io/cluster: my-cluster-name
  name: nodes
spec:
  additionalSecurityGroups:
  - omitted
  associatePublicIp: false
  image: kope.io/k8s-1.14-debian-stretch-amd64-hvm-ebs-2019-08-16
  machineType: r5.xlarge
  maxSize: 12
  minSize: 12
  nodeLabels:
    kops.k8s.io/instancegroup: nodes
  role: Node
  rootVolumeSize: 200
  subnets:
  - omitted
  - omitted

8. Please run the commands with most verbose logging by adding the -v 10 flag. Paste the logs into this report, or in a gist and provide the gist link here.

❯ KOPS_FEATURE_FLAGS=-Terraform-0.12 kops-1.18.3 -v10 update cluster --target terraform ...

I0606 06:33:10.188139   35314 featureflag.go:154] FeatureFlag "Terraform-0.12"=false
I0606 06:33:10.214293   35314 factory.go:68] state store s3://my-bucket-name
I0606 06:33:10.735491   35314 s3context.go:213] found bucket in region "ap-southeast-2"
I0606 06:33:10.735514   35314 s3fs.go:290] Reading file "s3://my-bucket-name/../../config"

error reading cluster configuration: error reading cluster configuration "../..": error reading s3://my-bucket-name/../../config: error fetching s3://my-bucket-name/../../config: BucketRegionError: incorrect region, the bucket is not in 'ap-southeast-2' region at endpoint ''
        status code: 301, request id: , host id:
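
The log lines above suggest how the failing path is built: the cluster name appears to be appended to the state store URL, followed by `config`. The Python sketch below reproduces that composition under this assumption. `config_path` is a hypothetical helper, not a kOps function. With a cluster name of `../..`, which is the name quoted in the error message, it yields the same path kOps tried to read.

```python
def config_path(state_store: str, cluster_name: str) -> str:
    """Join state store and cluster name the way the log output suggests.

    Assumption for illustration only: <state store>/<cluster name>/config.
    """
    return f"{state_store.rstrip('/')}/{cluster_name}/config"


# Cluster name as quoted in the error message.
print(config_path("s3://my-bucket-name", "../.."))
# s3://my-bucket-name/../../config

# Cluster name from the manifest, which matches configBase plus "/config".
print(config_path("s3://my-bucket-name", "my-cluster-name"))
# s3://my-bucket-name/my-cluster-name/config
```

If this reading is right, the thing to check is which argument kOps took as the cluster name, since `../..` is not the name in the manifest.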

9. Anything else we need to know? Initially this command was run through an SSO login and I thought that might be a problem, so I started fresh with a direct IAM account configured simply using aws configure, but this didn't make any difference.

I've confirmed that the cluster is healthy and normal kops commands run fine (such as kops-1.18.3 validate cluster)

I've also confirmed that my account can perform the get-bucket-location command, though I'm not sure how relevant that is.

aws s3api get-bucket-location --bucket=my-bucket-name
{
    "LocationConstraint": "ap-southeast-2"
}
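
To compare this output with the region named in the error, the `LocationConstraint` value can be normalized first, because S3 returns `null` for buckets in `us-east-1` and the legacy value `EU` for `eu-west-1`. The Python sketch below does that. `bucket_region` is a hypothetical helper name, and the input is the JSON shown above.

```python
import json


def bucket_region(get_bucket_location_output: str) -> str:
    """Return the region implied by `aws s3api get-bucket-location` JSON output."""
    constraint = json.loads(get_bucket_location_output).get("LocationConstraint")
    if constraint is None:
        return "us-east-1"  # S3 reports null for buckets in us-east-1
    if constraint == "EU":
        return "eu-west-1"  # legacy alias still returned for old buckets
    return constraint


output = '{"LocationConstraint": "ap-southeast-2"}'
print(bucket_region(output) == "ap-southeast-2")  # True: matches the region in the error
```

Here the bucket region matches the region the client used, so the bucket's actual location does not explain the `BucketRegionError`.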

The bucket is in the same region and AWS account as everything else, so there should be no cross-account permissions issues.

Thanks in advance for any help you can offer!

olemarkus commented 3 years ago

This bug is filed against an unsupported kOps version. At minimum upgrade to 1.20 and see if the behavior persists.

ironmike-au commented 3 years ago

Is it safe to upgrade to kops 1.20 without completing the upgrade instructions for 1.18?

k8s-triage-robot commented 3 years ago

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

You can:

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot commented 3 years ago

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

You can:

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

k8s-triage-robot commented 2 years ago

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

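This bot triages issues and PRs according to the following rules:

- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

- Reopen this issue or PR with /reopen
- Mark this issue or PR as fresh with /remove-lifecycle rotten
- Offer to help out with Issue Triage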

Please send feedback to sig-contributor-experience at kubernetes/community.

/close

k8s-ci-robot commented 2 years ago

@k8s-triage-robot: Closing this issue.

In response to [this](https://github.com/kubernetes/kops/issues/11699#issuecomment-962737500):

>The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
>
>This bot triages issues and PRs according to the following rules:
>- After 90d of inactivity, `lifecycle/stale` is applied
>- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
>- After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed
>
>You can:
>- Reopen this issue or PR with `/reopen`
>- Mark this issue or PR as fresh with `/remove-lifecycle rotten`
>- Offer to help out with [Issue Triage][1]
>
>Please send feedback to sig-contributor-experience at [kubernetes/community](https://github.com/kubernetes/community).
>
>/close
>
>[1]: https://www.kubernetes.dev/docs/guide/issue-triage/

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.