Open vitaliyf opened 23 hours ago
Workaround: use awsudo or other workarounds from https://kops.sigs.k8s.io/mfa/#the-workaround-2
$ awsudo company-name-dev3 kops_v1.30.1 update cluster
...
+ NODEUP_URL_AMD64=https://artifacts.k8s.io/binaries/kops/1.30.1/linux/amd64/nodeup,https://github.com/kubernetes/kops/releases/download/v1.30.1/nodeup-linux-amd64
- NODEUP_URL_AMD64=https://artifacts.k8s.io/binaries/kops/1.29.2/linux/amd64/nodeup,https://github.com/kubernetes/kops/releases/download/v1.29.2/nodeup-linux-amd64
...more as-expected output..
Must specify --yes to apply changes
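For readers without awsudo, the same idea (resolve the profile's credentials outside of kops, then hand kops plain environment credentials) can be sketched with the AWS CLI. This is a minimal sketch, not a confirmed fix: `with_exported_creds` is a hypothetical helper name, and it assumes AWS CLI v2 with the `aws configure export-credentials` subcommand available. Because the profile is unset for the child process, a region that was only defined in the profile may need to be supplied separately (for example through `AWS_REGION`).

```shell
# Hypothetical helper (not part of kops or the AWS CLI):
# run a command with static credentials resolved from a named profile.
with_exported_creds() {
  local profile="$1"
  shift
  (
    # The subshell keeps the exported secrets out of the calling shell.
    local creds
    # Assumption: AWS CLI v2 provides `aws configure export-credentials`.
    creds="$(aws configure export-credentials --profile "$profile" --format env)" || exit 1
    eval "$creds"
    # Drop the profile so the child process only sees the static credentials
    # and has no role_arn to assume a second time.
    unset AWS_PROFILE AWS_DEFAULT_PROFILE
    "$@"
  )
}
```

Usage would look like `with_exported_creds company-name-dev3 kops_v1.30.1 update cluster`.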
/kind bug
1. What kops version are you running? The command kops version will display this information.
Testing upgrade from Client version: 1.29.2 (git-v1.29.2) to Client version: 1.30.1 (git-v1.30.1)
2. What Kubernetes version are you running? kubectl version will print the version if a cluster is running or provide the Kubernetes version specified as a kops flag.
v1.29.9
3. What cloud provider are you using?
AWS
4. What commands did you run? What is the simplest way to reproduce this issue?
kops_v1.30.1 update cluster
- no other changes to manifest or environment, only executing newer kops binary.
5. What happened after the commands executed?
$ export AWS_PROFILE=company-name-dev3
$ kops_v1.30.1 update cluster
SDK 2024/09/20 14:31:06 DEBUG request failed with unretryable error https response error StatusCode: 403, RequestID: 623bd87e-11e1-4b06-9f16-10f60ba2f030, api error AccessDenied: User: arn:aws:sts::[redacted]006:assumed-role/OrganizationAccountAccessRole/aws-go-sdk-1726842666098977639 is not authorized to perform: sts:AssumeRole on resource: arn:aws:iam::[redacted]006:role/OrganizationAccountAccessRole
Error: error determining default DNS zone: error querying zones: error listing hosted zones: operation error Route 53: ListHostedZones, get identity: get credentials: failed to refresh cached credentials, operation error STS: AssumeRole, https response error StatusCode: 403, RequestID: 623bd87e-11e1-4b06-9f16-10f60ba2f030, api error AccessDenied: User: arn:aws:sts::[redacted]006:assumed-role/OrganizationAccountAccessRole/aws-go-sdk-1726842666098977639 is not authorized to perform: sts:AssumeRole on resource: arn:aws:iam::[redacted]006:role/OrganizationAccountAccessRole
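In the AccessDenied message, the caller is already a session of OrganizationAccountAccessRole and the resource it is denied on is that same role in the same account. One possible reading, not confirmed here, is that the role is being assumed a second time from credentials that were already assumed. The sketch below only checks for that pattern in a caller/target ARN pair; `role_name` and `is_self_assume` are hypothetical helper names, and the check is a reading aid rather than a diagnosis of the root cause.

```shell
# Print the role name from an STS assumed-role ARN or an IAM role ARN.
role_name() {
  case "$1" in
    arn:aws:sts::*:assumed-role/*) printf '%s\n' "$1" | cut -d/ -f2 ;;
    arn:aws:iam::*:role/*)         printf '%s\n' "${1##*/}" ;;
    *) return 1 ;;
  esac
}

# Succeed when the caller is a session of the very role it tries to assume,
# in the same account (the fifth colon-separated ARN field).
is_self_assume() {
  local caller_role target_role caller_acct target_acct
  caller_role="$(role_name "$1")" || return 1
  target_role="$(role_name "$2")" || return 1
  caller_acct="$(printf '%s\n' "$1" | cut -d: -f5)"
  target_acct="$(printf '%s\n' "$2" | cut -d: -f5)"
  [ "$caller_role" = "$target_role" ] && [ "$caller_acct" = "$target_acct" ]
}
```

Called with the two ARNs from the log above (the `User:` ARN first, the `resource:` ARN second), `is_self_assume` succeeds, which matches the self-assumption pattern.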
6. What did you expect to happen?
With kops-1.29.2 the output shows proposed changes that need to be applied with --yes.
AWS CLI is able to successfully get Route53 zones from the same shell:
7. Please provide your cluster manifest. Execute kops get --name my.example.com -o yaml to display your cluster manifest. You may want to remove your cluster name and other sensitive information.
8. Please run the commands with most verbose logging by adding the -v 10 flag. Paste the logs into this report, or in a gist and provide the gist link here.
https://gist.github.com/vitaliyf/cfddd9ad771ee613ee850bb9e2d3fe14
9. Anything else do we need to know?
This cluster has been continuously upgraded one kops/kubernetes version at a time for at least a couple years, so it is pretty routine for us to test and execute such upgrades in-place.
I looked around and I suspect this is related to the aws-sdk-go-v2 upgrade.
For example, they have this issue: https://github.com/aws/aws-sdk-go-v2/issues/2686 - and coincidentally or not, that ticket is referenced by https://github.com/cert-manager/cert-manager/pull/7236 where they are also dealing with "Missing Region" error just like https://github.com/kubernetes/kops/issues/16645 from kops-1.30.0