kubernetes / kops

Kubernetes Operations (kOps) - Production Grade k8s Installation, Upgrades and Management
https://kops.sigs.k8s.io/
Apache License 2.0

S3 file asset repository CLI unable to read file #16759

Open elliotdobson opened 3 months ago

elliotdobson commented 3 months ago

/kind bug

1. What kops version are you running? The command kops version, will display this information. Client version: 1.29.2 (git-v1.29.2)

2. What Kubernetes version are you running? kubectl version will print the version if a cluster is running or provide the Kubernetes version specified as a kops flag. Server Version: v1.29.7

3. What cloud provider are you using? AWS

4. What commands did you run? What is the simplest way to reproduce this issue? We are configuring a local file asset repository; however, we are running into an issue when trying to update the cluster.

We have configured an AWS S3 bucket for the file assets to be stored. The S3 bucket is private and has a bucket policy to allow GetObject requests from a VPC Gateway Endpoint that is in the same VPC as the k8s cluster (as vaguely suggested by the docs).
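For readers unfamiliar with this setup, here is a minimal sketch of what such a bucket policy might look like, built as a Python dict and printed as JSON. The actual policy was not included in the report, so the statement layout, the `Sid`, and the endpoint ID `vpce-0123456789abcdef0` are assumptions for illustration only; `aws:SourceVpce` is the standard AWS condition key for restricting access to a VPC endpoint.

```python
import json


def vpce_read_policy(bucket: str, vpce_id: str) -> dict:
    """Build a bucket policy that allows s3:GetObject only through one VPC endpoint."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowGetObjectFromVpce",  # hypothetical statement ID
                "Effect": "Allow",
                "Principal": "*",
                "Action": "s3:GetObject",
                "Resource": f"arn:aws:s3:::{bucket}/*",
                # Only requests arriving via this endpoint match the statement.
                "Condition": {"StringEquals": {"aws:SourceVpce": vpce_id}},
            }
        ],
    }


# The endpoint ID below is a placeholder, not a value from the report.
policy = vpce_read_policy("example-k8s-assets", "vpce-0123456789abcdef0")
print(json.dumps(policy, indent=2))
```

Under a policy like this, an anonymous HTTPS request from outside the VPC does not match the statement and is denied.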

  1. Enable fileRepository in the Cluster spec.
  2. Copy the file assets: kops get assets --copy
  3. Update the cluster: kops update cluster

5. What happened after the commands executed?

Error: you might have not staged your files correctly, please execute 'kops get assets --copy'

With verbose logging it shows:

I0819 11:21:38.966967   90184 builder.go:260] adding remapped file: "https://example-k8s-assets.s3.ap-southeast-2.amazonaws.com/kops/release/v1.29.7/bin/linux/amd64/kubelet"
I0819 11:21:38.967029   90184 builder.go:342] Trying to read hash fie: "https://example-k8s-assets.s3.ap-southeast-2.amazonaws.com/kops/release/v1.29.7/bin/linux/amd64/kubelet.sha256"
I0819 11:21:38.967046   90184 context.go:243] Performing HTTP request: GET https://example-k8s-assets.s3.ap-southeast-2.amazonaws.com/kops/release/v1.29.7/bin/linux/amd64/kubelet.sha256
I0819 11:21:39.106328   90184 builder.go:346] Unable to read hash file "https://example-k8s-assets.s3.ap-southeast-2.amazonaws.com/kops/release/v1.29.7/bin/linux/amd64/kubelet.sha256": unexpected response code "403 Forbidden" for "https://example-k8s-assets.s3.ap-southeast-2.amazonaws.com/kops/release/v1.29.7/bin/linux/amd64/kubelet.sha256": <?xml version="1.0" encoding="UTF-8"?>
<Error><Code>AccessDenied</Code><Message>Access Denied</Message><RequestId>YY7WWWCZC494R0QJ</RequestId><HostId>HiiCNVsfHRNPM/NNOfZf9v67+BTB9REAIEsK4+vW8sS/tWpdgQcuqF1xRmTC47C1H3WOdOTSN7M=</HostId></Error>
I0819 11:21:39.106407   90184 builder.go:361] Unable to read new sha256 hash file (is this an older/unsupported kubernetes release?)
Error: you might have not staged your files correctly, please execute 'kops get assets --copy'
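The 403 response body in the log above is a standard S3 XML error document. As a small diagnostic aid (not part of kOps), the sketch below pulls the fields out of such a body with Python's standard library. The function name `parse_s3_error` is hypothetical, and the sample body is copied from the log.

```python
import xml.etree.ElementTree as ET

# Response body copied from the verbose log in this report.
SAMPLE_BODY = """<?xml version="1.0" encoding="UTF-8"?>
<Error><Code>AccessDenied</Code><Message>Access Denied</Message><RequestId>YY7WWWCZC494R0QJ</RequestId><HostId>HiiCNVsfHRNPM/NNOfZf9v67+BTB9REAIEsK4+vW8sS/tWpdgQcuqF1xRmTC47C1H3WOdOTSN7M=</HostId></Error>"""


def parse_s3_error(body: str) -> dict:
    """Return the child elements of an S3 <Error> document as a dict."""
    # Encode to bytes so the parser honours the XML encoding declaration.
    root = ET.fromstring(body.encode("utf-8"))
    return {child.tag: (child.text or "") for child in root}


err = parse_s3_error(SAMPLE_BODY)
print(err["Code"], "-", err["Message"])
```

The `AccessDenied` code shows that S3 rejected the request on authorization grounds, so the object being absent is not what the response reports.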

6. What did you expect to happen? We expected kops update cluster to use S3-aware parsing, like kops get assets --copy, and to read the file assets with authenticated requests.

The error is not that surprising since:

  1. the S3 bucket is private.
  2. kOps is using HTTPS URLs to read the objects (so no authentication is passed).
  3. we are running kops from our laptop which is outside the VPC that has access to the file assets S3 bucket.

However, since kops get assets --copy worked and the file assets were successfully uploaded to the S3 bucket, the failure in kops update cluster was unexpected.

This makes me think that kOps handles the file asset URLs differently between the two commands. kops get assets --copy uses S3-aware parsing and adds authentication to upload the assets, whereas kops update cluster just makes an unauthenticated HTTP request.

7. Please provide your cluster manifest. Execute kops get --name my.example.com -o yaml to display your cluster manifest. You may want to remove your cluster name and other sensitive information.

apiVersion: kops.k8s.io/v1alpha2
kind: Cluster
spec:
...
  assets:
    fileRepository: https://example-k8s-assets.s3.ap-southeast-2.amazonaws.com/kops
...
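To make the URLs in the verbose log easier to follow, here is a rough sketch of how the remapped asset URL and its hash file URL appear to be derived from fileRepository. This is inferred from the log lines only and is not taken from the kOps source; the upstream URL on dl.k8s.io and the function name `remap_to_file_repository` are assumptions.

```python
from urllib.parse import urlsplit


def remap_to_file_repository(upstream_url: str, file_repository: str) -> str:
    """Re-root the path of an upstream asset URL under the file repository."""
    path = urlsplit(upstream_url).path
    return file_repository.rstrip("/") + path


FILE_REPOSITORY = "https://example-k8s-assets.s3.ap-southeast-2.amazonaws.com/kops"
# Assumed upstream location of the kubelet binary (not shown in the report).
UPSTREAM = "https://dl.k8s.io/release/v1.29.7/bin/linux/amd64/kubelet"

asset_url = remap_to_file_repository(UPSTREAM, FILE_REPOSITORY)
hash_url = asset_url + ".sha256"  # the hash file requested in the log
print(asset_url)
print(hash_url)
```

Both printed URLs are plain HTTPS URLs, which is consistent with the unauthenticated GET seen in the log.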

8. Please run the commands with most verbose logging by adding the -v 10 flag. Paste the logs into this report, or in a gist and provide the gist link here.

9. Anything else we need to know?

  1. Is it possible to work around this by using --lifecycle-overrides?
  2. Can kops update cluster use the same S3 awareness as kops get assets --copy?

elliotdobson commented 3 months ago

This looks similar to kubernetes/kops#15104, but unfortunately there is no information on how that issue was resolved.

elliotdobson commented 3 months ago

It looks like kops get assets --copy has a helper function that translates HTTPS URLs into S3 URLs, which explains the difference in behaviour from kops update cluster.

https://github.com/kubernetes/kops/blob/5d4d867086b7aec87c89cce06ce81fa8b914e54f/pkg/assets/copyfile.go#L179-L220
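As an illustration of the kind of translation involved, here is a minimal Python sketch that maps a virtual-hosted-style S3 HTTPS URL to an s3:// URL. It is not a port of the linked Go helper, which may handle more URL forms. The function name `https_to_s3` and the regular expression are assumptions, and only regional hostnames of the form bucket.s3.region.amazonaws.com are recognised.

```python
import re
from typing import Optional
from urllib.parse import urlsplit

# Virtual-hosted-style regional endpoint: <bucket>.s3.<region>.amazonaws.com
_VHOST = re.compile(r"(?P<bucket>.+)\.s3[.-](?P<region>[a-z0-9-]+)\.amazonaws\.com")


def https_to_s3(url: str) -> Optional[str]:
    """Return the s3:// form of an S3 HTTPS URL, or None if it is not recognised."""
    parts = urlsplit(url)
    if parts.scheme != "https" or parts.hostname is None:
        return None
    match = _VHOST.fullmatch(parts.hostname)
    if match is None:
        return None
    return f"s3://{match.group('bucket')}{parts.path}"


print(https_to_s3(
    "https://example-k8s-assets.s3.ap-southeast-2.amazonaws.com"
    "/kops/release/v1.29.7/bin/linux/amd64/kubelet.sha256"
))
```

An s3:// URL can then be read through an authenticated S3 client using the caller's AWS credentials, which an anonymous HTTPS GET cannot do.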

k8s-triage-robot commented 1 week ago

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

You can:

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale