This may be an issue with the example configuration from the upstream docs, or perhaps with the plugins themselves. Someone on the Users Slack reported that the upstream ECR and GCR plugins are essentially broken at the moment, and got it working by writing a shell script wrapper around amazon-ecr-credential-helper.
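For illustration, a wrapper along those lines might look roughly like the sketch below. It assumes the docker-credential-ecr-login binary from amazon-ecr-credential-helper and jq are on the PATH, and follows the v1alpha1 CredentialProviderRequest/Response format; this is a sketch of the approach described above, not the actual script from Slack.
#!/bin/bash
# Sketch: translate the kubelet's CredentialProviderRequest into the Docker
# credential helper "get" protocol used by docker-credential-ecr-login,
# then translate the result back into a CredentialProviderResponse.
set -euo pipefail

# The kubelet sends a CredentialProviderRequest on stdin; the image being
# pulled is in the .image field.
request=$(cat)
image=$(echo "$request" | jq -r '.image')
registry=${image%%/*}   # registry host is everything before the first "/"

# docker-credential-ecr-login reads the registry from stdin and prints
# {"ServerURL": ..., "Username": ..., "Secret": ...}
creds=$(echo "$registry" | docker-credential-ecr-login get)

# Emit a CredentialProviderResponse keyed by the registry host.
jq -n \
  --arg reg  "$registry" \
  --arg user "$(echo "$creds" | jq -r '.Username')" \
  --arg pass "$(echo "$creds" | jq -r '.Secret')" \
  '{
    kind: "CredentialProviderResponse",
    apiVersion: "credentialprovider.kubelet.k8s.io/v1alpha1",
    cacheKeyType: "Registry",
    cacheDuration: "5m",
    auth: { ($reg): { username: $user, password: $pass } }
  }'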
I see the correct behavior from the kubelet (plugin is used and auth provided) when using the following config and dummy plugin; I will need to work with @rancher-max to set up another test on ECR to figure out what's going on over there. I suspect the ECR plugin may not be functional yet.
kind: CredentialProviderConfig
apiVersion: kubelet.config.k8s.io/v1alpha1
providers:
  - name: test.sh
    matchImages:
      - "docker.io"
    defaultCacheDuration: "12h"
    apiVersion: credentialprovider.kubelet.k8s.io/v1alpha1
    args:
      - get-credentials
    env:
      - name: TEST
        value: "TEST"
#!/bin/bash
# Dummy plugin: log when it was called, its environment, and the
# CredentialProviderRequest the kubelet passes on stdin, then return a
# canned CredentialProviderResponse for docker.io.
date &>> /tmp/credential.log
env &>> /tmp/credential.log
jq . &>> /tmp/credential.log
echo '{
  "kind": "CredentialProviderResponse",
  "apiVersion": "credentialprovider.kubelet.k8s.io/v1alpha1",
  "cacheKeyType": "Image",
  "cacheDuration": "5s",
  "auth": {
    "docker.io": {
      "username": "myuser",
      "password": "mypass"
    }
  }
}'
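For context, wiring a plugin like this into k3s means passing the upstream kubelet flags through --kubelet-arg. The paths below are illustrative assumptions, not values taken from this issue; on 1.21 the alpha feature gate also has to be enabled.
# Assumed example: point the kubelet at the provider config and the directory
# containing the plugin binaries (paths are placeholders).
k3s server \
  --kubelet-arg=feature-gates=KubeletCredentialProviders=true \
  --kubelet-arg=image-credential-provider-config=/etc/rancher/k3s/credential-provider.yaml \
  --kubelet-arg=image-credential-provider-bin-dir=/opt/image-credential-providers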
I believe the failure @rancher-max and I saw was due to https://github.com/kubernetes/kubernetes/issues/102750
@brandond Yes, your RCA is correct; it's due to https://github.com/kubernetes/kubernetes/issues/102750
Moving this back into the backlog pending an upstream fix to the kubelet.
The fix for the upstream issue didn't make it into 1.22.0; it looks like it'll probably be in 1.22.1 and backported to the next 1.21 patch release.
@brandond , is there an update on this?
The issue I linked above was only fixed on master for 1.23. I've pinged the PR author a couple of times, both on GitHub and on Kubernetes Slack, but have not made any progress towards getting the fix backported to 1.22 or 1.21: https://github.com/kubernetes/kubernetes/pull/103231
Also, the GCR and ECR plugins are in middling states of usability, and the ACR one isn't due until December...
Upstream has declined to backport the fixes for this to 1.22, as it's an alpha feature. Kubelet credential provider plugins won't be functional until 1.23.0
@rancher-max Can we retest this on the next milestone?
@rancher-max and I identified another issue with upstream credential provider support. The credential provider plugin is only called to provide credentials for images that are explicitly referenced by pod specs; it is never called to pull the pause image. This means that the pause image must either be available anonymously, or credentials for pulling it must be provided via containerd's registries.yaml.
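As a point of reference, supplying pause-image credentials through registries.yaml would look roughly like the following. The registry name and region are placeholders, and a token obtained from aws ecr get-login-password expires after roughly 12 hours, so this is only a static workaround rather than a substitute for the credential provider.
# Rough sketch of the registries.yaml workaround for the pause image
# (registry name and region are placeholders; ECR tokens expire after ~12h).
REGISTRY="<account>.dkr.ecr.<region>.amazonaws.com"
TOKEN=$(aws ecr get-login-password --region <region>)

cat > /etc/rancher/k3s/registries.yaml <<EOF
configs:
  "${REGISTRY}":
    auth:
      username: AWS
      password: ${TOKEN}
EOF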
We could theoretically use the --pause-image flag to point k3s at a pause image that can be pulled anonymously, while pulling the rest of the images from the --system-default-registry, but there is a bug in the way these two flags interact that prevents that from working properly.
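That combination would look something like the following; the registry is a placeholder and the pause image name and tag are illustrative, and per the comment above the flag-interaction bug currently prevents this from working.
# Theoretical combination (currently blocked by the flag-interaction bug):
# pull the pause image anonymously, everything else from the private registry.
k3s server \
  --system-default-registry=<account>.dkr.ecr.<region>.amazonaws.com \
  --pause-image=docker.io/rancher/mirrored-pause:3.6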
Moved this to Stalled for now as there isn't much more for me to test, but feel free to change the status when it gets picked up (including if we end up making changes to upstream). Thank you again Brad for your help in figuring out what the issue was here!
@galal-hussein what is the status of this? Can we bump it to the next stage?
Upstream is still stalled on general support for this issue.
FWIW, it got GA'd in 1.26 https://kubernetes.io/blog/2022/12/22/kubelet-credential-providers/
@mkmik yes, but as far as I can tell they still haven't fixed the issue of the credential provider not being consulted for the pause image, so it's not really usable in environments where all images require credentials.
Would there be value in providing partial support for the feature?
Define "partial support". K3s has supported this (as in, it is in the code base and works) for a while. The functionality will not go away, I just don't find it very useful due to this limitation in how upstream has integrated it.
The feature did not graduate, and the feature gate has been removed starting with 1.28.
Originally posted by @rancher-max in https://github.com/k3s-io/k3s/issues/3280#issuecomment-843644882
I've validated standard airgap testing in v1.21.1-rc1+k3s1. This continues to work with tarball method, private registry in registries.yaml, and now also works with system-default-registry flag.
The image-credential-provider stuff on the kubelet is not working, even with the feature gate turned on. This appears to be an upstream issue, as using the same configurations with wharfie directly works. The error I'm seeing is a 401 Unauthorized when trying to pull the images. Using config file:
With that ecr-credential-provider-amd64 binary pulled from https://github.com/rancher/wharfie/releases/tag/v0.3.5. Bringing up k3s with flag --system-default-registry=<account>.dkr.ecr.<region>.amazonaws.com, where all the necessary k3s images are present in that registry.
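Since the config file itself isn't shown above, here is a rough idea of what a CredentialProviderConfig for the ECR plugin would contain; the file path, matchImages pattern, and cache duration are assumptions, not the exact values used in this test.
# Assumed shape of the kubelet CredentialProviderConfig used with the ECR
# plugin; the path and matchImages pattern are illustrative.
cat > /etc/rancher/k3s/credential-provider.yaml <<'EOF'
kind: CredentialProviderConfig
apiVersion: kubelet.config.k8s.io/v1alpha1
providers:
  - name: ecr-credential-provider-amd64
    matchImages:
      - "*.dkr.ecr.*.amazonaws.com"
    defaultCacheDuration: "12h"
    apiVersion: credentialprovider.kubelet.k8s.io/v1alpha1
EOF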