Open Buchhold opened 10 months ago
Thank you for opening your first issue in this project! Engagement like this is essential for open source projects! :hugs:
If you haven't done so already, check out Jupyter's Code of Conduct. Also, please try to follow the issue template as it helps other other community members to contribute more effectively.
You can meet the other Jovyans by joining our Discourse forum. There is also an intro thread there where you can stop by and say Hi! :wave:
Welcome to the Jupyter community! :tada:
I created a local solution. I am pretty new to Kubernetes and EG so I don't think this solution generalizes perfectly, but maybe you can point me in the right direction to making this a sensible pull request:
Steps I did:
[...]
volumeMounts:
- name: cri-socket
mountPath: !!str {{ .Values.kip.criSocket }} # see env KIP_CRI_ENDPOINT
readOnly: true
{{- range .Values.global.imagePullSecrets }}
- name: {{ printf "ips-%s" . }}
mountPath: {{printf "/mnt/secrets/%s" . }}
readOnly: true
{{- end }}
volumes:
- name: cri-socket
hostPath:
path: {{ .Values.kip.criSocket }}
{{- range .Values.global.imagePullSecrets }}
- name: {{printf "ips-%s" .}}
secret:
secretName: {{ . }}
optional: true
[...]
@@ -42,6 +44,15 @@ class KernelImagePuller:
self.policy = None
self.image_fetcher = image_fetcher
self.load_static_env_values()
+ self.load_image_pull_secrets()
+
+ def load_image_pull_secrets(self):
+ self.registry_to_auth = {}
+ # Todo: add support for more kinds of secrets
+ for f in glob.glob("/mnt/secrets/*/.dockerconfigjson"):
+ with open(f) as fh:
+ conf = json.load(fh)
+ self.registry_to_auth.update(conf.get("auths", {}))
def load_static_env_values(self):
"""Load the static environment values."""
@@ -245,6 +262,7 @@ class KernelImagePuller:
"""Downloads (pulls) the named image using the configured container runtime."""
result = True
absolute_image_name = self.get_absolute_image_name(image_name)
+ registry = absolute_image_name.split("/")[0]
t0 = time.time()
if self.container_runtime == KernelImagePuller.DOCKER_CLIENT:
try:
@@ -252,7 +270,18 @@ class KernelImagePuller:
except NotFound:
result = False
elif self.container_runtime == KernelImagePuller.CONTAINERD_CLIENT:
argv = ["crictl", "-r", self.runtime_endpoint, "pull", absolute_image_name]
+ if registry in self.registry_to_auth:
+ auth = self.registry_to_auth[registry]
+ if authstr := auth.get("auth"):
+ argv = [
+ "crictl",
+ "-r",
+ self.runtime_endpoint,
+ "pull",
+ "--auth",
+ authstr,
+ absolute_image_name,
+ ]
+ elif "username" in auth and "password" in auth:
+ argv = [
+ "crictl",
+ "-r",
+ self.runtime_endpoint,
+ "pull",
+ "--creds",
+ f'{auth["username"]}:{auth["password"]}',
+ absolute_image_name,
+ ]
result = self.execute_cmd(argv)
else: # invalid container runtime
logger.error(f"Invalid container runtime detected: '{self.container_runtime}'!")
(edited to add a fallback for username:password if no "auth" str is available)
Description
This started as a post asking for help in the community forum: https://discourse.jupyter.org/t/kip-containerd-and-imagepullsecrets/23342. It seems it might not be due to misconfiguration on my end, but a bug in EG.
Using a private repository and projects that require authentication, the KIP appears to not use the ImagePullSecret specified in the helm chart. Thus, pulling public images from the repo works, whereas private ones don't. The image puller logs show failure as:
[E 2024-01-11 09:53:42,617 kernel_image_puller.t1] Error executing crictl -r unix:///run/containerd/containerd.sock pull harbor.[...]: time="2024-01-11T09:53:42Z" level=fatal msg="pulling image: rpc error: code = Unknown desc = failed to pull and unpack image \"harbor[...]kernel-spark-python-custom:2024-01-09\": failed to resolve reference \"harbor.[...]kernel-spark-python-custom:2024-01-09\": pulling from host harbor[...] failed with status code [manifests 2024-01-09]: 401 Unauthorized"
This has a significant impact because pulling those images on kernel creation is also problematic. Unless sharing the namespace with EG, the secret resource is not available in the per-kernel namespaces that are created on launch.
Reproduce
What I configured:
This put me in a position where:
My custom kernelspec image is successfully pulled from a private project from our harbor resgistry. The secret harbor-registey is created in the enterprise-gatewaynamespace by helm. kernel-image-puller-sa is created in the enterprise-gateway namespace and has the imagePullSecret: harbor-registry The logs of the kip pods show that our of 3 custom kernels in my kernelspec, all three are detected and should be pulled. The two from public projects in the registry are pulled successfully (no secret needed) but the one form the private project isn’t: (Error executing crictl -r unix:///run/containerd/containerd.sock pull [...] failed with status code [manifests 2024-01-09]: 401 Unauthorized") It seems the secret + sa work fine to pull pods when creating kubernetes resources (like the kernelspecs image), but are not used when the KIP pods try to pull via the docker client.
Expected behavior
KIP successfully pulls all images, not only those in projects that don't require auth.
Context
EG image: current
elyra/enterprise-gateway:dev