Small note: as this is preliminary work, I have not yet written the documentation for it.
Following this month's JupyterHub meeting, @manics and I continued a bit on the podman front, discussing the podman engine for repo2docker as well as the content of this patch. One thing I was not aware of is that besides podman being a drop-in replacement for docker in terms of command line interface, it now also implements a docker-compatible API when run as a service. Therefore I ran a BinderHub deployment test with the pink daemonset enabled, the standard Build class, and the image cleaner configured to watch podman's storage folder.
All these together worked quite nicely.
This means that the environment variable addition is not strictly required for the goal of using podman in place of docker in the pod building the images. However, it does allow people to choose whether they want to go full podman, full docker, or a mix of both, as the latter would be the default.
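For concreteness, the test setup described above corresponds roughly to the following values (a sketch; the names come from the templates and discussion in this PR):

pink:
  enabled: true
imageCleaner:
  host:
    # watch podman's socket and storage folder instead of the node's docker
    dockerSocket: /var/run/pink/docker.sock
    dockerLibDir: /var/lib/pink/storage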
The diff between the dind and pink K8s templates is fairly small
diff --color -Nur helm-chart/binderhub/templates/dind/daemonset.yaml helm-chart/binderhub/templates/pink/daemonset.yaml
--- helm-chart/binderhub/templates/dind/daemonset.yaml 2021-09-18 20:57:59.971717855 +0100
+++ helm-chart/binderhub/templates/pink/daemonset.yaml 2022-09-23 13:29:38.484322020 +0100
@@ -1,20 +1,20 @@
-{{- if .Values.dind.enabled -}}
+{{- if .Values.pink.enabled -}}
apiVersion: apps/v1
kind: DaemonSet
metadata:
- name: {{ .Release.Name }}-dind
+ name: {{ .Release.Name }}-pink
spec:
updateStrategy:
type: RollingUpdate
selector:
matchLabels:
- name: {{ .Release.Name }}-dind
+ name: {{ .Release.Name }}-pink
template:
metadata:
labels:
- name: {{ .Release.Name }}-dind
+ name: {{ .Release.Name }}-pink
app: binder
- component: dind
+ component: dind # Lying to build so current affinity rules can work
release: {{ .Release.Name }}
heritage: {{ .Release.Service }}
spec:
@@ -28,47 +28,59 @@
operator: Equal
value: user
nodeSelector: {{ .Values.config.BinderHub.build_node_selector | default dict | toJson }}
- {{- with .Values.dind.initContainers }}
+ {{- with .Values.pink.initContainers }}
initContainers:
{{- . | toYaml | nindent 8 }}
{{- end }}
+ {{- with .Values.pink.daemonset.image.pullSecrets }}
+ imagePullSecrets:
+ {{- . | toYaml | nindent 8 }}
+ {{- end }}
+ securityContext:
+ fsGroup: 1000
+
containers:
- - name: dind
- image: {{ .Values.dind.daemonset.image.name }}:{{ .Values.dind.daemonset.image.tag }}
+ - name: pink
+ image: {{ .Values.pink.daemonset.image.name }}:{{ .Values.pink.daemonset.image.tag }}
+ imagePullPolicy: {{ .Values.pink.daemonset.image.pullPolicy }}
+
resources:
- {{- .Values.dind.resources | toYaml | nindent 10 }}
+ {{- .Values.pink.resources | toYaml | nindent 10 }}
args:
- - dockerd
- - --storage-driver={{ .Values.dind.storageDriver }}
- - -H unix://{{ .Values.dind.hostSocketDir }}/docker.sock
- {{- with .Values.dind.daemonset.extraArgs }}
- {{- . | toYaml | nindent 10 }}
- {{- end }}
+ - podman
+ - system
+ - service
+ - --time=0
+ - unix:///var/run/pink/docker.sock
securityContext:
privileged: true
+ runAsUser: 1000 # podman default user id
+
volumeMounts:
- - name: dockerlib-dind
- mountPath: /var/lib/docker
- - name: rundind
- mountPath: {{ .Values.dind.hostSocketDir }}
- {{- with .Values.dind.daemonset.extraVolumeMounts }}
+ - mountPath: /home/podman/.local/share/containers/storage/
+ name: podman-containers
+ - mountPath: /var/run/pink/
+ name: podman-socket
+
+ {{- with .Values.pink.daemonset.extraVolumeMounts }}
{{- . | toYaml | nindent 8 }}
{{- end }}
- {{- with .Values.dind.daemonset.lifecycle }}
+ {{- with .Values.pink.daemonset.lifecycle }}
lifecycle:
{{- . | toYaml | nindent 10 }}
{{- end }}
terminationGracePeriodSeconds: 30
volumes:
- - name: dockerlib-dind
+ - name: podman-containers
hostPath:
- path: {{ .Values.dind.hostLibDir }}
- type: DirectoryOrCreate
- - name: rundind
+ path: {{ .Values.pink.hostStorageDir }}
+ type: Directory
+ - name: podman-socket
hostPath:
- path: {{ .Values.dind.hostSocketDir }}
- type: DirectoryOrCreate
- {{- with .Values.dind.daemonset.extraVolumes }}
+ path: {{ .Values.pink.hostSocketDir }}
+ type: Directory
+
+ {{- with .Values.pink.daemonset.extraVolumes }}
{{- . | toYaml | nindent 6 }}
{{- end }}
{{- end }}
So a couple of alternative options are:

- make .Values.dind.securityContext fully configurable (like .Values.dind.daemonset.extraVolumeMounts and .Values.dind.daemonset.extraVolumes)
- merge the two templates and select the args based on the configured builder:

args:
{{- if eq .Values.containerBuilderPod "dind" }}
- dockerd
- --storage-driver={{ .Values.dind.storageDriver }}
- -H unix://{{ .Values.dind.hostSocketDir }}/docker.sock
{{- end }}
{{- if eq .Values.containerBuilderPod "pink" }}
- podman
- system
- service
- --time=0
- unix:///var/run/pink/docker.sock
{{- end }}
{{- with .Values.dind.daemonset.extraArgs }}
{{- . | toYaml | nindent 10 }}
{{- end }}
If we decide on this route I think deprecating .Values.dind.enabled and replacing it with something like .Values.containerBuilderPod will be more extensible in the future.
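For illustration, the configuration could then look something like this (a sketch of the proposed naming, nothing final):

containerBuilderPod: pink  # or "dind"; replaces dind.enabled: true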
I like the idea of a configurable containerBuilderPod.
Should we also consider modifying the dind label used to keep build pods on the same node as the builder pod itself?
Maybe we could do this in two PRs? This one for getting podman to work (so a configurable containerBuilderPod with no breaking changes), then a separate PR for renaming the dind label to something more generic, since that is an internal breaking change. In fact, a PR to change the label could be opened now, since it will be a fairly minor conflict to resolve, and means others can agree/disagree!
As per your excellent idea I merged the two templates into a "new one" that covers both cases and should be easy to extend further.
I haven't yet touched the securityContext part, as I wanted to first ensure that the new template would fit the bill.
@manics I think it's a good idea to do the Pod Security Policy work in a separate PR. Since Pod Security Policies are replaced by the Pod Security admission controller in K8s 1.25, it might be better to concentrate on the latter.
@consideRatio Is anything else needed here before merging?
Thanks @manics for the ping. @manics, are your unresolved comments ready to be resolved, or do they need to be followed up further?
When using pink with its default settings it's also necessary to override the imageCleaner config:

imageCleaner:
  host:
    dockerSocket: /var/run/pink/podman.sock
    dockerLibDir: /var/lib/pink/storage
Since we're now making a breaking change in this PR, could we get rid of these properties and use whatever is configured in dind/pink?
I've also just noticed dind uses hostLibDir for its storage directory whereas pink uses hostStorageDir. @consideRatio do you think we should keep this as it represents the underlying container implementation, or use the same property name as an abstraction?
> @consideRatio do you think we should keep this as it represents the underlying container implementation, or use the same property name as an abstraction?
Hmm hmm I'm feeling a bit out of my depth here and don't feel ready to opine about it. I think your judgement call will be acceptable no matter what though!
> When using pink with its default settings it's also necessary to override the imageCleaner config:
>
> imageCleaner:
>   host:
>     dockerSocket: /var/run/pink/podman.sock
>     dockerLibDir: /var/lib/pink/storage
>
> Since we're now making a breaking change in this PR, could we get rid of these properties and use whatever is configured in dind/pink?
But there was a local situation as well, wasn't there? Or is the imageCleaner stuff only activated as part of using dind/pink? I think if the imageCleaner is tightly coupled to the use of dind/pink, then relying on the configuration under there is the right call.
Funny you bring imageCleaner up, as I was about to ask some questions about it.
To the best of my comprehension, the default deployment for BinderHub would be using the host Docker and a host configured cleaner.
Then you have the option to use dind, and that brings two cases to consider:

1. update the imageCleaner host's paths to point to the dind socket and lib folder (see the sketch after this list)
2. disable the imageCleaner's host configuration

The former works because everything happens on the same node and thus all the paths are valid on the hosts. However, there's a side effect: two containers are created in the imageCleaner pod, one for the host and one for dind, which in the end are checking the same volumes.
In the second case, only the container watching dind is created.
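As a values sketch, the first case would look something like this (the exact dind host paths here are assumptions for illustration):

imageCleaner:
  host:
    dockerSocket: /var/run/dind/docker.sock
    dockerLibDir: /var/lib/dind/docker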
One thing we could do is refactor the cleaner template logic to follow a pattern similar to the builder daemonset, and thus ensure that we have only one container doing the watching and cleanup. For that, we could remove the loop in the containers part of the template.
Maybe rename local to host in the enumeration defined for imageBuilderType?
Thanks for thinking about this @sgaist!
When using local, is image building done by the binderhub software in the binder pod, using a local docker daemon? Ideally, the configuration value should clarify that directly from its name. If we update this name, I'd appreciate it if we make the name verbose enough to help a person understand more from the name directly.
I think local means binder_pod_local_docker_daemon, and that the binder pod is mounted a docker daemon from the host or similar. Do you think that is correct, @sgaist?
Ah, but I get your point about using host over local due to this:
I think as long as we have imageCleaner's terminology of host, it makes sense to speak of host instead of local. If you want to change that, it would be fine in my mind.
Confusion points are:

- both "local" and "dind" make use of hostPath volumes, it seems.

Oh, I get some clarity now! If a node has been running in k8s for a long time, and pulls more and more images because it's starting more and more pods of different kinds, it can run out of space. So image cleaning on the host is not only relevant where you are building pods using some docker daemon; it can be directly relevant just by being a node that starts a lot of pods.
At the same time, we have this official k8s description saying that it's not recommended to have external tools do such things: https://kubernetes.io/docs/concepts/architecture/garbage-collection/#containers-images. Hmmm... I think removal of the imageCleaner host option is worth considering, but it's beyond the scope of this PR. I see that mybinder.org-deploy has imageCleaner.host.enabled=false.
EDIT: I opened https://github.com/jupyterhub/binderhub/issues/1584 for discussion about this to help us not need to resolve such matters in this PR.
@sgaist / @manics, I know ~nothing about Podman. Should it be exposed to image cleaning, and if so, can it be exposed to image cleaning by our current system running https://github.com/jupyterhub/docker-image-cleaner?
For the purposes of building and storing images, podman is the same as Docker: if you build an image, it's going to stay around until it's deleted. The socket interface presented by podman is the same as docker's, which is why I wasn't bothered about renaming the socket to podman.sock. As far as the client is concerned (Docker CLI, docker-py, repo2docker) it shouldn't matter, except that if the socket isn't mounted at /var/run/docker.sock the client needs configuration to use the alternative path.
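For example, a docker-API client can be pointed at an alternative socket via the standard DOCKER_HOST variable; a sketch using the extra_envs trait from this PR (assuming it accepts a plain name-to-value mapping):

config:
  KubernetesBuildExecutor:
    extra_envs:
      # use the pink socket instead of the default /var/run/docker.sock
      DOCKER_HOST: unix:///var/run/pink/docker.sock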
How about we only use imageCleaner.host.* for the local docker host, and if dind/pink are enabled ignore those values and use dind.{hostSocketDir,hostLibDir} / pink.{hostStorageDir,hostSocketDir} instead, since AFAICT there's no reason to run the cleaner with any other paths?
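In the cleaner template, that could look roughly like this (a sketch only; the value names appear in this PR, the surrounding template structure is assumed):

{{- if eq .Values.imageBuilderType "pink" }}
dockerSocket: {{ .Values.pink.hostSocketDir }}/docker.sock
dockerLibDir: {{ .Values.pink.hostStorageDir }}
{{- else if eq .Values.imageBuilderType "dind" }}
dockerSocket: {{ .Values.dind.hostSocketDir }}/docker.sock
dockerLibDir: {{ .Values.dind.hostLibDir }}
{{- else }}
dockerSocket: {{ .Values.imageCleaner.host.dockerSocket }}
dockerLibDir: {{ .Values.imageCleaner.host.dockerLibDir }}
{{- end }}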
I think we can simplify some things here indeed.
Do we have an agreement on the following for this patch:

1. rename local to host for the imageBuilderType enumeration
2. configure the image cleaner paths based on imageBuilderType
3. handle host.enabled in the same manner as dind.enabled?
Agreement from my end! Thanks Simon and Samuel!! / Erik on mobile
Sounds fine to me as well, but given the length of this PR how does everyone feel about merging as is, and doing changes in a follow-up?
(Not to be blocking a merge as this can be done after merge)
Merging!
This broke the BinderHub CI upgrade test on main: https://github.com/jupyterhub/binderhub/blob/fb773a0891d0ed84ad397ec4eeabec7b79e2947b/.github/workflows/test.yml#L186
Run . ./ci/common
Installing already released binderhub version 1.0.0-0.dev.git.2920.hc64d689
using old config
Error: INSTALLATION FAILED: execution error at (binderhub/templates/NOTES.txt:194:4):
#################################################################################
###### BREAKING: The config values passed contained no longer accepted #####
###### options. See the messages below for more details. #####
###### #####
###### To verify your updated config is accepted, you can use #####
###### the `helm template` command. #####
#################################################################################
CHANGED:
dind:
  enabled: true

must as of version 0.3.0 be replaced by

imageBuilderType: dind
On the bright side we've now tested the upgrade message :smiley:
Thanks for following up @manics!!
Thanks for the reviews guys!
@consideRatio I have updated the description and I think it's now complete.
@manics can you double check it in case I missed something?
@sgaist thanks! Can you provide a leading paragraph, one or two sentences, to introduce what podman is in the PR description as well?
Relevant for the paragraph:
@consideRatio Sure thing! Done in three sentences, but I can rework that if needed.
Perfect @sgaist thanks, looks great!
This pull request is a refactor that allows using alternatives to Docker as the image builder, as requested in jupyterhub/binderhub#1513. In this case, support for Podman is implemented.
Podman is a daemonless open source alternative to Docker that can be used as a drop-in replacement in many cases. Aside from working directly on the command line, it can run as a service (only on Linux at the time of writing) offering a Docker compatible API that can be used in a similar fashion. This service mode is what is used in this PR to implement the image build support.
The PR implements a rewrite of the DaemonSet providing the Docker image build service.
The new imageBuilderType parameter introduced will let users select which image builder to use. The daemonset name pink is an acronym for "podman in kubernetes".
Configuration example:
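A minimal sketch, using the enumeration value introduced in this PR:

imageBuilderType: pink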
One advantage of this new implementation is that it should allow the addition of new build services without requiring core modifications.
With the addition of Podman as a builder option, BinderHub administrators can either use the already provided repo2docker (since Podman provides a Docker compatible API) or repo2podman, a plugin for repo2docker that uses the Podman client. Depending on the use case, some environment variables might be needed to configure it; this patch therefore adds support for them through the new extra_envs trait added to the KubernetesBuildExecutor. As an example of its use:
This example shows a configuration that enables the --remote option for Podman, as well as providing the path to the credentials to be used when pushing to the registry.
option for Podman as well provide the path to the credentials to be used when pushing to the registry.