Have you considered using ephemeral containers? They're available in k8s and compose:
@sevein I haven't considered using ephemeral containers because I didn't know they existed. It sounds like a good solution, but I'm not sure how to get it working with our k3d/Tilt dev environment. I just tried:
```
kubectl debug -it enduro-am-5dcb8756ff-4vtfr --image=busybox --target=enduro-am-worker
```

And I get a shell :tada: but when I do `ls /home` the directory is empty (I expected a `/home/enduro` directory).
I found https://github.com/k3d-io/k3d/discussions/885 which discusses how to get ephemeral containers working with k3d, but I don't really understand the details. It sounds like we may need to create a custom k3d config file that adds the ephemeral container feature gate, but I'm not sure if this is actually the problem with my attempt to get a debug shell in the enduro-am container. :confused: Any suggestions?
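For what it's worth, a minimal sketch of what such a k3d config might look like, assuming k3d v5's `Simple` config schema. Note that the `EphemeralContainers` gate went GA in Kubernetes 1.25, so this should only be needed on older cluster versions:

```yaml
# Sketch: enable the EphemeralContainers feature gate on the k3s API server,
# per the linked k3d discussion. The cluster name is illustrative.
apiVersion: k3d.io/v1alpha4
kind: Simple
metadata:
  name: enduro-dev
servers: 1
options:
  k3s:
    extraArgs:
      - arg: --kube-apiserver-arg=feature-gates=EphemeralContainers=true
        nodeFilters:
          - server:*
```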
I haven't tried them myself, but `/home/enduro` must be an asset of the `enduro-am-worker` image, and you're using `busybox`. Not sure if that will work.
@sevein one of my main use cases for wanting a shell is to be able to examine the local copy of a package and make sure the contents look correct. Is it possible to do that kind of thing with an ephemeral debug container? The reason I used the Ubuntu base image is so I could shell into the enduro-am worker to check if I could sftp to my host from inside the container, but the `busybox` image doesn't appear to have an sftp client. :(
I guess I don't really understand the reason for using a distroless container in our dev environment. I don't think the reduced attack surface is really a concern, because there is no outside access to the environment. The distroless image (20.7 MB) is smaller than the Ubuntu 22.04 image (77.9 MB), but I'm not really concerned about an extra 50 MB.

On the other hand, shell access for inspecting the internal state of running containers seems like a big positive for a development environment.
I understand. I think we'd want to use the distroless image in production, and using the same image in both prod and dev is a nice-to-have if affordable, but if this is slowing you down then maybe it could be addressed in the future.
> @sevein one of my main use cases for wanting a shell is to be able to examine the local copy of a package and make sure the contents look correct. Is it possible to do that kind of thing with an ephemeral debug container?
It looks like it's possible, but you'd need to use a shared volume (e.g. see this deployment).
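A rough sketch of the shared-volume idea (this is not the linked deployment; all names below are illustrative):

```yaml
# Declare a pod-level volume and mount it in the worker; its contents can
# then also be mounted into a debug container for inspection.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: enduro-am
spec:
  replicas: 1
  selector:
    matchLabels:
      app: enduro-am
  template:
    metadata:
      labels:
        app: enduro-am
    spec:
      volumes:
        - name: shared
          emptyDir: {}
      containers:
        - name: enduro-am-worker
          image: enduro-am-worker:dev  # placeholder image name
          volumeMounts:
            - name: shared
              mountPath: /var/shared  # packages written here are inspectable
```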
> The reason I used the Ubuntu base image is so I could shell into the enduro-am worker to check if I could sftp to my host from inside the container, but the `busybox` image doesn't appear to have an sftp client. :(
You could maybe use `ubuntu:latest` instead of `busybox` and install whatever package you need.
Okay, thanks for your ideas @sevein. I'll talk it over with @jraddaoui when he's back next week and see if he wants to stick with distroless or switch to Ubuntu (or another base image with glibc and a shell).
@djjuhasz @sevein, hard topic. I didn't know about ephemeral containers or `kubectl debug` either, but I gave it a try following this example. I got pretty close, but when you add a PVC and volumes in the `/home` directory to the containers, it overwrites the contents of that directory, including Enduro's binary and configuration. I also tried `kubectl cp` to copy and check the contents locally, but that requires the `tar` binary in the container.
Why would you need to check the contents of the home directory? AFAIK Enduro's binary and configuration are the only things in there. It could be possible if we want to share and inspect the contents from another directory.
With the ephemeral containers I could see the running `enduro-am-worker` process:

```
$ kubectl debug -it enduro-am-7c84797fc9-7tmb9 --image=busybox:1.28 --target=enduro-am-worker
Targeting container "enduro-am-worker". If you don't see processes from this container it may be because the container runtime doesn't support this feature.
Defaulting debug container name to debugger-5vvvn.
If you don't see a command prompt, try pressing enter.
/ # ps
PID   USER     TIME  COMMAND
    1 1000      0:03 /home/enduro/bin/enduro-am-worker --config /home/enduro/.config/enduro.toml
   40 root      0:00 sh
   46 root      0:00 ps
```
But I don't see how that could be used to test the SFTP connection; even in a container with a shell, you'd still need to install the client and set it up. At that point, if you just want to test that it works from within the cluster, maybe you could create a temporary image/pod, unrelated to the `enduro-am-worker`, but with similar limitations.
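Something like the following might do for a one-off test (a sketch; the SFTP user is a placeholder, and `host.k3d.internal` is the hostname k3d injects for the host machine):

```sh
# Start a throwaway interactive pod that is deleted on exit.
kubectl run sftp-test --rm -it --image=ubuntu:22.04 -- bash

# Inside the pod: install an SFTP client and try the connection.
apt-get update && apt-get install -y openssh-client
sftp myuser@host.k3d.internal
```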
Using distroless also worries me regarding other dependencies. Thinking about `xmllint` for XSD validation in the projects where we are planning to use it: any dependency like that will require a multi-stage approach where things are installed in, and copied from, another stage.
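For illustration, a minimal sketch of that multi-stage approach (the distroless base and library list are assumptions; the real set of shared libraries would need to be checked with `ldd /usr/bin/xmllint`):

```dockerfile
# Install xmllint (shipped in libxml2-utils on Debian) in a full image...
FROM debian:12-slim AS tools
RUN apt-get update && apt-get install -y --no-install-recommends libxml2-utils

# ...then copy the binary plus its shared libraries into distroless.
FROM gcr.io/distroless/base-debian12
COPY --from=tools /usr/bin/xmllint /usr/bin/xmllint
COPY --from=tools /usr/lib/x86_64-linux-gnu/libxml2.so.2 /usr/lib/x86_64-linux-gnu/
```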
In any case, I agree with @djjuhasz about having a shell in the development environment, and I also agree with @sevein about using distroless in production when possible. It's not that I really like this solution, but we could have development targets in the Dockerfile. We already do it with the dashboard, targeting the builder stage instead of the final one to be able to use autoload and live updates:

We could add a `base-dev` stage using `debian:12-slim` or similar and create three `enduro*-dev` targets from it. One of the product trio goals for this quarter is to improve CI/CD in this project; I couldn't put much time into it lately, but I hope to do that soon, which would help with testing the distroless production images. Again, not a big fan, but what do you think?
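A rough sketch of how those targets might look (assuming an existing `builder` stage; the binary names and paths are guesses, not the actual Dockerfile):

```dockerfile
FROM debian:12-slim AS base-dev
RUN useradd --create-home --uid 1000 enduro
USER enduro
WORKDIR /home/enduro

FROM base-dev AS enduro-dev
COPY --from=builder /out/enduro /home/enduro/bin/enduro
CMD ["/home/enduro/bin/enduro"]

FROM base-dev AS enduro-am-worker-dev
COPY --from=builder /out/enduro-am-worker /home/enduro/bin/enduro-am-worker
CMD ["/home/enduro/bin/enduro-am-worker"]

# ...plus a third enduro*-dev target for the remaining worker binary.
```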
@jraddaoui I'm open to using `debian:12-slim`, but just FYI, the Ubuntu 20.04 images are pretty much the exact same size:
I'm also open to switching back to Alpine Linux, installing python3 and `bagit-python`, and creating a simple command-line wrapper to do bag validation. My experience with the bagit-gython embedded Python experiment is that it adds significant complexity without any major benefits over installing and wrapping the Python script.
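A sketch of what that wrapper approach could look like on Alpine (package names as I recall them; `bagit-python` ships a `bagit.py` CLI with a `--validate` flag):

```sh
# Install Python and bagit-python, then validate a bag from the command line.
apk add --no-cache python3 py3-pip
pip install bagit
bagit.py --validate /path/to/bag && echo "bag is valid"
```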
@jraddaoui as to why I want access to the `/home` directory: I'm trying to get at a failed SIP at `/home/a3m/.local/share/a3m/share/failed` to figure out why the a3m "verify checksum" task failed. I'm trying to reach that file again with the distroless container, and I can't figure out how to get access to the filesystem or its contents. As you mentioned above, `kubectl cp` requires tar, which isn't installed in the distroless container. :(
You could do that from the `a3m` container, it uses Ubuntu 22.04: https://github.com/artefactual-labs/a3m/blob/main/Dockerfile
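For example (a sketch; the pod and container names are illustrative):

```sh
# Open a shell in the a3m container and inspect the failed SIP.
kubectl exec -it enduro-a3m-0 -c a3m -- bash -c \
  'ls -la /home/a3m/.local/share/a3m/share/failed'

# The a3m image includes tar, so kubectl cp works from there too.
kubectl cp -c a3m enduro-a3m-0:/home/a3m/.local/share/a3m/share/failed ./failed
```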
> You could do that from the `a3m` container, it uses Ubuntu 22.04: https://github.com/artefactual-labs/a3m/blob/main/Dockerfile
Ah, clever @jraddaoui! Thanks, that's a good temporary workaround until we decide on a more permanent solution.
Closing this for now. We may choose to re-open it in the future.
The Ubuntu base image allows shell access to a running container for debugging purposes, unlike the Google distroless image.