a reminder that the project is called dumb init -- intentionally not supporting things like reloaders, restarters, etc. but just the bare minimum features of an init system.
if you want a reloader, I'd recommend writing a small thing which responds to SIGUSR1 and runs os.exec* (this thing can easily run as a direct child of dumb-init and a direct parent of your application).
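As an illustration of that suggestion, a minimal shell sketch could look like the following (names and paths are hypothetical, and SIGUSR1 is assumed to be the chosen reload signal):

#!/bin/bash
# Hypothetical wrapper (illustrative names/paths): runs as a direct child of
# dumb-init and as the direct parent of the application. On SIGUSR1 it stops
# the application and re-execs itself, starting the application from scratch.
SERVICE=/usr/local/bin/myservice   # hypothetical path to the real binary

reload() {
    kill "$child"                  # ask the current application to terminate
    wait "$child" 2>/dev/null      # reap it so no zombie is left behind
    exec "$0" "$@"                 # replace this wrapper with a fresh copy
}

"$SERVICE" "$@" &                  # start the application
child=$!
trap 'reload "$@"' SIGUSR1         # restart on SIGUSR1
wait "$child"                      # block until it exits (or the trap fires)
exit $?

Re-execing the wrapper mirrors the os.exec* idea; the loop-based scripts later in this thread achieve the same effect without exec.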
I wish that was easy, but it is not; the container runs dumb-init -- bash exec service directly. I even tried doing this for airflow, with airflow-git-tag-2.1.2/scripts/in_container/prod/entrypoint_prod.sh:360 modified to have this at the end instead of the normal exec airflow "$@":
# ...
AIRFLOW_PID=
RELOADING=0

doreload() {
    RELOADING=1
    kill "${AIRFLOW_PID}"
}

while :; do
    RELOADING=0
    airflow "${@}" &
    AIRFLOW_PID=$!
    trap doreload SIGUSR1
    wait "${AIRFLOW_PID}"
    STATUS=$?
    if [ "${RELOADING}" != "1" ]; then
        # Normal exit or crash: propagate the status instead of restarting.
        exit "${STATUS}"
    fi
done
and it does not work, because the helm chart I am using bypasses the airflow-provided entrypoint (https://github.com/airflow-helm/charts/blob/main/charts/airflow/templates/webserver/webserver-deployment.yaml#L106), and as far as I know, there is no mechanism for me to change it.
shouldn't be too hard to change the helm chart or write your own airflow binary prefixed on PATH?
On a one-by-one basis, yes (and it is not trivial). But that does not help much with other cases and other images.
Also, the PATH is "/home/airflow/.local/bin:/usr/local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin", and airflow is in the first directory. So I also need to modify the PATH somehow, either in the helm release (no idea if this will even work properly), or by creating a custom image that moves things around. Fixable, but tedious.
In this particular case I figured out how to do it. In the Dockerfile, at the end:
USER root
ENV PATH="/reloader:${PATH}"
# Install reloader.sh as "airflow" in the first element of the PATH.
# This way, when the helm chart does 'exec bash -c airflow',
# it will be picked up first.
# The reloader will later run the real airflow via its absolute path.
COPY reloader.sh /reloader/airflow
USER airflow
and reloader.sh like this:
#!/bin/bash
REAL_AIRFLOW="/home/airflow/.local/bin/airflow"

doreload15() {
    RELOADING=1
    kill "${AIRFLOW_PID}"        # SIGTERM: ask airflow to shut down gracefully
}

doreload9() {
    RELOADING=1
    kill -9 "${AIRFLOW_PID}"     # SIGKILL: terminate airflow immediately
}

while :; do
    RELOADING=0
    "${REAL_AIRFLOW}" "${@}" &
    AIRFLOW_PID=$!
    trap doreload15 SIGUSR1      # on SIGUSR1, send SIGTERM to airflow
    trap doreload9 SIGUSR2       # on SIGUSR2, send SIGKILL to airflow
    wait "${AIRFLOW_PID}"
    STATUS=$?
    if [ "${RELOADING}" != "1" ]; then
        # Normal exit or crash: propagate the status instead of restarting.
        exit "${STATUS}"
    fi
    sleep 5  # Give some time for the child worker processes to terminate too.
done
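For reference, a reload can then be triggered from a shell inside the pod roughly like this (the PID is illustrative, and this assumes ps is available in the image):

# From e.g. `kubectl exec -it <pod> -- bash`, find the reloader: it is the
# bash process running /reloader/airflow, a direct child of dumb-init.
# PID 6 below is purely illustrative.
ps -ef | grep '[/]reloader/airflow'
kill -USR1 6    # graceful reload: SIGTERM to airflow, then restart it
kill -USR2 6    # forceful reload: SIGKILL to airflow, then restart it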
It works. I still think it is a PITA, and could be solved more universally in dumb-init.
I am not even sure if running kill inside a signal handler is allowed, or maybe bash does something special and just captures the signals, and back in the main interpreter loop the handler function is run safely.
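For what it is worth, bash does make this safe: the real signal handler just records that the signal arrived, and the trap body runs later from the main interpreter loop, with a pending wait builtin returning as soon as a trapped signal arrives. A tiny standalone sketch (names are illustrative, unrelated to airflow) that shows this behaviour:

#!/bin/bash
# The trap body, including the kill, runs from bash's main loop, and a pending
# `wait` returns with a status > 128 as soon as a trapped signal arrives.
trap 'echo "trap ran"; kill "$pid"' SIGUSR1
sleep 1000 &
pid=$!
wait "$pid"
echo "wait returned with status $?"    # > 128 when interrupted by the trap
# Try it: run this script, then `kill -USR1 <script pid>` from another shell.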
could be solved more universally in dumb-init.
it could, dumb-init could also be a process watcher or a service runner or systemd, but it's intentionally not
Because these things are better solved in other places. That is not the case for my use case.
it absolutely is in your case, you're looking for functionalities that a process supervisor would implement -- so use a process supervisor!
How do I use a process supervisor in ~30 different container images that we use, the majority of which I have zero extra control over? I honestly do not see how this can be done, especially when running on kubernetes. I am open to suggestions.
tbh I'm concerned for your organization: I'd start by trying to fix those problems instead of demanding features from tools which are designed not to have them
So how would you do it then? I still do not see how it is possible.
dumb-init => (supervisor of your choice) => (entrypoint of your image)
but more realistically, I would not be debugging in production -- I'd use a development environment for that and run tools in the foreground, etc.
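As a rough sketch of that chain (supervisord is only one example of a supervisor, and all paths here are illustrative), the container's command could become something like:

# dumb-init stays PID 1; the supervisor runs as its child and manages the
# image's real entrypoint, so the entrypoint can be restarted on demand.
exec /usr/bin/dumb-init -- /usr/bin/supervisord -n -c /etc/supervisor/supervisord.conf

The image's real entrypoint would then be declared as a program in the supervisord config and could be restarted with, for example, supervisorctl restart <program>.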
dumb-init => (supervisor of your choice) => (entrypoint of your image)
Not possible with any of the docker images on hub.docker.com, or the hundreds of other images that you build using open source Dockerfiles.
but more realistically, I would not be debugging in production -- I'd use a development environment for that and run tools in the foreground, etc.
Sometimes that is simply not possible. There are sometimes issues in production that cannot be reproduced in a local development environment, or even in a mirrored environment. Sometimes (once every few years, from my experience) there is a serious production issue that cannot be figured out by other means quickly and needs to be debugged in production. This is reality.
Thanks for the issue but I have to agree this is out-of-scope for dumb-init. If you need any kind of process supervision functionality, it would be better to use a different tool.
@chriskuehl I understand, but I should clarify: I do not need or want any kind of process supervision functionality, though. I do not want any automatic restarting or monitoring of the main process. I do not find the requested functionality to really be "process supervision".
I guess I will find some workaround in the future. Thanks.
Backstory: I am using Apache Airflow 2.1.2, on kubernetes (AWS EKS). As I am migrating from airflow 1.10.15, there are many issues that are hard to debug locally, as there are many interactions with extra resources (secrets, exact environment variables used, k8s, production database, etc.). I fixed many of these issues by trial and error on test instances, but there is one that is really hard to debug, related to s3 remote logging in airflow. From all I can see (local testing, emitted logs, triple-checking configs, and logging into the pod and manually doing import boto3, creating sessions with the same keys, and executing the same commands as the airflow code), it should work, but it does not. So I want to instrument / modify the actual process to debug it deeper.
I wish I was able to log into the pod, modify some python files (possibly prepared externally and copied in using wget + tar xf, or edited locally using something like nano, mcedit or vi, or transferred using kubectl cp ...) to add various debug features, and ask dumb-init explicitly to restart the child, without terminating the pod. I cannot just restart the pod, because if it terminates, k8s will restart it from scratch, with any custom debug modifications gone.
This is the process structure in the pod:
I want to restart process 7, so it can re-read the modified python files.
The alternatives are not so pleasant:
Also note that the default uid is airflow (50000), which makes even doing apt install very tricky on the running pod.
So I think the ability to modify files ad-hoc on the pod and ask dumb-init to restart the main process is the most straightforward option, requires no extra preparation steps, and is the safest. If, after the modifications and an explicit manual restart request, the main process exits or crashes, dumb-init should behave the same as before, i.e. exit and terminate the pod. I think the feature is desirable and keeps dumb-init still relatively dumb.
PS. It is somewhat similar to https://github.com/Yelp/dumb-init/issues/156 but different in purpose.