[Open] segevfiner opened this issue 5 years ago
/sig node
Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale
/remove-lifecycle stale
Yes! Please make this an option... docker has the --init flag for this purpose. It's very necessary 😄
Has there been any consideration of this feature request? I believe this would have value.
It's greatly surprising to find this still open. Though I do understand that this makes Docker even more of a special snowflake case for Kubernetes, it does offer a consistent mechanism for apps that need robust zombie reaping and signal handling. Without support for the --init switch, many will implement this on their own, likely poorly or outright improperly. Consistency and convention would be great here. We add tini to our images for this reason even though it's now embedded in Docker. I greatly dislike the redundancy, but it's currently needed for k8s deployment. This issue would solve that.
To recap the current available workarounds:
- If you don't care much about container isolation inside your pod, switch to shared process namespaces and let pause be PID 1 for everything else.
- Manually add tini as an entry point to the Dockerfile of an image with a special kubernetes tag, based on your regular image.
- Manually add tini as an entry point to the Dockerfile of your regular image, despite it not being needed at all when used with docker run, docker compose, …
> If you don't care much about container isolation inside your pod, switch to shared process namespaces and let pause be PID 1 for everything else.
If this was satisfactory, Thomas Orozco wouldn't have spent the time writing tini in the first place, I reckon.
> Manually add tini as an entry point to the Dockerfile of an image with a special kubernetes tag, based on your regular image.
That sounds brittle.
> Manually add tini as an entry point to the Dockerfile of your regular image, despite it not being needed at all when used with docker run, docker compose, …
This is what we do because it is a tiny redundancy but I'd much rather we didn't need it at all.
> it is a tiny redundancy
I see what you did here.
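For concreteness, here is a minimal sketch of the tini-as-entrypoint workaround discussed above. The base image, install step, and application command are placeholders and assume an Alpine-based image; on other distros you would copy in a static tini binary instead:

```dockerfile
# Hypothetical Alpine-based image; substitute your real base image.
FROM alpine:3.19

# tini is packaged in Alpine; other distros can download a static binary.
RUN apk add --no-cache tini

# Run the real command under tini so PID 1 forwards signals and reaps
# zombies, even when Kubernetes starts the container without an --init flag.
ENTRYPOINT ["/sbin/tini", "--"]
CMD ["my-app", "--serve"]
```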
/triage accepted /area kubelet /priority important-longterm
The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.
This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:
- Mark this issue as fresh with /remove-lifecycle stale
- Mark this issue as rotten with /lifecycle rotten
- Close this issue with /close

Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
/remove-lifecycle stale
Is this really needed? The pause container is executed for all pods; it handles init signals and reaps zombie processes.
Only with PID namespace sharing enabled, which is not the default AFAIK and has other implications.
For the sake of sanity, please disable the bot on issues that have been marked as important. It just adds noise and makes it very hard to read up on what is actually important.
/remove-lifecycle stale
@liggitt, any chance of accepting this and adding the frozen label to this issue, so that the bots won't close it?
Thanks for the feature request, but this is unlikely to make progress as an issue. A proposal that works through the design along with the implications of such a change can be opened as a KEP.
I don't think perpetually pinging an issue no one is working on is fruitful.
@liggitt I think there's a disconnect between your logic and what a community member/@redbaron expects. The issue was accepted as important by the maintainer team. So I'd expect them to pick this backlog issue up, not say it's important and then wait for the community to plan and do the work. Moreover, the community member didn't only ping, but proposed some backlog management action.
For what it's worth, I disagree that this feature is important, and I advise against it. A container image should be general across container runtimes, and the init executable should be specified by the image builder (that is, the developer). Adding this shortcut to Kubernetes adds to the maintenance burden while absolving developers of responsibility for, or awareness of, this design point. Moreover, a Container spec is currently implicitly declarative and configures an immutable image; adding a binary into the container by means of a spec field seems incongruent to me. If this is deemed important because developers mostly forget about the init process, then Containerfile linter tools could instead be enhanced to warn in advance about resulting entrypoints that are dubious as init processes.
The motivation is the same as for the existence of the docker run --init switch: everyone needs it, so why make image authors do it over and over in every image when it can be provided out of the box? Also, defaults matter; too many devs are not aware of signal handling, and it's unrealistic to educate them all.
@segevfiner: This issue is currently awaiting triage.
If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.
The triage/accepted label can be added by org members by writing /triage accepted in a comment.
I think being marked important and then not revisited for 18 months is perhaps a sign it is not actually important. I have similar concerns around making the init process part of the declarative spec, but will defer to node/kubelet maintainers to weigh in.
I re-marked this as needing triage, to surface it to that queue, but my comment about this being unlikely to make progress as an issue and needing a KEP still stands.
@redbaron

> The motivation is the same as for the existence of the docker run --init switch: everyone needs it, so why make image authors do it over and over in every image when it can be provided out of the box? Also, defaults matter; too many devs are not aware of signal handling, and it's unrealistic to educate them all.
It could be off-limits due to broadening the scope of the Container spec, as argued. You mention setting a default, but that would be unrealistic anyway since it would be a breaking change to inject an init process before the Command.
How is it a breaking change to introduce a new optional property that doesn't change anything unless enabled, and that affects the way the container is run just like every other property on the container spec? You can literally emulate it today by modifying the command and volumes yourself; it's just inconvenient (see the sketch below).
It's an unfortunate historical mistake, IMHO, that containers don't include this by default. Containers started out as a full OS before Docker introduced the concept of running a single app, but Docker didn't add this at first, and by the time the problem was discovered it was too late to change the default, so an option was introduced instead.
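To illustrate the manual emulation mentioned above, a rough sketch; the helper image, the emptyDir volume, and the copy step are all assumptions for illustration, not an endorsed pattern:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: init-emulation-sketch
spec:
  volumes:
    - name: init-bin
      emptyDir: {}
  initContainers:
    # Hypothetical helper image that only contains a static tini binary.
    - name: copy-tini
      image: example.com/static-tini:latest
      command: ["cp", "/tini", "/init-bin/tini"]
      volumeMounts:
        - name: init-bin
          mountPath: /init-bin
  containers:
    - name: app
      image: example.com/my-app:latest
      # Prepend tini manually so it becomes PID 1 and reaps zombies;
      # this is what an --init-style flag could do automatically.
      command: ["/init-bin/tini", "--", "my-app", "--serve"]
      volumeMounts:
        - name: init-bin
          mountPath: /init-bin
          readOnly: true
```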
As I stated, injecting the init executable by default would be breaking.
Agree with your second point.
The workaround is the same as the solution. Just slam this into the Pod spec:
shareProcessNamespace: true
> The workaround is the same as the solution. Just slam this into the Pod spec: shareProcessNamespace: true
This has other side effects though. Much more likely to break stuff than just injecting an init process.
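For reference, a minimal sketch of that shareProcessNamespace workaround; the image and names are placeholders. With this set, all containers in the pod share one PID namespace and the pause container, as PID 1, reaps zombies, but it does not put an init in front of your own command, and the shared namespace is one of the side effects noted above:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: shared-pid-sketch
spec:
  # All containers in the pod share a single PID namespace;
  # the pause container becomes PID 1 and reaps orphaned zombies.
  shareProcessNamespace: true
  containers:
    - name: app
      image: example.com/my-app:latest
```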
The Kubernetes project currently lacks enough contributors to adequately respond to all issues.
This bot triages un-triaged issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:
- Mark this issue as fresh with /remove-lifecycle stale
- Close this issue with /close

Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
/remove-lifecycle stale
Any progress on this? It seems adding shareProcessNamespace: true is not a solution.
Can you explain why it doesn't work for your use cases? I have been using this option by default on all deployments since my comment and not had any problems. knock on wood
+1. Many container orchestration solutions like Docker and AWS ECS support it, but not Kubernetes.
This would be really great for using complex scripts in liveness probes. We sometimes need to run a somewhat complex bash script in a liveness probe because the container images we run don't have built-in probes. Without process reaping, we have to make our bash scripts do signal handling themselves, which is messy boilerplate, and hard to maintain in a large open source project where people don't always know to add the signal handling. Being able to use an init: true boolean without enabling PID sharing would be great.
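To make the ask concrete, a sketch of what such a field might look like on the Container spec. This field does not exist in Kubernetes today; the name init, its placement, and the probe script path are purely illustrative:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: proposed-init-sketch
spec:
  containers:
    - name: app
      image: example.com/my-app:latest
      # Hypothetical field: ask the runtime to inject an init process
      # (as docker run --init does) without sharing the PID namespace.
      init: true
      livenessProbe:
        exec:
          # Placeholder script baked into the image for this example.
          command: ["/bin/bash", "-c", "/probes/check-health.sh"]
```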
What would you like to be added:

Add support for adding an init process (PID 1) to containers started by Kubernetes, across the supported container runtimes. This should be toggled by an appropriate flag on the Container spec.

A possible implementation is basically bind mounting a statically built init executable (e.g. tini) into the container and passing the container's entrypoint+cmd to it, which I think is what Docker's implementation of this does. Of course, the exact place to prepend the init command depends on the runtime, with Docker having its entrypoint/cmd separation and others possibly having different models.

Why is this needed:

Linux containers have a well known problem: because the first process started in the container has PID 1, it assumes the roles and responsibilities of init. As such, it has the default signal handling behavior disabled unless it establishes its own signal handlers, and in addition has zombies reparented to it for reaping.

Well behaved Docker images take this into account and either:
- handle signals and reap zombies themselves, or
- use a minimal init such as tini, dumb-init, etc.

Sadly, there are plenty of badly behaving container images out there, and there are also plenty of use cases where requiring the user to add an init to the image is a problem. For example, systems where users submit jobs using their own custom images and commands, which might be less attentive to such issues.
To help with this, Docker added a flag to its API that adds an init process backed by tini (compiled as a vendored static executable named `docker-init`). As far as I can tell, this is implemented by simply bind mounting this binary read-only into the container and passing the command that would otherwise be executed as PID 1 to it.

This is of course Docker specific, and for something like this to exist in Kubernetes it likely needs to be re-implemented by Kubernetes for all supported container runtimes. Though assuming you can bind mount such a binary in all runtimes, and can distribute it, it should be possible... I think...
Note that tini, specifically as used by Docker, also supports some extra features that can be configured via environment variables, which can sometimes be useful for certain kinds of containerized workloads. Whether that's a good or a bad thing...

Of course, this entire thing is quite up for debate...