Closed humphd closed 1 month ago
From a cursory reading of the page, I have the same hunch as you, that ECS Exec wants to own pid 1.
This kind of software really needs to stop pretending it provides a container environment when it does not. This is aggravating.
Looking into this a bit more, I wonder if my Terraform ECS module is causing this, and whether I can override it so that it doesn't happen:
InitProcessEnabled
Run an init process inside the container that forwards signals and reaps processes. This parameter maps to the --init option to docker run. This parameter requires version 1.25 of the Docker Remote API or greater on your container instance. To check the Docker Remote API version on your container instance, log in to your container instance and run the following command: sudo docker version --format '{{.Server.APIVersion}}'
Required: No
Type: Boolean
Update requires: Replacement
I'm also facing the same issue with ECS using EC2 deployment. I'm using Pulumi for the Infra/Stack management and using the following settings in the definition of the task.
containerDefinitions: {
    linuxParameters: {
        initProcessEnabled: true
    }
}
I need to enable ECS Exec (execute command) because I need to execute some commands and restart S6 services (like s6-svc -r /run/service/serviceName) based on some triggers.
I'm trying to figure out a solution for this; if anyone has one, please post it here.
Thanks
s6-overlay relies on being pid 1 for your container. It cannot work, and will never work, in a situation where another pid 1 is provided by the so-called container manager. Sorry.
If you need to run early commands that don't fit with the s6-overlay model, the only suitable place is S6_STAGE2_HOOK. If you need to restart services depending on triggers, you can always define other services that listen to triggers and send s6-svc -r commands to services you choose.
@skarnet - Thanks for the confirmation and the suggestion.
I wrote a Go program that listens on TCP and executes the payload only if it matches a whitelisted system command; this listener runs as another S6 service in the container.
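For readers who want a starting point, here is a minimal sketch of what such a listener could look like. It is not the commenter's actual program: the address 127.0.0.1:9000, the allow-list contents, and the names parseAllowed and handle are all assumptions made for illustration. It reads one line per connection, checks the first word against the allow-list, and runs the command without a shell.

```go
package main

import (
	"bufio"
	"log"
	"net"
	"os/exec"
	"strings"
)

// allowed lists the only binaries the listener will run.
// Hypothetical choice: just s6-svc, to restart supervised services.
var allowed = map[string]bool{"s6-svc": true}

// parseAllowed splits a request line into argv and reports whether its
// first word is on the allow-list.
func parseAllowed(line string, allow map[string]bool) ([]string, bool) {
	argv := strings.Fields(line)
	if len(argv) == 0 || !allow[argv[0]] {
		return nil, false
	}
	return argv, true
}

// handle serves a single connection: one request line in, command output back.
func handle(conn net.Conn) {
	defer conn.Close()
	line, err := bufio.NewReader(conn).ReadString('\n')
	if err != nil && line == "" {
		return // nothing usable was received
	}
	argv, ok := parseAllowed(line, allowed)
	if !ok {
		conn.Write([]byte("rejected\n"))
		return
	}
	// No shell is involved, so metacharacters in arguments are not interpreted.
	out, err := exec.Command(argv[0], argv[1:]...).CombinedOutput()
	if err != nil {
		log.Printf("%v failed: %v", argv, err)
	}
	conn.Write(out)
}

func main() {
	// Assumed address: loopback only, port 9000.
	ln, err := net.Listen("tcp", "127.0.0.1:9000")
	if err != nil {
		log.Fatal(err)
	}
	for {
		conn, err := ln.Accept()
		if err != nil {
			log.Print(err)
			continue
		}
		go handle(conn)
	}
}
```

Sending the line "s6-svc -r /run/service/serviceName" to the port would restart that service. Note that this sketch only checks the binary name, so a client can still target any service; a stricter allow-list would also validate the arguments.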
Sounds good!
Just in case, know that the s6 suite provides most of the infrastructure for this: all you would need to write is a program, say whitelist, that checks its command line (or simply its first argument) against a database of whitelisted commands, then executes it. You would run it as s6-tcpserver $ip $port s6-tcpserver-access -i $accesscontrolfolder s6-sudod whitelist, and the client would run s6-tcpclient $ip $port s6-sudoc $commandline. It requires s6 and s6-networking to be present on the client, though, which may not be viable for you.
Not sure if this is helpful or not, but I've started using ECS Exec in the last month without issue.
So far only tried on Fargate without setting anything for initProcessEnabled (does it default to false?).
Maybe AWS changed something, or maybe something is different with your setup, but just wanted to let you know it does seem possible to make it work with s6-overlay out of the box.
If it works, then it is almost certainly that initProcessEnabled is false.
Can I close this issue?
I just created a new cluster with Fargate as capacity and initProcessEnabled disabled, and I'm also getting the s6-overlay-suexec: fatal: can only run as pid 1 error.
I investigated a little bit, and pid 1 is taken by the /pause process, which comes from the amazon-ecs-pause container that is used by AWS ECS networking (https://aws.amazon.com/blogs/compute/under-the-hood-task-networking-for-amazon-ecs/). For each task launched on ECS, it seems that a pause container is launched and the pid namespace is shared between that pause container and the one running the task.
EDIT: sorry, this is my bad. I misread the AWS documentation (https://docs.aws.amazon.com/AmazonECS/latest/developerguide/task_definition_parameters.html#task_definition_pidmode), which states "On Fargate for Linux containers, the only valid value is task." It is also possible not to specify this value at all, though. After removing pidMode from the task definition, everything seems to work as expected.
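If you want to confirm which process owns pid 1 from inside a running container, a small diagnostic along these lines may help. This is only a sketch that assumes a Linux container with /proc mounted; the helper name argvFromCmdline is made up for illustration. It prints its own pid and the command line of pid 1, which would show something like /pause when the pid namespace is shared with the pause container.

```go
package main

import (
	"bytes"
	"fmt"
	"os"
)

// argvFromCmdline turns the NUL-separated contents of /proc/<pid>/cmdline
// into a slice of arguments.
func argvFromCmdline(raw []byte) []string {
	raw = bytes.TrimRight(raw, "\x00")
	if len(raw) == 0 {
		return nil
	}
	parts := bytes.Split(raw, []byte{0})
	argv := make([]string, len(parts))
	for i, p := range parts {
		argv[i] = string(p)
	}
	return argv
}

func main() {
	// /proc/1/cmdline describes whatever process is pid 1 in this pid namespace.
	raw, err := os.ReadFile("/proc/1/cmdline")
	if err != nil {
		fmt.Fprintln(os.Stderr, "cannot read /proc/1/cmdline:", err)
		os.Exit(1)
	}
	fmt.Printf("own pid: %d, pid 1 argv: %q\n", os.Getpid(), argvFromCmdline(raw))
}
```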
Glad you made it work!
I'm using s6-overlay in containers that I'm running on AWS ECS. I was hoping to use ECS Exec for remote debugging in the hosted containers, but I can't get my containers to start when this is enabled.
I suspect that ECS Exec wants to own pid 1, which is breaking s6-overlay, but I wanted to see if anyone had recommendations on how to get these to play nicely together.
Thanks for s6-overlay, it's been amazing for us!