Closed: Lewiscowles1986 closed this 5 years ago
Interesting, had no idea that existed. Looks like it's been there since 2015: https://github.com/moby/moby/blob/4b18f8c4bf63ef27045b79fb110f4b84c1dc1819/CHANGELOG.md#190-2015-11-03
@cirocosta Any thoughts on this? Would it make sense for k8s too?
edit: also would you mind doing the DCO process? It's pretty lightweight: https://github.com/concourse/concourse-docker/pull/44/checks?check_run_id=119750704
Thank you for :eyes: and feedback.
DCO Done.
> @cirocosta Any thoughts on this? Would it make sense for k8s too?
FWIW I've never seen any other application pass this signal via Dockerfile. What it ensures is that `docker stop` and `docker-compose down` behave as expected.
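For reference, the change under discussion boils down to a single directive (a sketch, not the full Dockerfile):

```dockerfile
# Signal `docker stop` (and `docker-compose down`) sends instead of the
# default SIGTERM; SIGUSR2 tells a Concourse worker to retire.
STOPSIGNAL SIGUSR2
```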
Hi @Lewiscowles1986 , thanks for the contribution!
> Interesting, had no idea that existed. Looks like it's been there since 2015
Interestingly, it only got into the OCI spec two years later: https://github.com/opencontainers/image-spec/pull/544, which `cri-o` followed up on quickly after.
> @cirocosta Any thoughts on this? Would it make sense for k8s too?
hmmmmm, it makes sense for those expecting a scale-in / pod-(termination|eviction) to `retire` all the time, but I'd rather have that more explicitly set and controlled with a `preStop` hook or something (like we do in the Helm Chart).
```yaml
## Signal to send to the worker container when shutting down.
## Possible values:
##
## - SIGUSR1: land the worker, and
## - SIGUSR2: retire the worker.
##
## Note.: using SIGUSR2 with persistence enabled implies the use of an
## initContainer that removes any data that existed previously under
## `concourse.worker.workDir`, as the action of `retire`ing a worker implies
## that no state comes back with it when re-registering.
##
## Ref: https://concourse-ci.org/concourse-worker.html
## Ref: https://concourse-ci.org/worker-internals.html
##
shutdownSignal: SIGUSR2
```
(see `shutdownSignal` in charts/stable/concourse)
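To make the land/retire distinction concrete, here's a minimal POSIX-only Python sketch (the handler bodies are hypothetical stand-ins for the worker's real behavior, which lives in the Concourse binary):

```python
import os
import signal

# Record which shutdown path each signal would trigger. In the real
# worker, SIGUSR1 "lands" (stop builds, keep state) and SIGUSR2
# "retires" (deregister and discard state).
events = []

def land(signum, frame):
    events.append("land")

def retire(signum, frame):
    events.append("retire")

signal.signal(signal.SIGUSR1, land)
signal.signal(signal.SIGUSR2, retire)

# `docker stop` delivers whatever signal STOPSIGNAL names; simulate
# both deliveries to this process.
os.kill(os.getpid(), signal.SIGUSR1)
os.kill(os.getpid(), signal.SIGUSR2)
print(events)  # ['land', 'retire']
```

The point is simply that the same process exits through different paths depending on which signal the runtime is configured to send.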
At the execution of the `preStop` hook, we wait until the process goes away (or `terminationGracePeriod` kicks in, sending a SIGKILL). Thus, whatever default signal gets set here won't affect the chart - but not everyone uses the chart.
Personally, I prefer the explicitness of telling whatever runs my containers to have a specific behavior, rather than putting that behind the image definition.
tl;dr: the Chart would work just fine, but for those who consume the "raw image", it might affect them if they're expecting a SIGTERM (the default behavior).
I wanted to double check if those things I said are really true hahah, here's a table with the results:
The lower right is what I mentioned 🤔 does that make sense?
Thanks!
@cirocosta If the worker is shutting down, all of its containers and volumes will go away with its container, so shouldn't that pretty much always be a `retire`? If the worker lands but its containers and volumes are gone when it comes back, the ATC will have invalid entries in its database. Those will eventually go away (thanks to `missing_since`), but I think it'd be clearer if the worker retired, cleared out the database, and then came back as a new worker.
Although, for this use case retiring isn't really aggressive enough. Retiring can take a while and exceed the stop timeout. I'm not sure what signal Docker sends after the timeout, but if it's SIGKILL the worker won't be able to intercept it and delete itself. In this scenario the worker could end up coming back under the same name before the ATC drains and removes it. So that could use some work. :thinking: Maybe we should just have ephemeral workers delete themselves on shutdown, and let people use the `concourse retire-worker` command themselves when they want to actually drain.
> @cirocosta If the worker is shutting down all of its containers and volumes will go away with its container, so shouldn't that pretty much always be a retire?
Yeah, that's totally valid when having ephemeral containers, like a `Deployment` without any persistent volumes would have.
For those using `StatefulSets`, though, the containers come back with:
- the same name, and
- the same disk.
In that case, the "going away" is more like a landing, as it comes back with the "same" state (just volumes, actually). In the case of the Chart, we allow that if the person wants, although the default is `retire` + volume wiping on start (with https://github.com/concourse/concourse/issues/3600 we could perhaps go directly with the "land" workflow for the StatefulSet case 👀 )
(yeah, I'd like a lot more to just go ephemeral in any case, but at the moment, it's way easier to use a StatefulSet as a way of getting separate disks attached to your pod and not worry about filling the machine's default disk or consuming all of its bandwidth).
> I'm not sure what signal Docker sends after the timeout, but if it's SIGKILL the worker won't be able to intercept it and delete itself.
yeahh, that's the case in k8s as well - something like this:
[diagram: with a `preStop` hook configured, the hook runs first within `terminationGracePeriod`; once `terminationGracePeriod` is hit, a SIGKILL is sent]
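That sequencing can be sketched as a pod spec fragment (illustrative only - the container name and the wait command are assumptions, not the Chart's actual manifest):

```yaml
spec:
  # Upper bound k8s waits before escalating to SIGKILL.
  terminationGracePeriodSeconds: 3600
  containers:
    - name: concourse-worker
      lifecycle:
        preStop:
          exec:
            # Hypothetical drain: signal PID 1 with the configured
            # shutdownSignal, then block until the process is gone.
            command:
              - /bin/sh
              - -c
              - kill -s USR2 1; while kill -0 1 2>/dev/null; do sleep 1; done
```

If the drain outlives `terminationGracePeriodSeconds`, the kubelet sends SIGKILL, which the worker cannot intercept - the same failure mode described for Docker's stop timeout above.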
> In this scenario the worker could end up coming back under the same name before the ATC drains and removes it.
That's true! Do you think that's more reason for https://github.com/concourse/concourse/issues/3600 ?
> Maybe we should just have ephemeral workers delete themselves on shutdown, and let people use the concourse retire-worker command themselves when they want to actually drain.
hmmmm that's interesting - maybe that's something that deserves its own issue :thinking:
Do those answers help? thanks!!
> For those using `StatefulSets`, though, the containers come back with:
> - the same name, and
> - the same disk.
Didn't know they came back with the same disk. That seems kind of janky - does Garden just reap the container depot on start, then? I guess keeping the volumes around is nice, it's just weird that only half of the state remains.
> That's true! Do you think that's more reason for concourse/concourse#3600 ?
Yep.
> hmmmm that's interesting - maybe that's something that deserves its own issue
Yep. We do have this tracked as https://github.com/concourse/concourse/issues/2827 but we never proposed that as a solution. I'll suggest it there.
I have no idea what everyone is talking about. Can I be tagged back in specifically if there's something here for me? Thanks
@Lewiscowles1986 LOL. Sorry. I think at this point the change itself is fine, we're just making sure we understand the impact it may have and where it fits on our roadmap.
@cirocosta I'm leaning towards merging this as it feels like a more sensible default for Docker, but it sounds like it'd have an impact on those using `StatefulSets`. What would have to change in the chart for this? Is it just a documentation thing?
@vito sounds good! No changes actually 😁
For those consuming the Chart, everything is transparent - if they configure a specific signal (which we already default to `SIGUSR2`), we'll forward it and do the right thing.
thx!
👍 merging - thanks @Lewiscowles1986!
Should I PR the README.md to remove the note about SIGUSR2?
Dockerfile syntax now supports a `STOPSIGNAL` directive which, AFAIK, should make the README.md comment on SIGUSR2 redundant: https://docs.docker.com/engine/reference/builder/#stopsignal