keylimetoolbox / resque-kubernetes

Run Resque (and ActiveJob) workers as Kubernetes Jobs and autoscale from 0!
MIT License

Ability to kill sidecar container #16

Closed by izhilenkov 5 years ago

izhilenkov commented 6 years ago

Add the ability to kill a sidecar container (e.g. Cloud SQL Proxy) within the job pod. My suggestion is to have the shutdown? method create a file on a shared volume path, which would serve as the signal to kill the sidecar.

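require "fileutils"

def shutdown?
  if term_on_empty && queues_empty?
    log_with_severity :info, "shutdown: queues are empty"
    shutdown
    # Signal the sidecar through the shared volume. File.new opens in read
    # mode by default and raises Errno::ENOENT when the file does not exist,
    # so create the marker with FileUtils.touch instead.
    FileUtils.touch "/tmp/pod/main-terminated"
  end

  super
end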

Sidecar command

command: ["/bin/sh", "-c"]
args:
- |
  /cloud_sql_proxy --dir=/cloudsql -instances=ACCOUNT:REGION:INSTANCE=tcp:5432 -credential_file=/secrets/cloudsql/credentials.json &
  CHILD_PID=$!
  # Use the POSIX [ ... ] test because the script runs under /bin/sh; stop polling once the proxy has been killed.
  (while true; do if [ -f "/tmp/pod/main-terminated" ]; then kill $CHILD_PID; echo "Killed $CHILD_PID as the main container terminated."; break; fi; sleep 1; done) &
  wait $CHILD_PID
  if [ -f "/tmp/pod/main-terminated" ]; then echo "Job completed. Exiting..."; exit 0; fi

volumeMounts for each container

- mountPath: /tmp/pod
  name: tmp-pod
  readOnly: true  # sidecar only; the worker container needs this mount writable to create the file

Volumes

- name: tmp-pod
  emptyDir: {}

It would be cool to have some method like

def sidecar_kill
  true
end

jeremywadsack commented 6 years ago

This is a good find. I hadn't considered that you might be running multiple containers within the pod.

This seems to be a more general Kubernetes concern (terminating a sidecar when the main job container completes) and is discussed extensively in kubernetes/kubernetes#25908 (which it looks like you've found).

While I'm open to the solution you propose, I'd prefer to limit how much we monkey-patch Resque::Worker. In addition, I want to make sure that this doesn't require additional configuration for use cases that don't involve a sidecar or don't need this, specifically where File.new "/tmp/pod/main-terminated" would fail because the folder doesn't exist or the location is not writable.
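To illustrate that concern, here is a minimal sketch of a guarded write, assuming a hypothetical helper named signal_sidecar (not part of resque-kubernetes) and the /tmp/pod/main-terminated path from the proposal above. It skips the write when the directory is missing or not writable, so pods without a sidecar would need no extra configuration:

```ruby
require "fileutils"

# Hypothetical helper, not part of resque-kubernetes.
# Creates the marker file only when its directory exists and is writable.
# Returns true when the marker was written, false otherwise.
def signal_sidecar(path = "/tmp/pod/main-terminated")
  dir = File.dirname(path)
  return false unless File.directory?(dir) && File.writable?(dir)

  FileUtils.touch(path)
  true
rescue SystemCallError
  # Covers races such as the directory disappearing between check and write.
  false
end
```

Returning a boolean instead of raising keeps a failed marker write from interfering with the worker's own shutdown.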

How are you running your Resque worker now? We use a simple bash script to run the job (through runit) and have a script that runs on termination. It seems that it would be simple enough to modify the code that runs your Resque worker to write that file after the worker completes. Something like this:

bin/rails environment resque:work & pid="$!"; trap "kill $pid; wait $pid; touch /opt/exit-signals/SIGTERM;" SIGTERM; wait $pid;
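The same idea could be sketched in Ruby, assuming a hypothetical run_then_signal wrapper (the name and signature are illustrative only). Unlike the one-liner, this sketch touches the file on any exit, which also covers a worker that stops because its queues are empty:

```ruby
require "fileutils"

# Hypothetical wrapper: runs a command, forwards SIGTERM to it, and touches
# a marker file once the command has exited, however it exited.
def run_then_signal(command, marker)
  pid = Process.spawn(*command)
  # Forward SIGTERM to the child so the worker can shut down gracefully.
  previous = Signal.trap("TERM") do
    begin
      Process.kill("TERM", pid)
    rescue Errno::ESRCH
      # The child has already exited; nothing to forward.
    end
  end
  _, status = Process.wait2(pid)
  FileUtils.touch(marker)
  status
ensure
  # Restore whatever TERM handler was installed before the wrapper ran.
  Signal.trap("TERM", previous) if previous
end
```

It might be called as run_then_signal(%w[bin/rails environment resque:work], "/opt/exit-signals/SIGTERM"), reusing the command and path from the script above.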

Thoughts?

jeremywadsack commented 6 years ago

Another thought is to use an at_exit block for the worker:

Workers can also take advantage of running any code defined using Ruby's at_exit block by setting ENV["RUN_AT_EXIT_HOOKS"]=1. By default, this is turned off. Be advised that setting this value might execute code from gems which register their own at_exit hooks.

That last sentence is disconcerting.
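For reference, a sketch of what such a hook might look like, assuming a hypothetical register_exit_marker helper and the marker path from the original proposal. Per the README text quoted above, the worker would also need ENV["RUN_AT_EXIT_HOOKS"] set for the hook to run:

```ruby
require "fileutils"

# Hypothetical helper: registers an at_exit hook that touches the marker file
# when the current process exits.
def register_exit_marker(path = "/tmp/pod/main-terminated")
  at_exit do
    begin
      FileUtils.touch(path)
    rescue SystemCallError => e
      # For example, the shared volume is not mounted in this pod.
      warn "could not write #{path}: #{e.message}"
    end
  end
end
```

The rescue means a pod without the shared volume only logs a warning at exit rather than failing.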

Because we have another use case where we'd like to take an action on a state change for the worker, I'm looking at how to add event notifications to resque itself.

izhilenkov commented 5 years ago

Sorry for the late response, @jeremywadsack. I totally agree with your position on sidecar containers; this issue can be closed.