hashicorp / consul

Consul is a distributed, highly available, and data center aware solution to connect and configure applications across dynamic, distributed infrastructure.
https://www.consul.io

Child process signal forwarding not reliably working for consul watch #3572

Open preetapan opened 6 years ago

preetapan commented 6 years ago

We cleaned this up in #2999. When consul watch is run with a handler script, the handler runs as a child of the consul watch process. However, sending -KILL/-INT to the parent doesn't seem to forward the signal to the child reliably. After trying this a number of times, I only saw the signal forwarded successfully once.

Reproduction steps:

1) Bash script that traps signals:

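#!/bin/bash
# Handler for SIGINT: report it and exit cleanly.
sigint()
{
   echo "signal INT received"
   exit 0
}

# Handler for SIGKILL. Note: SIGKILL cannot be caught, so this never runs.
sigkill()
{
   echo "signal KILL received"
   exit 0
}
trap 'sigint'  INT
trap 'sigkill' KILL

echo "PID is $$"
# Loop forever so the script stays alive until a signal arrives.
while true
do
  sleep 5
done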

2) Run consul watch: consul watch -type key -key test ./test.sh

3) Send INT or KILL to the consul watch PID. This doesn't cause test.sh to exit; test.sh still shows up in ps output.

Sending INT or KILL to test.sh's PID works as expected and terminates the script.

It seems there is a narrow window in which we forward the signal, and that forwarding races with the consul watch process handling the INT/KILL and stopping itself.
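
One way to avoid that kind of race, sketched here in bash rather than in Consul's own Go code, is to forward the signal and then keep waiting until the child has actually exited before the parent stops. This is only an illustration of the pattern, not how consul watch is implemented. run_with_forwarding is a hypothetical helper name, and the sketch forwards TERM rather than INT because bash starts background children of a non-interactive shell with INT ignored.

```shell
#!/bin/bash
# Hypothetical helper (not part of Consul): run a command as a child,
# forward TERM to it, and return only after the child has exited.
run_with_forwarding() {
  "$@" &
  local child=$!

  # Forward TERM to the child instead of letting the parent die first.
  # $child is expanded now, so the trap does not depend on function scope.
  trap "kill -TERM $child 2>/dev/null" TERM

  # A trapped signal makes `wait` return early, so loop until the
  # child is really gone rather than trusting a single wait call.
  # The wait status is ignored in this sketch.
  while kill -0 "$child" 2>/dev/null; do
    wait "$child" || true
  done

  trap - TERM
}

# Usage (assumes a handler script saved as ./test.sh that traps TERM):
#   run_with_forwarding ./test.sh
```

The point of the loop is ordering: the parent does not proceed to its own shutdown until the child has been reaped, so there is no window in which the parent is gone while the handler keeps running.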

slackpad commented 6 years ago

Tagging this for 1.0 triage. I'm thinking we don't make any more changes for this right now, and instead kick it forward so we can look deeper and widen the set of signals we forward on non-Windows platforms.

slackpad commented 6 years ago

Expanding to more signals is captured on https://github.com/hashicorp/consul/issues/3571.

slackpad commented 6 years ago

Taking this off 1.0 - we should tackle this along with #3571.