rmyorston / busybox-w32

WIN32 native port of BusyBox.
https://frippery.org/busybox

Unable to force kill a sleep #473

Closed: ale5000-git closed this issue 4 weeks ago

ale5000-git commented 4 weeks ago

Sample code:

my_func()
{
  sleep 120
}

my_func_timeout()
{
  local _pid

  my_func &
  _pid="${!}"

  sleep 1
  echo 1>&2 'Killing...'
  kill -9 "${_pid}" 1>&2 || echo 1>&2 "${?}"
  return 0
}

echo 'Start'
output="$(my_func_timeout)"
echo 'End'

I expected to see "End" after 1 second, but it doesn't work.

rmyorston commented 4 weeks ago

You've now reached the point where the abstractions break down and the implementation details start to leak out.

This has been discussed previously in issue #375.

It sometimes happens that $! isn't the pid of the process you actually wanted to run. Instead it's the pid of an intermediate process.

If you run my_func_timeout directly everything is fine and $! reports the pid of the background process you wanted to kill.

Running it as a command substitution is more complex. If I add a call to ps in the script here's what I get:

~ $ output=$(my_func_timeout)
[1] 2588
PID   PPID  USER     TIME  ELAPSED COMMAND
    0     0 root      0:00 53:11   [System Process]
    4     0 root      0:00 53:11   System
...
 2940  2224 rmy       0:00  0:41   sh -l
 2680  2940 rmy       0:00  0:41   conhost.exe
 3168  2940 rmy       0:00  0:00   sh --fs 00000184
 2588  3168 rmy       0:00  0:00   sh --fs 00000164
 1236  3168 rmy       0:00  0:00   ps
 2876  2588 rmy       0:00  0:00   sleep 120
Killing...  2588
$

The pid reported by $! (2588) is an intermediate shell, an implementation detail. The sleep is a child of that process. Passing 2588 to kill signals the intermediate shell, not the sleep.

I'd also modified the script so it passed the negative of the pid to kill:

    kill -9 "-${_pid}" 1>&2 || echo 1>&2 "${?}"

On Linux using a negative pid signals a process group. In busybox-w32 it approximates this by signalling the given pid and all its children. In the example above the effect of this was to kill process 2588 and its child, 2876, the sleep. Which is (sort of) what was intended.
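The negative-pid approach can be wrapped in a small helper. This is only a sketch: the name kill_pid_and_children is hypothetical, and the fallback to the plain pid is an assumption added here for shells where the negative form fails. That fallback signals only the given pid and leaves any children running.

```shell
# Hypothetical helper, not part of busybox-w32.
# First try the negative-pid form (a process group on Linux; the pid and its
# children on busybox-w32, as described above), then fall back to the plain pid.
kill_pid_and_children()
{
  # Refuse empty, non-numeric, 0 and 1: "kill -9 -1" would signal every process.
  case "${1}" in
    '' | *[!0-9]* | 0 | 1) return 1 ;;
  esac

  kill -9 "-${1}" 2>/dev/null || kill -9 "${1}" 2>/dev/null
}

# Usage: start a background sleep, then force kill it.
sleep 30 &
bg_pid="${!}"

kill_pid_and_children "${bg_pid}"
kill_status="${?}"

# Reap the background process; a non-zero status means it did not exit normally.
wait "${bg_pid}" 2>/dev/null
wait_status="${?}"

echo "kill status: ${kill_status}, wait status: ${wait_status}"
```

Here $! is the sleep itself because no function or command substitution is involved. In the command substitution case the helper may still end up signalling an intermediate shell.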

Whether this would work for you depends on what you actually want to achieve in your application.

ale5000-git commented 4 weeks ago

my_func actually runs cut (some implementations may freeze forever), so I have to terminate it after some time.

The example with sleep and a negative pid seems to work on busybox-w32, but in an online bash interpreter it says: main.bash: line 15: kill: (-11684) - No such process

Do you know why it says this?

rmyorston commented 4 weeks ago

I don't know. My guess would be that the background process isn't a process group leader. (Though I must confess my knowledge of such things is minimal.)
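One way to test that guess on Linux is to compare the pid reported by $! with its process group id. This is a Linux-only sketch: it reads /proc/<pid>/stat, which does not exist on busybox-w32, and the helper name pgid_of is hypothetical.

```shell
# Linux-only sketch: relies on /proc/<pid>/stat.

# Print the process group id of the given pid (5th field of /proc/<pid>/stat).
# Assumes the command name contains no spaces, which holds for "sh" and "sleep".
pgid_of()
{
  read -r _stat_pid _stat_comm _stat_state _stat_ppid _stat_pgrp _stat_rest \
    < "/proc/${1}/stat" || return 1
  echo "${_stat_pgrp}"
}

sleep 5 &
bg_pid="${!}"
bg_pgid="$(pgid_of "${bg_pid}")"

# A process is a group leader when its pid equals its process group id.
if [ "${bg_pgid}" = "${bg_pid}" ]; then
  leader='yes'
else
  leader='no'
fi
echo "pid=${bg_pid} pgid=${bg_pgid} group leader: ${leader}"

# Clean up the background sleep.
kill "${bg_pid}"
wait "${bg_pid}" 2>/dev/null || true
```

In a non-interactive script job control is normally off, so the background process would be expected to stay in the script's own process group and report "no". That would be consistent with kill failing with "No such process" for the negative pid, though it has not been confirmed for the online bash used above.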

ale5000-git commented 4 weeks ago

Thanks for the reply. I guess there is not much that can be done, so I'm closing this.

PS: If anyone has an idea, feel free to post even though the issue is closed.