alco / porcelain

Work with external processes like a boss
http://hexdocs.pm/porcelain
MIT License

Investigate ways to terminate external processes cleanly (with and without goon) #13

Open alco opened 9 years ago

alco commented 9 years ago

The current behaviour of stopping a process is not satisfactory no matter how you slice it.

Without goon

Below, the stream name is wrong (:stdout rather than :stdio), which raises an error in the spawned process that controls the Erlang port.

iex(1)> p = Porcelain.spawn_shell "ping google.com", out: IO.stream(:stdout, :line)
%Porcelain.Process{err: nil,
 out: %IO.Stream{device: :stdout, line_or_bytes: :line, raw: false},
 pid: #PID<0.74.0>}
iex(2)>
=ERROR REPORT==== 20-Jan-2015::00:14:52 ===
Error in process <0.78.0> with exit value: {badarg,[{io,put_chars,[stdout,unicode,<<112 bytes>>],[]},{'Elixir.Enum','-reduce/3-fun-0-',3,[{file,"lib/enum.ex"},{line,1266}]},{'Elixir.Stream',do_unfold,4,[{file,"lib/stream.ex"},{line,1126}]},{'Elixir.Enum',reduce,3,[{file,"lib/enum.ex"},{line,1265}]},{...

iex(3)> Porcelain.Process.alive? p
true
iex(4)> Porcelain.Process.stop p

# the shell just hangs
# the external process 'ping' remains alive even after the VM is terminated

An example of successfully stopping a port:

iex(1)> p = Porcelain.spawn_shell "ping google.com", out: IO.binstream(:stdio, :line)
%Porcelain.Process{err: nil,
 out: %IO.Stream{device: :standard_io, line_or_bytes: :line, raw: true},
 pid: #PID<0.73.0>}
PING google.com (173.194.113.193): 56 data bytes
64 bytes from 173.194.113.193: icmp_seq=0 ttl=57 time=11.042 ms
...
iex(2)> Porcelain.Process.stop p
true

We don't get any more input, but ping keeps running in the background.

With goon

iex(1)> p = Porcelain.spawn_shell "ping google.com", out: IO.binstream(:stdio, :line)
%Porcelain.Process{err: nil,
 out: %IO.Stream{device: :standard_io, line_or_bytes: :line, raw: true},
 pid: #PID<0.74.0>}
PING google.com (173.194.113.194): 56 data bytes
64 bytes from 173.194.113.194: icmp_seq=0 ttl=57 time=8.044 ms
...
iex(2)> Porcelain.Process.stop p
true
iex(3)> panic: write /dev/stdout: broken pipe

goroutine 3 [running]:
runtime.panic(0xa4ba0, 0x2102a5420)
    /usr/local/Cellar/go/1.2.2/libexec/src/pkg/runtime/panic.c:266 +0xb6
log.(*Logger).Panicf(0x2102a6190, 0xde260, 0x3, 0x221040fe30, 0x1, ...)
    /usr/local/Cellar/go/1.2.2/libexec/src/pkg/log/log.go:200 +0xbd
main.fatal_if(0xc2840, 0x2102bf7e0)
    /Users/alco/extra/goworkspace/src/goon/util.go:38 +0x17e
main.outLoop(0x257338, 0x2102860e8, 0x256fe8, 0x210286008, 0x0, ...)
    /Users/alco/extra/goworkspace/src/goon/io.go:151 +0x44a
created by main.wrapStdout
    /Users/alco/extra/goworkspace/src/goon/io.go:34 +0x16a

goroutine 1 [chan receive]:
main.proto_2_0(0x7fff5fbf0100, 0xe3fc0, 0x3, 0xde7a0, 0x1, ...)
    /Users/alco/extra/goworkspace/src/goon/proto_2_0.go:58 +0x3a3
main.main()
    /Users/alco/extra/goworkspace/src/goon/main.go:51 +0x3b6

ping terminates, but goon panics.

manukall commented 9 years ago

Hey alco, any news on this? For me, stopping a process doesn't even work with goon. If it matters, the process is node.js and it's started with spawn_shell.

alco commented 9 years ago

Going to look at fixing this in goon tonight.

Sorry for the wait @manukall. Is this still relevant to you?

manukall commented 9 years ago

i'm not working on that project anymore. thanks for looking into it, though.

ericmj commented 9 years ago

I need this for testing https://github.com/hexpm/hex. During testing we need to make API calls to the server, https://github.com/hexpm/hex_web. We do this by starting the API server with a port (or porcelain). The problem is that if the VM that runs the hex tests stops unexpectedly, the hex_web process keeps running.

EDIT: Actually the hex_web child process is always orphaned after the parent VM terminates.

jschneider1207 commented 8 years ago

I'm having the same problem as @ericmj. I'm hosting an http server with IIS Express via porcelain and the server does not get terminated when the beam vm shuts down.

alexlafroscia commented 8 years ago

I've been trying to write a plug that interacts with the Ember CLI, and have been seeing the same problem. After the Elixir application shuts down the Node.js process keeps running.

aphillipo commented 8 years ago

Could this be done with a separate OTP application (maybe that runs at a system level and is never killed) that registers external processes and os kills them under some conditions (e.g. when the OTP app that started the process ends)?

darkseas commented 8 years ago

Not just node apps. I'm testing this with a small python bottle server and see the same issues. Neither stop nor signal stops the underlying web server. OSX, Python 3, using goon. Would a minimal example be helpful?

peter-fogg commented 8 years ago

I'm also interested in this issue; it's causing some messiness in a Mix task I use to run tests.

alco commented 8 years ago

Hey folks! Thanks for the feedback. This is definitely an important issue. I'm hoping to have some time to work on this soon.

@peter-fogg Could you provide more details about the problem you're having? What result are you getting, and how is it different from the expected one?

Thanks everyone for bearing with me!

peter-fogg commented 8 years ago

@alco Sure -- the short version is that I'm using Porcelain to coordinate some external servers during tests. We have our Phoenix server running against a mocked-out backend API server, and once that's running we run some tests against the Phoenix server. The gist of the tests is:

  1. Start Phoenix and API with Porcelain.spawn_shell(command, in: :receive, out: {:send, self()}, err: {:send, self()})
  2. Listen to both processes and wait for them to be ready to accept HTTP requests
  3. Start test process with System.cmd
  4. When test process is finished, shut down both servers with Porcelain.Process.signal proc, :int
  5. Exit with status of the test process

This all works, but I sometimes get a panic from Goon. It also occasionally seems to leave one of the server processes running as an orphan, requiring me to kill it manually before I can run the tests again (since it holds a port that the next test run needs).

Let me know if you need some more info. Thanks!

gridbox commented 7 years ago

I had a similar requirement to stop spawned, interactive Docker containers when the parent Elixir process aborted. I now also have the requirement for arbitrary scripts I execute. Here is a quick and dirty way I met the requirement in Linux:

  1. Create a named pipe: mkfifo /home/user/parent_signal_1 (it can be named anything meaningful, but should be unique for each instance of a child process)
  2. Create a bash script to start the child command and watch the named pipe for EOF:
    #!/bin/bash
    # Start the passed-in command in the background.
    # $2 is left unquoted on purpose so that 'sleep 60' splits into a command and its arguments.
    $2 &
    CHILD_PID=$!
    # Watch the named pipe passed in (this blocks until EOF is received from the Elixir process)
    cat "$1"
    # Send the signal of choice to the child process
    kill -s SIGKILL "$CHILD_PID"
  3. In the Elixir application, open an Erlang port on the named pipe and start the child process with the bash script via Porcelain. I use sleep 60 in this example, but that could be any script. It's important to use Process.link to link to the Porcelain pid so that the process which opened the Erlang port will send the EOF to kill the child in the event the Porcelain process itself aborts.

    _port = "/home/user/parent_signal_1"
        |> String.to_charlist
        |> :erlang.open_port([:eof])

    %Porcelain.Process{pid: pid} = "/home/user/start.sh /home/user/parent_signal_1 'sleep 60'" |> Porcelain.spawn_shell(in: :receive, out: {:send, self()}, err: :out, result: :discard)

    Process.link(pid)

That's it! The Docker container setup was slightly more involved and used `dumb-init` in order to kill PID 1, but the basic concept is the same. I doubt my approach can be codified into the official approach, but it has proven useful to me in the interim. Hopefully this is of some use to someone else trying to avoid zombie processes.

cookkkie commented 7 years ago

@gridbox If your child process finishes on its own, your wrapper script will hang, because it is still blocked reading the named pipe. So you're fixing one direction but breaking the other.
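One possible way to cover both directions, given only as a sketch: wait on the child itself and move the pipe watching into a background watcher that is stopped once the child exits. The helper name run_until_eof is hypothetical and not part of Porcelain or goon. The sketch assumes the same named-pipe setup as the script above and has only been reasoned through for bash and POSIX sh on Linux.

```shell
#!/usr/bin/env bash
# Sketch only: run_until_eof is a hypothetical helper, not part of Porcelain or goon.
# Usage: run_until_eof FIFO COMMAND [ARGS...]

run_until_eof() {
  local fifo="$1"
  shift

  # Start the command in the background and remember its PID.
  "$@" &
  local child=$!

  # Watcher: block until the FIFO reaches EOF, then kill the command.
  # The shell's own `read` is used instead of `cat`, so stopping the
  # watcher later leaves no stray helper process behind.
  (
    while IFS= read -r line; do :; done <"$fifo"
    kill -s KILL "$child" 2>/dev/null
  ) &
  local watcher=$!

  # Block on the command itself, so a normal exit is noticed too.
  local status=0
  wait "$child" || status=$?

  # The command is gone; stop the watcher so nothing keeps waiting on the FIFO.
  kill "$watcher" 2>/dev/null || true
  wait "$watcher" 2>/dev/null || true

  return "$status"
}
```

To use it as a wrapper script, a final line such as run_until_eof "$@" would be added, and the script would be called with the pipe path followed by the command. Unlike the original script, the command is passed as separate arguments rather than as one quoted string.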

adleroliveira commented 7 years ago

Any updates on this issue?

tallakt commented 6 years ago

I am also fighting lingering processes, and nothing in the docs describes how to handle this. I would also appreciate any update on this issue.

sailxjx commented 6 years ago

Is this helpful? https://hexdocs.pm/elixir/Port.html#module-zombie-processes

I hope porcelain can use this wrapper in the basic driver.
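For readers who have not followed the link: the page describes wrapping a program in a script that watches stdin and kills the program once stdin is closed, which is what happens to a port's program when the VM goes away. The following is a rough sketch of that idea, not a copy of the script on that page. The function name run_until_stdin_closes is hypothetical, file descriptor 3 is assumed to be free, and whether porcelain's drivers close stdin on every shutdown path is not confirmed in this thread.

```shell
#!/usr/bin/env bash
# Sketch only: run_until_stdin_closes is a hypothetical helper,
# not part of Porcelain, goon, or Elixir.
# Usage: run_until_stdin_closes COMMAND [ARGS...]

run_until_stdin_closes() {
  # Keep a copy of the real stdin on fd 3 (assumed to be free): background
  # jobs in a non-interactive shell get /dev/null as stdin by default.
  exec 3<&0

  # Start the command without the extra descriptor. Note that the command
  # itself cannot read from stdin in this sketch.
  "$@" 3<&- &
  local child=$!

  # Watcher: drain the saved stdin until EOF, then kill the command.
  (
    while IFS= read -r line; do :; done <&3
    kill -s KILL "$child" 2>/dev/null
  ) &
  local watcher=$!
  exec 3<&-

  # Wait for the command so that a normal exit is reported as well.
  local status=0
  wait "$child" || status=$?

  # Stop the watcher once the command is gone.
  kill "$watcher" 2>/dev/null || true
  wait "$watcher" 2>/dev/null || true

  return "$status"
}
```

As a standalone wrapper, the file would end with run_until_stdin_closes "$@" and the real command would be passed as its arguments.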

sailxjx commented 6 years ago

Finally, after trying every approach I could find in porcelain/os/port and elsewhere, I gave up.

I wrote a script called kill_goon.sh to kill all the orphan processes spawned by porcelain, and I call this script at the end of my task flow:

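#!/usr/bin/env bash
# Arrays below are a bash feature, so the script must not run under plain sh.
goon_pids=($(ps -e | grep goon | grep -v grep | grep -v kill_goon | awk '{print $1}'))
for pid in "${goon_pids[@]}"
do
  # Kill the children of each goon process
  pgrep -P "$pid" | xargs kill
done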