mojombo / god

Ruby process monitor
http://godrb.com
MIT License

God will not terminate cleanly on FreeBSD 10 #158

Open rtyler opened 10 years ago

rtyler commented 10 years ago

I'm running FreeBSD 10.0-STABLE with God 0.13.3 and I'm unable to cleanly invoke terminate.

The behavior is that god terminate hangs: god shuts down the sub-processes but then fails to exit itself.

I've been able to run and terminate God from the source tree with bundler using:

  gem 'god', :path => '~/source/github/gems/god'

This unfortunately doesn't make much sense to me. I've spent at least a day trying to debug and code-spelunk and nothing has come of it.

The god terminate process has the following backtrace when I interrupt it:

......................................................^CUncaught exception

/home/tyler/.rvm/rubies/ruby-1.9.3-p484/lib/ruby/1.9.1/drb/drb.rb:566:in `read'
/home/tyler/.rvm/rubies/ruby-1.9.3-p484/lib/ruby/1.9.1/drb/drb.rb:566:in `load'
/home/tyler/.rvm/rubies/ruby-1.9.3-p484/lib/ruby/1.9.1/drb/drb.rb:632:in `recv_reply'
/home/tyler/.rvm/rubies/ruby-1.9.3-p484/lib/ruby/1.9.1/drb/drb.rb:918:in `recv_reply'
/home/tyler/.rvm/rubies/ruby-1.9.3-p484/lib/ruby/1.9.1/drb/drb.rb:1197:in `send_message'
/home/tyler/.rvm/rubies/ruby-1.9.3-p484/lib/ruby/1.9.1/drb/drb.rb:1088:in `block (2 levels) in method_missing'
/home/tyler/.rvm/rubies/ruby-1.9.3-p484/lib/ruby/1.9.1/drb/drb.rb:1172:in `open'
/home/tyler/.rvm/rubies/ruby-1.9.3-p484/lib/ruby/1.9.1/drb/drb.rb:1087:in `block in method_missing'
/home/tyler/.rvm/rubies/ruby-1.9.3-p484/lib/ruby/1.9.1/drb/drb.rb:1105:in `with_friend'
/home/tyler/.rvm/rubies/ruby-1.9.3-p484/lib/ruby/1.9.1/drb/drb.rb:1086:in `method_missing'
/home/tyler/.rvm/gems/ruby-1.9.3-p484@rubygems/gems/god-0.13.3/lib/god/cli/command.rb:183:in `terminate_command'
/home/tyler/.rvm/gems/ruby-1.9.3-p484@rubygems/gems/god-0.13.3/lib/god/cli/command.rb:30:in `dispatch'
/home/tyler/.rvm/gems/ruby-1.9.3-p484@rubygems/gems/god-0.13.3/lib/god/cli/command.rb:10:in `initialize'
/home/tyler/.rvm/gems/ruby-1.9.3-p484@rubygems/gems/god-0.13.3/bin/god:121:in `new'
/home/tyler/.rvm/gems/ruby-1.9.3-p484@rubygems/gems/god-0.13.3/bin/god:121:in `<top (required)>'
/home/tyler/.rvm/gems/ruby-1.9.3-p484@rubygems/bin/god:19:in `load'
/home/tyler/.rvm/gems/ruby-1.9.3-p484@rubygems/bin/god:19:in `<main>'
/home/tyler/.rvm/gems/ruby-1.9.3-p484@rubygems/bin/ruby_executable_hooks:15:in `eval'
/home/tyler/.rvm/gems/ruby-1.9.3-p484@rubygems/bin/ruby_executable_hooks:15:in `<main>'

I've also added some code to dump all the currently running threads every 10 seconds, which gives me:

#<Thread:0x000008021b81b8 sleep>
["/home/tyler/.rvm/gems/ruby-1.9.3-p484@rubygems/gems/god-0.13.3/lib/god.rb:723:in `join'", "/home/tyler/.rvm/gems/ruby-1.9.3-p484@rubygems/gems/god-0.13.3/lib/god.rb:736:in `at_exit'", "/home/tyler/.rvm/gems/ruby-1.9.3-p484@rubygems/gems/god-0.13.3/lib/god.rb:770:in `block in <top (required)>'"]

#<Thread:0x00000804ab1498 sleep>
["/home/tyler/.rvm/gems/ruby-1.9.3-p484@rubygems/gems/god-0.13.3/lib/god/event_handlers/kqueue_handler.rb:13:in `block in handle_events'", "/home/tyler/.rvm/rubies/ruby-1.9.3-p484/lib/ruby/1.9.1/timeout.rb:69:in `timeout'", "/home/tyler/.rvm/gems/ruby-1.9.3-p484@rubygems/gems/god-0.13.3/lib/god/event_handlers/kqueue_handler.rb:12:in `handle_events'", "/home/tyler/.rvm/gems/ruby-1.9.3-p484@rubygems/gems/god-0.13.3/lib/god/event_handler.rb:62:in `block (2 levels) in start'", "/home/tyler/.rvm/gems/ruby-1.9.3-p484@rubygems/gems/god-0.13.3/lib/god/event_handler.rb:60:in `loop'", "/home/tyler/.rvm/gems/ruby-1.9.3-p484@rubygems/gems/god-0.13.3/lib/god/event_handler.rb:60:in `block in start'"]

#<Thread:0x00000804ab8ce8 run>
["/usr/home/tyler/source/lookout/git/bluffdale/bluffdale.god:10:in `block (4 levels) in <top (required)>'", "/usr/home/tyler/source/lookout/git/bluffdale/bluffdale.god:9:in `each'", "/usr/home/tyler/source/lookout/git/bluffdale/bluffdale.god:9:in `block (3 levels) in <top (required)>'", "/usr/home/tyler/source/lookout/git/bluffdale/bluffdale.god:8:in `open'", "/usr/home/tyler/source/lookout/git/bluffdale/bluffdale.god:8:in `block (2 levels) in <top (required)>'", "/usr/home/tyler/source/lookout/git/bluffdale/bluffdale.god:7:in `loop'", "/usr/home/tyler/source/lookout/git/bluffdale/bluffdale.god:7:in `block in <top (required)>'"]

#<Thread:0x00000804abd590 sleep>
["/home/tyler/.rvm/rubies/ruby-1.9.3-p484/lib/ruby/1.9.1/thread.rb:71:in `wait'", "/home/tyler/.rvm/rubies/ruby-1.9.3-p484/lib/ruby/1.9.1/monitor.rb:110:in `wait'", "/home/tyler/.rvm/gems/ruby-1.9.3-p484@rubygems/gems/god-0.13.3/lib/god/driver.rb:120:in `block in pop'", "/home/tyler/.rvm/rubies/ruby-1.9.3-p484/lib/ruby/1.9.1/monitor.rb:211:in `mon_synchronize'", "/home/tyler/.rvm/gems/ruby-1.9.3-p484@rubygems/gems/god-0.13.3/lib/god/driver.rb:117:in `pop'", "/home/tyler/.rvm/gems/ruby-1.9.3-p484@rubygems/gems/god-0.13.3/lib/god/driver.rb:181:in `block (2 levels) in initialize'", "/home/tyler/.rvm/gems/ruby-1.9.3-p484@rubygems/gems/god-0.13.3/lib/god/driver.rb:179:in `loop'", "/home/tyler/.rvm/gems/ruby-1.9.3-p484@rubygems/gems/god-0.13.3/lib/god/driver.rb:179:in `block in initialize'"]

#<Thread:0x000008021aa540 sleep>
["/home/tyler/.rvm/rubies/ruby-1.9.3-p484/lib/ruby/1.9.1/drb/unix.rb:98:in `accept'", "/home/tyler/.rvm/rubies/ruby-1.9.3-p484/lib/ruby/1.9.1/drb/drb.rb:1574:in `main_loop'", "/home/tyler/.rvm/rubies/ruby-1.9.3-p484/lib/ruby/1.9.1/drb/drb.rb:1424:in `block in run'"]

#<Thread:0x000008021a97a8 sleep>
["/home/tyler/.rvm/gems/ruby-1.9.3-p484@rubygems/gems/god-0.13.3/lib/god.rb:714:in `block (2 levels) in start'", "/home/tyler/.rvm/gems/ruby-1.9.3-p484@rubygems/gems/god-0.13.3/lib/god.rb:713:in `loop'", "/home/tyler/.rvm/gems/ruby-1.9.3-p484@rubygems/gems/god-0.13.3/lib/god.rb:713:in `block in start'"]

#<Thread:0x00000802984c98 sleep>
["/home/tyler/.rvm/rubies/ruby-1.9.3-p484/lib/ruby/1.9.1/timeout.rb:62:in `block in timeout'"]

uname -a spits out: FreeBSD mango 10.0-STABLE FreeBSD 10.0-STABLE #9 r261719: Mon Feb 10 16:28:30 PST 2014 root@mango:/usr/obj/usr/src/sys/GENERIC amd64 (fwiw)
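
For anyone who wants to reproduce the thread dump above, here is a minimal sketch of how such a dumper could look. It is not the exact code used for the output in this report; the helper names thread_snapshot and start_thread_dumper are hypothetical, and it relies only on Thread.list, Thread#inspect and Thread#backtrace from core Ruby.

```ruby
# Hypothetical debugging helpers, not part of god itself.

# Returns one [description, backtrace] pair per live thread.
def thread_snapshot
  Thread.list.map do |thread|
    # Thread#backtrace returns nil if the thread died in the meantime.
    [thread.inspect, thread.backtrace || []]
  end
end

# Starts a background thread that writes a snapshot of all threads
# to `out` every `interval` seconds. Returns the dumper thread so
# the caller can kill it later.
def start_thread_dumper(interval = 10, out = $stderr)
  Thread.new do
    loop do
      sleep interval
      thread_snapshot.each do |description, backtrace|
        out.puts description
        out.puts backtrace.inspect
        out.puts
      end
    end
  end
end
```

Calling start_thread_dumper early in the god config file (an assumption about where to hook it in) should print the state of every thread, including the dumper itself, every 10 seconds.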

skull-squadron commented 9 years ago

We just encountered similar behavior too, but more importantly, respawning on FreeBSD appears broken (#210)

Workaround: god stop; god terminate

(on god 0.13.5 / MRI 2.2.0 / FreeBSD 10.1-RELEASE-p4)

Seems to Do The Right Thing :tm:
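
If the workaround has to be scripted from Ruby rather than typed into a shell, a rough sketch could look like the following. The helper name run_all is hypothetical, the argument lists simply mirror the two commands as written above, and god is assumed to be on the PATH. Like the shell's `;`, it runs every command regardless of whether the previous one succeeded.

```ruby
# Hypothetical helper: runs each command in order, without stopping on
# failure, and returns the list of results from Kernel#system
# (true on success, false on non-zero exit, nil if the command
# could not be executed at all).
def run_all(commands)
  commands.map { |argv| system(*argv) }
end

# Usage, mirroring the workaround (left commented out so that loading
# this file does not touch a running god instance):
#
#   results = run_all([%w[god stop], %w[god terminate]])
#   warn "workaround did not fully succeed: #{results.inspect}" unless results.all?
```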

Thesephi commented 9 years ago

When I manage god via an init script (as discussed in this post: http://www.synbioz.com/blog/monitoring_server_processes_with_god), I run into kind of an 'opposite' problem. In my case, god fails to exit all sub-processes (watches), but successfully shuts itself down (after a 10-second timeout).

Calling god terminate just doesn't cut it. god stop; god terminate comes to the rescue. Thanks @steakknife for pointing this out!

skull-squadron commented 9 years ago

Np. I donkey-punched God somewhere in a commit in foreman_god upstream to 'make it just work'TM. You can try to add that extra bit (a '<<' one-liner), if you can apply patches. Mod downside: if God dies, it might nondeterministically take your $$$ shopping cart or triple-secret hft strats to hell (which, I hear, is very nice this time of year) with it. Use-case
