mojombo / god

Ruby process monitor
http://godrb.com
MIT License

Segmentation fault with Ruby Enterprise Edition 2011.12 #81

Open stonegao opened 12 years ago

stonegao commented 12 years ago

God works perfectly with previous REE versions; we have been running it for almost 2 years without problems.

After upgrading REE to ree-1.8.7-2011.12, we get the following error, and the god process itself disappears after starting the monitored processes.

OS is Ubuntu 10.04.3 LTS

/usr/local/rvm/rubies/ree-1.8.7-2011.12/lib/ruby/1.8/monitor.rb:173: [BUG] Segmentation fault
ruby 1.8.7 (2011-12-28 MBARI 8/0x6770 on patchlevel 357) [x86_64-linux], MBARI 0x6770, Ruby Enterprise Edition 2011.12

jasonroelofs commented 12 years ago

In my case, we upgraded to 2011.12 on 32-bit just fine, and it was working without any problems. I recently moved to a 64-bit box, also on 2011.12, and now I'm getting this segfault.

stcatz commented 12 years ago

I've got this problem too. Have you found a workaround for this issue?

dudupeters commented 12 years ago

I'm having the same problem. Any ideas?

I [2012-02-24 17:17:16]  INFO: Gateway staging-10000 [ok] process is running (ProcessRunning)
I [2012-02-24 17:17:17]  INFO: Gateway staging-10000 [ok] memory within bounds [38992kb, 38992kb, 38992kb, 38992kb, 38992kb] (MemoryUsage)
I [2012-02-24 17:17:17]  INFO: Gateway staging-10000 [ok] cpu within bounds [1.75055938927828%%, 1.75660823724432%%, 1.74224387723063%%, 1.72631347950894%%, 1.73060544008095%%] (CpuUsage)
I [2012-02-24 17:17:18]  INFO: Gateway staging-10000 [ok] (RestartFileTouched)
I [2012-02-24 17:17:21]  INFO: Gateway staging-10000 [ok] process is running (ProcessRunning)
I [2012-02-24 17:17:22]  INFO: Gateway staging-10000 [ok] memory within bounds [38992kb, 38992kb, 38992kb, 38992kb, 38992kb] (MemoryUsage)
/home/xxxx/.rvm/rubies/ree-1.8.7-2012.01/lib/ruby/1.8/monitor.rb:173: [BUG] Segmentation fault
ruby 1.8.7 (2011-12-28 MBARI 8/0x8770 on patchlevel 357) [i686-linux], MBARI 0x8770, Ruby Enterprise Edition 2012.01

jasonroelofs commented 12 years ago

REE 2012.02 was just released, and it claims to have a much more stable set of MBARI patches. I've yet to try it, but it may fix this segfault issue.

xavier commented 12 years ago

REE 2012.02 doesn't fix the issue on Debian running 2.6.38.2 (x86_64)

jaredonline commented 12 years ago

Same issue with ruby 1.8.7 (2012-02-08 MBARI 8/0x6770 on patchlevel 358) [x86_64-linux], MBARI 0x6770, Ruby Enterprise Edition 2012.02

gbc-pfischer commented 12 years ago

same issue here:

${HOME}/.rvm/rubies/ree-1.8.7-2011.12/lib/ruby/1.8/monitor.rb:173: [BUG] Segmentation fault

ruby version (64bit):

ruby 1.8.7 (2011-12-28 MBARI 8/0x6770 on patchlevel 357) [x86_64-linux], MBARI 0x6770, Ruby Enterprise Edition 2011.12

on Scientific Linux release 6.0 (Carbon) with 64bit (SELINUX disabled)

dovadi commented 12 years ago

Same issue here:

/usr/local/lib/ruby/1.8/monitor.rb:173: [BUG] Segmentation fault
ruby 1.8.7 (2012-02-08 MBARI 8/0x6770 on patchlevel 358) [x86_64-linux], MBARI 0x6770, Ruby Enterprise Edition 2012.02

on Ubuntu 10.04.4 LTS

jasonroelofs commented 12 years ago

So as it stands, I'm no longer using god at all but have instead switched to Upstart.

ahwatts commented 12 years ago

I'm seeing this issue, too. I have a cron job which runs periodically to check to see if god is running, and it seems to have to restart god every few hours.
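A watchdog of the kind described above could be sketched roughly as follows. This is not ahwatts' actual script: the function names and argument order are made up for illustration, and the only god options used are `-c` and `-P`, which also appear in the Upstart job posted later in this thread.

```shell
#!/bin/sh
# Watchdog sketch: restart god when the process named in its pid file is gone.
# All names here are hypothetical; adapt paths to your own installation.

# Returns 0 if the pid file exists and names a live process we may signal.
god_is_running() {
  pidfile="$1"
  [ -f "$pidfile" ] || return 1
  pid=$(cat "$pidfile")
  [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null
}

# Starts god only when it is not already running.
# Arguments: god binary, config file, pid file.
ensure_god_running() {
  god_bin="$1"; config="$2"; pidfile="$3"
  if god_is_running "$pidfile"; then
    return 0
  fi
  echo "god not running, restarting" >&2
  "$god_bin" -c "$config" -P "$pidfile"
}
```

A cron entry could then run a script that defines these functions and calls `ensure_god_running` with your own binary, config, and pid file paths. Note that `kill -0` only proves that some process with that pid exists and can be signalled by the current user, so a reused pid would be mistaken for a live god.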

ahwatts commented 12 years ago

I'm also on REE 2012.02, on Fedora 8 and 11:

Sometimes I see this message:

/usr/lib/ruby/1.8/monitor.rb:173: [BUG] gc_sweep(): unknown data type 0x0(0x12efe18)
ruby 1.8.7 (2012-02-08 MBARI 8/0x6770 on patchlevel 358) [x86_64-linux], MBARI 0x6770, Ruby Enterprise Edition 2012.02

and sometimes I see this message:

/usr/lib/ruby/1.8/monitor.rb:173: [BUG] Segmentation fault
ruby 1.8.7 (2012-02-08 MBARI 8/0x6770 on patchlevel 358) [x86_64-linux], MBARI 0x6770, Ruby Enterprise Edition 2012.02

sonnysideup commented 12 years ago

Experiencing the same issue on CentOS 5.4 ( 2.6.21.7-2.fc8xen x86_64 ) using REE 2012.02

andresbravog commented 12 years ago

Is anyone debugging the error, so we can try to solve the issue?

One solution that may work is to use RVM or another tool to choose which Ruby runs god, so you can use Ruby 1.9.2 for god and REE for everything else.

Update: using the Debian system Ruby 1.8 for god worked for me on our Linode Debian server. ;)
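Before launching god, a guard along these lines could confirm which interpreter it would run under. It is only a sketch: the function names are invented for illustration, and the check relies solely on the "Ruby Enterprise Edition" text that appears in the version banners quoted in this thread.

```shell
#!/bin/sh
# Returns 0 when a `ruby -v` banner identifies Ruby Enterprise Edition.
# Assumption: REE banners contain the literal text matched below,
# as in the crash reports quoted in this thread.
is_ree_banner() {
  case "$1" in
    *"Ruby Enterprise Edition"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Prints a warning and returns 1 if the given ruby binary reports itself as REE.
# Argument: path to the ruby that would run god (hypothetical).
check_god_ruby() {
  banner=$("$1" -v 2>&1)
  if is_ree_banner "$banner"; then
    echo "warning: $1 is REE; consider another ruby for god" >&2
    return 1
  fi
  return 0
}
```

For example, `check_god_ruby /usr/bin/ruby` would return 0 for a system Ruby whose banner does not mention REE.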

jasonroelofs commented 12 years ago

The only thing I've been able to find is that other libraries are having this same issue:

https://github.com/fastestforward/instrumental_agent/wiki/Using-with-Ruby-Enterprise-Edition

They claim to have fixed it, maybe someone can tease out the fix from their commit log?

cromulus commented 12 years ago

having the same problem. gonna try using a different ruby

donovanbray commented 12 years ago

I'm also having the same issue on 3 boxes running Ubuntu 10.04.

/custom/ree/lib/ruby/1.8/monitor.rb:173: [BUG] Segmentation fault
ruby 1.8.7 (2012-02-08 MBARI 8/0x6770 on patchlevel 358) [x86_64-linux], MBARI 0x6770, Ruby Enterprise Edition 2012.02

donovanbray commented 12 years ago

We've also seen the gc_sweep error:

/custom/ree/lib/ruby/1.8/monitor.rb:173: [BUG] gc_sweep(): unknown data type 0x0(0x34a6458)
ruby 1.8.7 (2012-02-08 MBARI 8/0x6770 on patchlevel 358) [x86_64-linux], MBARI 0x6770, Ruby Enterprise Edition 2012.02

hui commented 12 years ago

having the same issue:

hui@ubuntu:/data/god/god_config$ tail -f /var/log/god.log
I [2012-07-07 09:00:11] INFO: Syslog enabled.
I [2012-07-07 09:00:11] INFO: Using pid file directory: /var/run/god
/home/hui/.rvm/rubies/ree-1.8.7-2012.02/lib/ruby/1.8/monitor.rb:173: [BUG] Segmentation fault
ruby 1.8.7 (2012-02-08 MBARI 8/0x8770 on patchlevel 358) [i686-linux], MBARI 0x8770, Ruby Enterprise Edition 2012.02

tonybyrne commented 12 years ago

Oh the irony! I've just installed god to keep an eye on a service that periodically segfaults on me and now I find I'm bitten by the REE bug.

donovanbray commented 12 years ago

I haven't been able to fix the issue, but I made a temporary band-aid using Upstart. It will at least restart god when it segfaults.

Put this (with your correct paths) in /etc/init/god.conf (assuming you're on a distro with Upstart or have otherwise installed it):

description "Ruby God Monitor"
author "Donovan Bray donnoman@donovanbray.com"

# automatically start
start on (hostname and syslog)

stop on runlevel [016]

# Run before process
pre-start script
  mkdir -p `dirname /var/www/application/shared/log/god.log`
  mkdir -p `dirname /var/www/application/shared/pids/god.pid`
end script

# Essentially lets upstart know the process will detach itself to the background
expect fork
kill timeout 10
respawn

# command to run
exec /opt/ree/bin/god -c /var/www/application/current/config/daemons.god -P /var/www/application/shared/pids/god.pid --log-level info --log /var/www/application/shared/log/god.log

# Run after process (this allows any user to issue god commands; remove it if you don't want it)
post-start script
    sleep 3 && sh -c "chmod 0777 /tmp/god.*.sock;true"
end script

eric commented 12 years ago

I've run into this as well. I'll see what I can dig up...

zekefast commented 12 years ago

I get an error like this, but I don't use god. Error:

$ gem install bundler
/home/zekefast/.rvm/rubies/ree-head/lib/ruby/1.8/timeout.rb:60: [BUG] Segmentation fault
ruby 1.8.7 (2012-02-08 patchlevel 358) [x86_64-linux]

Same for 2012.02. I use RVM.

$ gcc --version
gcc-4.7.real (Debian 4.7.1-2) 4.7.1
Copyright (C) 2012 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

I am able to run REE without this error when I install it like this:

CC=/usr/bin/gcc-4.4 rvm install ree

Possibly, it could help someone else ...
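The compiler choice above could be automated with a small check like the one below. Treat it as a sketch under assumptions: the 4.7 threshold is inferred only from the gcc 4.7.1 report in this comment, the function names are hypothetical, and the `/usr/bin/gcc-4.4` path is simply the one quoted above.

```shell
#!/bin/sh
# Returns 0 when a gcc version string such as "4.7.1" or "9" is 4.7 or newer.
# Assumption: 4.7 is the first affected series (inferred from this thread only).
gcc_is_4_7_or_newer() {
  major=${1%%.*}
  rest=${1#*.}
  [ "$rest" = "$1" ] && rest=0   # no dot in the string, e.g. "9"
  minor=${rest%%.*}
  [ "$major" -gt 4 ] || { [ "$major" -eq 4 ] && [ "$minor" -ge 7 ]; }
}

# Prints (does not run) an install command suited to the given gcc version.
suggest_ree_install() {
  if gcc_is_4_7_or_newer "$1"; then
    echo "CC=/usr/bin/gcc-4.4 rvm install ree"
  else
    echo "rvm install ree"
  fi
}
```

Calling `suggest_ree_install "$(gcc -dumpversion)"` prints the command to use; it only prints, so nothing is installed until you run the suggested line yourself.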

vakuum commented 12 years ago

Same error on SUSE Linux Enterprise Server 11 SP2 and Ruby Enterprise Edition 2012.02 with god 0.12.1 on x86_64:

...
/opt/erec/ruby-enterprise-1.8.7-2012.02/lib/ruby/1.8/monitor.rb:173: [BUG] Bus Error
ruby 1.8.7 (2012-02-08 MBARI 8/0x6770 on patchlevel 358) [x86_64-linux], MBARI 0x6770, Ruby Enterprise Edition 2012.02
...
/opt/erec/ruby-enterprise-1.8.7-2012.02/lib/ruby/1.8/monitor.rb:173: [BUG] Segmentation fault
ruby 1.8.7 (2012-02-08 MBARI 8/0x6770 on patchlevel 358) [x86_64-linux], MBARI 0x6770, Ruby Enterprise Edition 2012.02
...

Maybe this has something to do with an already fixed problem:

Ruby Enterprise Edition 1.8.7-2010.01 released:

Fix a crash bug in the zero-copy context switching patch set

This crash can be reproduced by running "god", which will eventually cause a crash. Aman Gupta has fixed this problem.

Please note that the zero-copy context switching patch set is disabled by default, and must be explicitly enabled by passing --fast-threading to the installer. It is currently still marked as experimental because there are some known issues with the Kernel::fork method. Issue #9.

Ruby Enterprise Edition 1.8.7-2012.02 released:

Experimental zero-copy context switch patch removed

This experimental patch set was never production-ready, so as of this release it has been removed.

bonyiii commented 11 years ago

This worked for me (on opensuse 12.2): http://deadc.org/blog/2012/10/19/rvm-install-ruby-1-dot-8-7-with-gcc-4-dot-7/

rvm remove ree
export CFLAGS="-O2 -fno-tree-dce -fno-optimize-sibling-calls"
rvm install ree

mikhailov commented 11 years ago

Any ideas how to fix this bug? Should we just use a standard Ruby for god monitoring?

mikhailov commented 11 years ago

Fixed by installing the non-enterprise edition of Ruby.