kostya / eye

Process monitoring tool. Inspired by Bluepill and God.
MIT License
1.19k stars, 86 forks

Crash Celluloid::DeadTaskError #129

Closed maxrabin closed 9 years ago

maxrabin commented 9 years ago
03.05.2015 11:53:51 ERROR -- [celluloid] Eye::Process crashed!
Celluloid::DeadTaskError: cannot resume a dead task (can't alloc machine stack to fiber)
        /usr/local/share/ruby/gems/2.0/gems/celluloid-0.15.2/lib/celluloid/tasks/task_fiber.rb:27:in `rescue in deliver'
        /usr/local/share/ruby/gems/2.0/gems/celluloid-0.15.2/lib/celluloid/tasks/task_fiber.rb:23:in `deliver'
        /usr/local/share/ruby/gems/2.0/gems/celluloid-0.15.2/lib/celluloid/tasks.rb:98:in `resume'
        /usr/local/share/ruby/gems/2.0/gems/celluloid-0.15.2/lib/celluloid/actor.rb:418:in `task'
        /usr/local/share/ruby/gems/2.0/gems/celluloid-0.15.2/lib/celluloid/actor.rb:268:in `block in every'
        /usr/local/share/ruby/gems/2.0/gems/timers-1.1.0/lib/timers.rb:98:in `call'
        /usr/local/share/ruby/gems/2.0/gems/timers-1.1.0/lib/timers.rb:98:in `fire'
        /usr/local/share/ruby/gems/2.0/gems/timers-1.1.0/lib/timers.rb:55:in `fire'
        /usr/local/share/ruby/gems/2.0/gems/celluloid-0.15.2/lib/celluloid/actor.rb:177:in `run'
        /usr/local/share/ruby/gems/2.0/gems/celluloid-0.15.2/lib/celluloid/actor.rb:157:in `block in initialize'
        /usr/local/share/ruby/gems/2.0/gems/celluloid-0.15.2/lib/celluloid/thread_handle.rb:13:in `block in initialize'
        /usr/local/share/ruby/gems/2.0/gems/celluloid-0.15.2/lib/celluloid/internal_pool.rb:100:in `call'
        /usr/local/share/ruby/gems/2.0/gems/celluloid-0.15.2/lib/celluloid/internal_pool.rb:100:in `block in create'
03.05.2015 11:53:51 ERROR -- [celluloid] Eye::Process: ERROR HANDLER CRASHED!
Celluloid::DeadTaskError: cannot resume a dead task (can't alloc machine stack to fiber)
        /usr/local/share/ruby/gems/2.0/gems/celluloid-0.15.2/lib/celluloid/tasks/task_fiber.rb:27:in `rescue in deliver'
        /usr/local/share/ruby/gems/2.0/gems/celluloid-0.15.2/lib/celluloid/tasks/task_fiber.rb:23:in `deliver'
        /usr/local/share/ruby/gems/2.0/gems/celluloid-0.15.2/lib/celluloid/tasks.rb:98:in `resume'
        /usr/local/share/ruby/gems/2.0/gems/celluloid-0.15.2/lib/celluloid/actor.rb:418:in `task'
        /usr/local/share/ruby/gems/2.0/gems/celluloid-0.15.2/lib/celluloid/actor.rb:386:in `run_finalizer'
        /usr/local/share/ruby/gems/2.0/gems/celluloid-0.15.2/lib/celluloid/actor.rb:374:in `shutdown'
        /usr/local/share/ruby/gems/2.0/gems/celluloid-0.15.2/lib/celluloid/actor.rb:367:in `handle_crash'
        /usr/local/share/ruby/gems/2.0/gems/celluloid-0.15.2/lib/celluloid/actor.rb:187:in `rescue in run'
        /usr/local/share/ruby/gems/2.0/gems/celluloid-0.15.2/lib/celluloid/actor.rb:171:in `run'
        /usr/local/share/ruby/gems/2.0/gems/celluloid-0.15.2/lib/celluloid/actor.rb:157:in `block in initialize'
        /usr/local/share/ruby/gems/2.0/gems/celluloid-0.15.2/lib/celluloid/thread_handle.rb:13:in `block in initialize'
        /usr/local/share/ruby/gems/2.0/gems/celluloid-0.15.2/lib/celluloid/internal_pool.rb:100:in `call'
        /usr/local/share/ruby/gems/2.0/gems/celluloid-0.15.2/lib/celluloid/internal_pool.rb:100:in `block in create'
kostya commented 9 years ago

I've never run into this. Which eye version are you on, and under what conditions can it be reproduced? Celluloid has only one issue about it: https://github.com/celluloid/celluloid/issues/442. It looks like you are out of memory, or maybe you have too many processes. What does eye x output?

maxrabin commented 9 years ago

eye x returns: unexpected server response :corrupted_data

Eye version: Eye v0.6.4 (c) 2012-2015 @kostya

kostya commented 9 years ago

Yes, that's because Celluloid died; run eye q. I meant, what does eye x show before it dies?

maxrabin commented 9 years ago
$ eye x
about:     Eye v0.6.4 (c) 2012-2015 @kostya
resources: 14:24, 0%, 28Mb, <1648>
ruby:      ruby 2.0.0p643 (2015-02-25) [x86_64-linux]
gems:      ["Celluloid=0.15.2", "Celluloid::IO=0.15.0", "StateMachine=1.2.0", "NIO=1.1.0", "Timers=1.1.0", "Sigar=1.7.0.0"]
logger:    /media/ephemeral0/log/eye.log
dir:       /home/ec2-user/.eye
pid_path:  /home/ec2-user/.eye/pid
sock_path: /home/ec2-user/.eye/sock
actors:    [["Eye::Process", 23], ["Eye::Utils::CelluloidChain", 16], ["Celluloid::SupervisionGroup", 2], ["Eye::SystemResources::Cache", 1], ["Eye::Group", 1], ["Eye::Server", 1], ["Celluloid::IncidentReporter", 1], ["Celluloid::Notifications::Fanout", 1], ["Eye::Controller", 1]]
kostya commented 9 years ago

Maybe you ran out of memory. When does this happen? What are the preceding lines in the log?

kostya commented 9 years ago

Not solved? Did you try another Ruby version?

digitalextremist commented 9 years ago

I would try adding swap to make sure this is truly an out of memory error.

nisanthchunduru commented 7 years ago

@digitalextremist is likely correct

Eye crashed today on our Digital Ocean droplet because it couldn't allocate memory

04.03.2017 20:09:37 INFO  -- [samson:puma] switch :starting [:unmonitored => :starting] start by user
04.03.2017 20:09:37 INFO  -- [samson:puma] executing: `/home/rails/.rbenv/bin/rbenv exec bundle exec puma --environment production --bind tcp://127.0.0.1:3000 --pidfile /home/rails/apps/samson/tmp/pids/puma.pid --daemon` with start_timeout: 15.0s, start_grace: 15.0s, env: 'RBENV_VERSION=2.3.1' (in /home/rails/apps/samson)
04.03.2017 20:09:39 INFO  -- [samson:puma] sleeping for :start_grace 15.0
04.03.2017 20:09:54 INFO  -- [samson:puma] load_external_pid_file: process <16369> from pid_file found and running (identity: ok) (puma 3.4.0 (tcp://127.0.0.1:3000) [samson])
04.03.2017 20:09:54 INFO  -- [samson:puma] switch :started [:starting => :up] start by user
04.03.2017 20:09:54 INFO  -- [samson:puma] <= start
08.03.2017 10:03:55 ERROR -- [celluloid] Actor crashed!
Celluloid::DeadTaskError: cannot resume a dead task (can't alloc machine stack to fiber: Cannot allocate memory)
        /home/rails/.rbenv/versions/2.2.3/lib/ruby/gems/2.2.0/gems/celluloid-0.17.3/lib/celluloid/task/fibered.rb:30:in `rescue in deliver'
        /home/rails/.rbenv/versions/2.2.3/lib/ruby/gems/2.2.0/gems/celluloid-0.17.3/lib/celluloid/task/fibered.rb:26:in `deliver'
        /home/rails/.rbenv/versions/2.2.3/lib/ruby/gems/2.2.0/gems/celluloid-0.17.3/lib/celluloid/task.rb:83:in `resume'
        /home/rails/.rbenv/versions/2.2.3/lib/ruby/gems/2.2.0/gems/celluloid-0.17.3/lib/celluloid/actor.rb:341:in `task'
        /home/rails/.rbenv/versions/2.2.3/lib/ruby/gems/2.2.0/gems/celluloid-0.17.3/lib/celluloid/actor.rb:244:in `block in every'
        /home/rails/.rbenv/versions/2.2.3/lib/ruby/gems/2.2.0/gems/timers-4.1.2/lib/timers/timer.rb:98:in `call'
        /home/rails/.rbenv/versions/2.2.3/lib/ruby/gems/2.2.0/gems/timers-4.1.2/lib/timers/timer.rb:98:in `fire'
        /home/rails/.rbenv/versions/2.2.3/lib/ruby/gems/2.2.0/gems/timers-4.1.2/lib/timers/events.rb:43:in `fire'
        /home/rails/.rbenv/versions/2.2.3/lib/ruby/gems/2.2.0/gems/timers-4.1.2/lib/timers/events.rb:81:in `block in fire'
        /home/rails/.rbenv/versions/2.2.3/lib/ruby/gems/2.2.0/gems/timers-4.1.2/lib/timers/events.rb:80:in `reverse_each'
        /home/rails/.rbenv/versions/2.2.3/lib/ruby/gems/2.2.0/gems/timers-4.1.2/lib/timers/events.rb:80:in `fire'
        /home/rails/.rbenv/versions/2.2.3/lib/ruby/gems/2.2.0/gems/timers-4.1.2/lib/timers/group.rb:95:in `fire'
        /home/rails/.rbenv/versions/2.2.3/lib/ruby/gems/2.2.0/gems/timers-4.1.2/lib/timers/group.rb:80:in `wait'
        /home/rails/.rbenv/versions/2.2.3/lib/ruby/gems/2.2.0/gems/celluloid-0.17.3/lib/celluloid/actor.rb:152:in `run'
        /home/rails/.rbenv/versions/2.2.3/lib/ruby/gems/2.2.0/gems/celluloid-0.17.3/lib/celluloid/actor.rb:131:in `block in start'
        /home/rails/.rbenv/versions/2.2.3/lib/ruby/gems/2.2.0/gems/celluloid-essentials-0.20.5/lib/celluloid/internals/thread_handle.rb:14:in `block in initialize'
        /home/rails/.rbenv/versions/2.2.3/lib/ruby/gems/2.2.0/gems/celluloid-0.17.3/lib/celluloid/actor/system.rb:78:in `block in get_thread'
        /home/rails/.rbenv/versions/2.2.3/lib/ruby/gems/2.2.0/gems/celluloid-0.17.3/lib/celluloid/group/spawner.rb:50:in `call'
        /home/rails/.rbenv/versions/2.2.3/lib/ruby/gems/2.2.0/gems/celluloid-0.17.3/lib/celluloid/group/spawner.rb:50:in `block in instantiate'

I found that our droplet didn't have any swap space. Apparently, Digital Ocean droplets don't have any swap space by default.

rails@samson:~$ free -h
              total        used        free      shared  buff/cache   available
Mem:           992M        322M        515M         11M        154M        507M
Swap:            0B          0B          0B

I've now enabled swap (https://www.digitalocean.com/community/tutorials/how-to-add-swap-space-on-ubuntu-16-04); I think that'll solve the issue.
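
For anyone who wants to check for the same condition from Ruby rather than reading free output by hand, here is a minimal sketch. It assumes a Linux host where /proc/meminfo is readable; parse_meminfo and swapless? are hypothetical helper names made up for this example and are not part of eye or Celluloid.

```ruby
# Parse /proc/meminfo-style text into a Hash of field name => value in kB.
# Fields whose names contain non-word characters (e.g. "Active(anon)") are skipped.
def parse_meminfo(text)
  text.each_line.with_object({}) do |line, info|
    # Lines look like "SwapTotal:             0 kB"
    if (m = line.match(/\A(\w+):\s+(\d+)/))
      info[m[1]] = m[2].to_i
    end
  end
end

# True when the host has no swap configured at all (hypothetical helper).
def swapless?(meminfo)
  meminfo.fetch("SwapTotal", 0).zero?
end

if __FILE__ == $PROGRAM_NAME && File.readable?("/proc/meminfo")
  info = parse_meminfo(File.read("/proc/meminfo"))
  puts "MemAvailable: #{info['MemAvailable']} kB"
  puts(swapless?(info) ? "no swap configured" : "swap: #{info['SwapTotal']} kB")
end
```

On a host like the droplet above, where free reports Swap: 0B, swapless? would return true. That only shows swap is absent; it does not prove that missing swap caused the crash.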

grimm26 commented 7 years ago

Yes, if your node ever runs out of memory, eye will lose its mind. I monitor for "Actor crashed" in my eye log file to swoop in to restart eye if that ever happens.
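
The log check described above could look something like the following. This is only a sketch, not grimm26's actual setup: scan_for_crashes is a hypothetical helper, and the default pattern is an assumption based on the "[celluloid] ... crashed!" lines in the logs pasted in this thread. What to do on a hit (alerting, restarting eye) is left to the caller.

```ruby
# Return [crash_lines, next_offset] for the log at `path`, reading from byte
# offset `from`. The default pattern is an assumption taken from the log lines
# quoted in this thread ("Eye::Process crashed!", "Actor crashed!").
def scan_for_crashes(path, from = 0, pattern: /ERROR -- \[celluloid\] .*crashed!/i)
  File.open(path, "r") do |f|
    from = 0 if from > f.size # log was rotated or truncated, start over
    f.seek(from)
    hits = f.each_line.select { |line| line.match?(pattern) }
    [hits, f.pos]
  end
end

# Example polling loop (the log path is the one from the eye x output above):
#
#   offset = 0
#   loop do
#     hits, offset = scan_for_crashes("/media/ephemeral0/log/eye.log", offset)
#     warn "eye crashed: #{hits.first}" unless hits.empty?
#     sleep 30
#   end
```

Tracking the byte offset between polls means each crash line is reported once, not on every pass.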