If a dealer issues a large number of async messages, the reactor deadlocks.
Here are the code snippets to reproduce the issue: router and dealer.
Depending on your test box, you might have to increase the number of asynchronous requests. In my case, ~2000 was sufficient.
A call trace indicating the deadlock:
^CD, [2015-03-15T01:00:00.369225 #9950] DEBUG -- : Terminating 1 actor...
^CW, [2015-03-15T01:00:02.347057 #9950] WARN -- : Terminating task: type=:call, meta={:method_name=>:run}, status=:zmqwait
Celluloid::TaskFiber backtrace unavailable. Please try `Celluloid.task_class = Celluloid::TaskThread` if you need backtraces here.
/home/niam/.gem/ruby/2.2.0/bundler/gems/ffi-rzmq-eba1101e44f1/lib/ffi-rzmq/util.rb:45:in `zmq_strerror': Interrupt
from /home/niam/.gem/ruby/2.2.0/bundler/gems/ffi-rzmq-eba1101e44f1/lib/ffi-rzmq/util.rb:45:in `error_string'
from /projects/celluloid/celluloid-zmq/lib/celluloid/zmq/waker.rb:30:in `block in signal'
from /projects/celluloid/celluloid-zmq/lib/celluloid/zmq/waker.rb:28:in `synchronize'
from /projects/celluloid/celluloid-zmq/lib/celluloid/zmq/waker.rb:28:in `signal'
from /home/niam/.gem/ruby/2.2.0/bundler/gems/celluloid-21b8c5bd5e65/lib/celluloid/evented_mailbox.rb:29:in `<<'
from /home/niam/.gem/ruby/2.2.0/bundler/gems/celluloid-21b8c5bd5e65/lib/celluloid/proxies/actor_proxy.rb:35:in `terminate!'
from /home/niam/.gem/ruby/2.2.0/bundler/gems/celluloid-21b8c5bd5e65/lib/celluloid/proxies/cell_proxy.rb:65:in `terminate!'
from /home/niam/.gem/ruby/2.2.0/bundler/gems/celluloid-21b8c5bd5e65/lib/celluloid/actor_system.rb:72:in `block (2 levels) in shutdown'
from /home/niam/.gem/ruby/2.2.0/bundler/gems/celluloid-21b8c5bd5e65/lib/celluloid/actor_system.rb:70:in `each'
from /home/niam/.gem/ruby/2.2.0/bundler/gems/celluloid-21b8c5bd5e65/lib/celluloid/actor_system.rb:70:in `block in shutdown'
from /usr/lib/ruby/2.2.0/timeout.rb:89:in `block in timeout'
from /usr/lib/ruby/2.2.0/timeout.rb:34:in `block in catch'
from /usr/lib/ruby/2.2.0/timeout.rb:34:in `catch'
from /usr/lib/ruby/2.2.0/timeout.rb:34:in `catch'
from /usr/lib/ruby/2.2.0/timeout.rb:104:in `timeout'
from /home/niam/.gem/ruby/2.2.0/bundler/gems/celluloid-21b8c5bd5e65/lib/celluloid/actor_system.rb:66:in `shutdown'
from /home/niam/.gem/ruby/2.2.0/bundler/gems/celluloid-21b8c5bd5e65/lib/celluloid.rb:156:in `shutdown'
from /home/niam/.gem/ruby/2.2.0/bundler/gems/celluloid-21b8c5bd5e65/lib/celluloid.rb:145:in `block in register_shutdown'
/home/niam/.gem/ruby/2.2.0/bundler/gems/ffi-rzmq-eba1101e44f1/lib/ffi-rzmq/util.rb:38:in `zmq_errno': Interrupt
from /home/niam/.gem/ruby/2.2.0/bundler/gems/ffi-rzmq-eba1101e44f1/lib/ffi-rzmq/util.rb:38:in `errno'
from /home/niam/.gem/ruby/2.2.0/bundler/gems/ffi-rzmq-eba1101e44f1/lib/ffi-rzmq/util.rb:45:in `error_string'
from /projects/celluloid/celluloid-zmq/lib/celluloid/zmq/waker.rb:30:in `block in signal'
from /projects/celluloid/celluloid-zmq/lib/celluloid/zmq/waker.rb:28:in `synchronize'
from /projects/celluloid/celluloid-zmq/lib/celluloid/zmq/waker.rb:28:in `signal'
from /home/niam/.gem/ruby/2.2.0/bundler/gems/celluloid-21b8c5bd5e65/lib/celluloid/evented_mailbox.rb:29:in `<<'
from /home/niam/.gem/ruby/2.2.0/bundler/gems/celluloid-21b8c5bd5e65/lib/celluloid/proxies/future_proxy.rb:30:in `method_missing'
from ./zmq-test-dealer.rb:44:in `block in <main>'
from ./zmq-test-dealer.rb:43:in `times'
from ./zmq-test-dealer.rb:43:in `<main>'
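Both interrupted stacks above pass through `signal` in waker.rb: once from the `future` calls in the dealer script and once from the shutdown path. As a toy illustration of how a flood of wake-ups could stall a sender, the sketch below assumes (this is not confirmed against the celluloid-zmq source) that the wake-up channel is bounded and is not drained while the flood is in progress. `ToyWaker`, `flood`, and the capacity of 1000 are hypothetical names and values built on the Ruby standard library only; none of this is the celluloid-zmq API.

```ruby
require "thread"

# Toy model only: NOT celluloid-zmq code. It stands in for a reactor
# wake-up channel that can hold a fixed number of pending signals.
class ToyWaker
  def initialize(capacity)
    @channel = SizedQueue.new(capacity)
  end

  # Try to deliver one wake-up without blocking.
  # Returns true on success, false if the channel is already full.
  def signal
    @channel.push(:wake, true) # second argument: non_block = true
    true
  rescue ThreadError
    false
  end

  # Consume all pending wake-ups, as a reactor loop would, and
  # return how many were pending.
  def drain
    count = 0
    loop do
      @channel.pop(true) # raises ThreadError once the queue is empty
      count += 1
    end
  rescue ThreadError
    count
  end
end

# Issue `requests` wake-ups back to back, stopping at the first one
# that cannot be delivered. Returns the number delivered.
def flood(waker, requests)
  delivered = 0
  requests.times do
    break unless waker.signal
    delivered += 1
  end
  delivered
end

waker = ToyWaker.new(1000) # assumed capacity, for illustration
delivered = flood(waker, 2000)
puts "delivered #{delivered} of 2000 wake-ups before the channel filled"
```

In this model a sender that blocked or retried on the failed `signal` instead of giving up would never make progress unless something else drained the channel. Whether that is what actually happens in the reactor is exactly what the linked changes would need to confirm.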
@tarcieri @digitalextremist I've created celluloid/celluloid#500 and celluloid/celluloid#501.
If those are considered a real fix, I'll close this issue.