Server doesn't crash if actor crashes

jjb commented 8 years ago

Given this code:

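require 'rubydns'

# Listen addresses inferred from the log output below.
INTERFACES = [[:udp, "0.0.0.0", 5300], [:tcp, "0.0.0.0", 5300]]

RubyDNS::run_server(:listen => INTERFACES) do
  otherwise do |transaction|
    transaction.respond!("1.2.3.4")
  end
end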

and this query:

nslookup -type=cname www.example.com

This error happens:

I, [2016-08-12T17:08:17.284440 #2669]  INFO -- : Starting RubyDNS server (v1.0.3)...
I, [2016-08-12T17:08:17.284762 #2669]  INFO -- : <> Listening on udp:0.0.0.0:5300
I, [2016-08-12T17:08:17.285454 #2669]  INFO -- : <> Listening on tcp:0.0.0.0:5300
D, [2016-08-12T17:08:25.188498 #2669] DEBUG -- : <> Receiving incoming query (33 bytes) to RubyDNS::UDPHandler...
D, [2016-08-12T17:08:25.188775 #2669] DEBUG -- : <7773> Processing question www.example.com Resolv::DNS::Resource::IN::CNAME...
D, [2016-08-12T17:08:25.188804 #2669] DEBUG -- : <7773> Searching for www.example.com Resolv::DNS::Resource::IN::CNAME
I, [2016-08-12T17:08:25.188821 #2669]  INFO -- : Resource class: Resolv::DNS::Resource::IN::CNAME
I, [2016-08-12T17:08:25.188844 #2669]  INFO -- : Resource: #<Resolv::DNS::Resource::IN::CNAME:0x00000002149888 @name="1.2.3.4">
D, [2016-08-12T17:08:25.188869 #2669] DEBUG -- : add_answer: #<Resolv::DNS::Resource::IN::CNAME:0x00000002149888 @name="1.2.3.4"> 5 1
D, [2016-08-12T17:08:25.188913 #2669] DEBUG -- : <7773> Time to process request: 0.000283347s
E, [2016-08-12T17:08:25.189063 #2669] ERROR -- : Actor crashed!
NoMethodError: undefined method `to_a' for "1.2.3.4":String
  /usr/lib/ruby/2.3.0/resolv.rb:1469:in `put_name'
  /usr/lib/ruby/2.3.0/resolv.rb:1769:in `encode_rdata'
  /usr/lib/ruby/2.3.0/resolv.rb:1423:in `block (4 levels) in encode'
  /usr/lib/ruby/2.3.0/resolv.rb:1452:in `put_length16'
  /usr/lib/ruby/2.3.0/resolv.rb:1423:in `block (3 levels) in encode'
  /usr/lib/ruby/2.3.0/resolv.rb:1419:in `each'
  /usr/lib/ruby/2.3.0/resolv.rb:1419:in `block (2 levels) in encode'
  /usr/lib/ruby/2.3.0/resolv.rb:1418:in `each'
  /usr/lib/ruby/2.3.0/resolv.rb:1418:in `block in encode'
  /usr/lib/ruby/2.3.0/resolv.rb:1433:in `initialize'
  /usr/lib/ruby/2.3.0/resolv.rb:1399:in `new'
  /usr/lib/ruby/2.3.0/resolv.rb:1399:in `encode'
  /var/lib/gems/2.3.0/gems/rubydns-1.0.3/lib/rubydns/handler.rb:89:in `respond'
  /var/lib/gems/2.3.0/gems/celluloid-0.16.0/lib/celluloid/calls.rb:26:in `public_send'
  /var/lib/gems/2.3.0/gems/celluloid-0.16.0/lib/celluloid/calls.rb:26:in `dispatch'
  /var/lib/gems/2.3.0/gems/celluloid-0.16.0/lib/celluloid/calls.rb:122:in `dispatch'
  /var/lib/gems/2.3.0/gems/celluloid-0.16.0/lib/celluloid/cell.rb:60:in `block in invoke'
  /var/lib/gems/2.3.0/gems/celluloid-0.16.0/lib/celluloid/cell.rb:71:in `block in task'
  /var/lib/gems/2.3.0/gems/celluloid-0.16.0/lib/celluloid/actor.rb:357:in `block in task'
  /var/lib/gems/2.3.0/gems/celluloid-0.16.0/lib/celluloid/tasks.rb:57:in `block in initialize'
  /var/lib/gems/2.3.0/gems/celluloid-0.16.0/lib/celluloid/tasks/task_fiber.rb:15:in `block in create'
W, [2016-08-12T17:08:25.189276 #2669]  WARN -- : Terminating task: type=:call, meta={:method_name=>:run}, status=:iowait
  Celluloid::TaskFiber backtrace unavailable. Please try `Celluloid.task_class = Celluloid::TaskThread` if you need backtraces here.
E, [2016-08-12T17:08:25.189326 #2669] ERROR -- : Actor crashed!
NoMethodError: undefined method `to_a' for "1.2.3.4":String
  /usr/lib/ruby/2.3.0/resolv.rb:1469:in `put_name'
  /usr/lib/ruby/2.3.0/resolv.rb:1769:in `encode_rdata'
  /usr/lib/ruby/2.3.0/resolv.rb:1423:in `block (4 levels) in encode'
  /usr/lib/ruby/2.3.0/resolv.rb:1452:in `put_length16'
  /usr/lib/ruby/2.3.0/resolv.rb:1423:in `block (3 levels) in encode'
  /usr/lib/ruby/2.3.0/resolv.rb:1419:in `each'
  /usr/lib/ruby/2.3.0/resolv.rb:1419:in `block (2 levels) in encode'
  /usr/lib/ruby/2.3.0/resolv.rb:1418:in `each'
  /usr/lib/ruby/2.3.0/resolv.rb:1418:in `block in encode'
  /usr/lib/ruby/2.3.0/resolv.rb:1433:in `initialize'
  /usr/lib/ruby/2.3.0/resolv.rb:1399:in `new'
  /usr/lib/ruby/2.3.0/resolv.rb:1399:in `encode'
  /var/lib/gems/2.3.0/gems/rubydns-1.0.3/lib/rubydns/handler.rb:89:in `respond'
  /var/lib/gems/2.3.0/gems/celluloid-0.16.0/lib/celluloid/calls.rb:26:in `public_send'
  /var/lib/gems/2.3.0/gems/celluloid-0.16.0/lib/celluloid/calls.rb:26:in `dispatch'
  /var/lib/gems/2.3.0/gems/celluloid-0.16.0/lib/celluloid/calls.rb:122:in `dispatch'
  /var/lib/gems/2.3.0/gems/celluloid-0.16.0/lib/celluloid/cell.rb:60:in `block in invoke'
  /var/lib/gems/2.3.0/gems/celluloid-0.16.0/lib/celluloid/cell.rb:71:in `block in task'
  /var/lib/gems/2.3.0/gems/celluloid-0.16.0/lib/celluloid/actor.rb:357:in `block in task'
  /var/lib/gems/2.3.0/gems/celluloid-0.16.0/lib/celluloid/tasks.rb:57:in `block in initialize'
  /var/lib/gems/2.3.0/gems/celluloid-0.16.0/lib/celluloid/tasks/task_fiber.rb:15:in `block in create'
E, [2016-08-12T17:08:25.192706 #2669] ERROR -- : Actor crashed!
NoMethodError: undefined method `to_a' for "1.2.3.4":String
  /usr/lib/ruby/2.3.0/resolv.rb:1469:in `put_name'
  /usr/lib/ruby/2.3.0/resolv.rb:1769:in `encode_rdata'
  /usr/lib/ruby/2.3.0/resolv.rb:1423:in `block (4 levels) in encode'
  /usr/lib/ruby/2.3.0/resolv.rb:1452:in `put_length16'
  /usr/lib/ruby/2.3.0/resolv.rb:1423:in `block (3 levels) in encode'
  /usr/lib/ruby/2.3.0/resolv.rb:1419:in `each'
  /usr/lib/ruby/2.3.0/resolv.rb:1419:in `block (2 levels) in encode'
  /usr/lib/ruby/2.3.0/resolv.rb:1418:in `each'
  /usr/lib/ruby/2.3.0/resolv.rb:1418:in `block in encode'
  /usr/lib/ruby/2.3.0/resolv.rb:1433:in `initialize'
  /usr/lib/ruby/2.3.0/resolv.rb:1399:in `new'
  /usr/lib/ruby/2.3.0/resolv.rb:1399:in `encode'
  /var/lib/gems/2.3.0/gems/rubydns-1.0.3/lib/rubydns/handler.rb:89:in `respond'
  /var/lib/gems/2.3.0/gems/celluloid-0.16.0/lib/celluloid/calls.rb:26:in `public_send'
  /var/lib/gems/2.3.0/gems/celluloid-0.16.0/lib/celluloid/calls.rb:26:in `dispatch'
  /var/lib/gems/2.3.0/gems/celluloid-0.16.0/lib/celluloid/calls.rb:122:in `dispatch'
  /var/lib/gems/2.3.0/gems/celluloid-0.16.0/lib/celluloid/cell.rb:60:in `block in invoke'
  /var/lib/gems/2.3.0/gems/celluloid-0.16.0/lib/celluloid/cell.rb:71:in `block in task'
  /var/lib/gems/2.3.0/gems/celluloid-0.16.0/lib/celluloid/actor.rb:357:in `block in task'
  /var/lib/gems/2.3.0/gems/celluloid-0.16.0/lib/celluloid/tasks.rb:57:in `block in initialize'
  /var/lib/gems/2.3.0/gems/celluloid-0.16.0/lib/celluloid/tasks/task_fiber.rb:15:in `block in create'
W, [2016-08-12T17:08:25.195680 #2669]  WARN -- : Terminating task: type=:call, meta={:method_name=>:run}, status=:iowait
  Celluloid::TaskFiber backtrace unavailable. Please try `Celluloid.task_class = Celluloid::TaskThread` if you need backtraces here.

The actors are permanently crashed, but the error is not raised to the main Ruby process, so that process keeps running yet cannot handle any further requests.

Is there a way to have crashed actors result in an exception raised to the outer process?

I'm using rubydns, but I figure this is a celluloid-dns problem. Let me know if that's not the case.
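
For reference, the NoMethodError in the trace above can apparently be reproduced with Ruby's bundled resolv library alone, without RubyDNS or Celluloid. The following is a minimal sketch, assuming the failure comes from a CNAME record holding a plain String where resolv expects a Resolv::DNS::Name. The helper name encode_cname, the message id 0, and the TTL of 5 are arbitrary choices for illustration, not taken from RubyDNS.

```ruby
require 'resolv'

# Build a one-answer DNS response and encode it to wire format.
# `target` is the CNAME record's value; id 0 and ttl 5 are arbitrary.
def encode_cname(target)
  message = Resolv::DNS::Message.new(0)
  record = Resolv::DNS::Resource::IN::CNAME.new(target)
  message.add_answer("www.example.com", 5, record)
  message.encode
end

# A plain String fails while resolv encodes the record's name.
begin
  encode_cname("1.2.3.4")
rescue NoMethodError => e
  puts "String target: #{e.class}"
end

# Wrapping the value in a Resolv::DNS::Name lets the message encode.
bytes = encode_cname(Resolv::DNS::Name.create("1.2.3.4"))
puts "Name target: encoded #{bytes.bytesize} bytes"
```

This only shows where the exception seems to originate. Whether a CNAME answer pointing at 1.2.3.4 is sensible, and why the crash is not propagated to the main process, are separate questions.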

jjb commented 8 years ago

@ioquatix @tarcieri thoughts?

eterry1388 commented 8 years ago

I have run into this too. My workaround was to rescue Celluloid::DeadActorError and kill the process myself; systemd would then start it back up for me. Otherwise you end up in an endless loop of "Actor crashed".

jjb commented 8 years ago

@eterry1388 Thank you so much for chiming in. Could you provide some example code showing your technique?

eterry1388 commented 8 years ago

It's more of a hack than a solution. Here is some code:

require 'celluloid/dns'

class TestServer < Celluloid::DNS::Server
  # Forward every query to an upstream resolver (8.8.8.8).
  def process( name, resource_class, transaction )
    @resolver ||= Celluloid::DNS::Resolver.new( [[:udp, '8.8.8.8', 53], [:tcp, '8.8.8.8', 53]] )
    transaction.passthrough!( @resolver )
  rescue Celluloid::DeadActorError => e
    puts 'Hit exception! Celluloid::DeadActorError. Killing process...'
    puts name
    puts e
    puts e.backtrace
    # Note: this kills every process named ruby, not only this server.
    `killall -9 ruby`
  end
end

server = TestServer.new( listen: [[:udp, '0.0.0.0', 53]] )
server.run

# Block the main thread so the process stays up.
sleep

After reading this again, it looks like he was using rubydns rather than celluloid-dns. Should I open a new bug for my issue?
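
A narrower variant of the hack above might exit only the current process instead of running killall against every ruby process. The following is a minimal sketch in plain Ruby, not part of celluloid-dns: FatalWorkerError is a hypothetical stand-in for Celluloid::DeadActorError so the example runs without the gem, run_or_exit is a hypothetical helper, and exit status 1 is an assumption about what the service manager treats as a failure.

```ruby
# Hypothetical stand-in for Celluloid::DeadActorError so the sketch runs
# without the celluloid gem.
class FatalWorkerError < StandardError; end

# Hypothetical helper: run the block and, if it raises the fatal error,
# log it and terminate only this process with a non-zero status.
# `exiter` is injectable for demonstration; by default it calls Kernel#exit!,
# which exits immediately without running at_exit handlers.
def run_or_exit(exiter: ->(status) { exit!(status) })
  yield
rescue FatalWorkerError => e
  warn "Fatal worker error: #{e.message}. Exiting so the supervisor can restart."
  exiter.call(1)
end
```

Inside a handler such as process, the body would be wrapped as run_or_exit { transaction.passthrough!( @resolver ) }, with the rescue clause matching Celluloid::DeadActorError instead of the stand-in class.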

jjb commented 8 years ago

@eterry1388 I am he. rubydns is implemented with celluloid-dns, and I think they share the identical problem.

ioquatix commented 8 years ago

Sorry guys, I missed all this for some reason. I'll take a look.

celluloid / celluloid-dns

Server doesn't crash if actor crashes #12