celluloid / celluloid-io

UNMAINTAINED: See celluloid/celluloid#779 - Evented sockets for Celluloid actors
https://celluloid.io
MIT License

Deadlock race condition when linking to actors after opening Socket on JRuby #91

Open jnicklas opened 10 years ago

jnicklas commented 10 years ago

We ran this against Celluloid::IO master as well as 0.15.0 and 0.14.1, and it fails reliably on all of them. When an actor links to another actor after opening a Celluloid::IO::TCPSocket, the link sometimes seems to deadlock.

Using a regular TCPSocket fixes the problem, as does making a synchronous call to another actor after opening the socket. Linking the other actor before opening the socket is an easy workaround for the bug.

We needed a thread-safe counter, so we use Atomic for that. The bug is reproducible without it.
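
For anyone who wants to run the script without the atomic gem, a counter along these lines might serve as a stand-in. This is only a sketch: `MutexCounter` is a hypothetical name, not part of Celluloid or the atomic gem, and it only mimics the two calls the script uses (`update` with a block, and `value`).

```ruby
# Hypothetical stand-in for Atomic: a Mutex-guarded counter exposing the
# same two calls the repro script uses (`update` with a block, and `value`).
class MutexCounter
  def initialize(initial = 0)
    @value = initial
    @lock = Mutex.new
  end

  # Replace the stored value with the block's result while holding the lock.
  # Returns the new value.
  def update
    @lock.synchronize { @value = yield(@value) }
  end

  # Read the current value while holding the lock.
  def value
    @lock.synchronize { @value }
  end
end

# Quick check with plain threads: 10 threads, 1000 increments each.
counter = MutexCounter.new(0)
threads = 10.times.map do
  Thread.new { 1000.times { counter.update { |x| x + 1 } } }
end
threads.each(&:join)
puts counter.value
```

Under that assumption, replacing `Atomic.new(0)` with `MutexCounter.new(0)` in the script should leave the rest unchanged.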

It fails on JRuby 1.7.6 and 1.7.4; it seems to succeed on MRI.

require "celluloid/io"
require "atomic"

class A
  include Celluloid::IO

  class B
    include Celluloid
    def foo; end
  end

  def listen
    # B.new_link # this works!
    # Inside a class that includes Celluloid::IO, TCPSocket resolves to Celluloid::IO::TCPSocket
    socket = TCPSocket.new("www.google.com", 80)
    $reference.update { |x| x + 1 }

    # B.new.foo # for some reason, this solves the problem too
    B.new_link # this does not work!
    $counter.update { |x| x + 1 }
  end
end

$counter = Atomic.new(0)
$reference = Atomic.new(0)
100.times { A.new.async.listen }

puts "Wait 5 seconds."
sleep(5)
puts "Total: #{$counter.value} of #{$reference.value}"

Expected output:

Wait 5 seconds.
Total: 100 of 100

Actual output:

Wait 5 seconds.
Total: 33 of 100

/cc @Burgestrand

halorgium commented 10 years ago

Reproducible by me. Pushed it into a gist.