celluloid / celluloid-redis

UNMAINTAINED: See celluloid/celluloid#779
MIT License

deadlock due to apparent interaction with sleep #13

Open jhoblitt opened 10 years ago

jhoblitt commented 10 years ago

As a disclaimer, I've never worked with celluloid/celluloid-io before today, but I can't find any mention of sleep being considered unsafe inside an actor. Everything was going well with my experiments until I started encountering strange deadlocks. I managed to reduce the problem to this minimal test case, which seems to show an odd interaction between sleep/timers and a redis method invocation in the same method, but only if the redis method is called before the sleep.

#!/usr/bin/env ruby

require 'celluloid/io'
require 'celluloid/redis'
require 'redis/connection/celluloid'

class RedisTest
  include Celluloid::IO

  def initialize
    @redis = ::Redis.new
  end

  def lookup_redis
    puts "redis lookup"
    @redis.lindex 'fooq', 0
    puts "redis returned\n\n"
  end

  def yawn
    puts "going to sleep"
    sleep 1
    puts "wokeup\n\n"
  end

  def yawn_and_lookup_redis
    puts "going to sleep"
    sleep 1
    puts "wokeup"
    puts "redis lookup"
    record = @redis.lindex 'fooq', 0
    puts "redis returned\n\n"
  end

  def lookup_redis_and_yawn
    puts "redis lookup"
    record = @redis.lindex 'fooq', 0
    puts "redis returned"
    puts "going to sleep"
    sleep 1
    puts "wokeup\n\n"
  end

end

rt = RedisTest.new
rt.lookup_redis
rt.yawn
rt.yawn_and_lookup_redis
rt.lookup_redis_and_yawn

The output I get is:

redis lookup
redis returned

going to sleep
wokeup

going to sleep
wokeup
redis lookup
redis returned

redis lookup
redis returned
going to sleep
<hangs>

If I comment out the line require 'redis/connection/celluloid', so that redis uses its default connection driver, the script exits normally with the expected output.

tarcieri commented 10 years ago

Celluloid provides its own sleep that interacts with its internal timer system.

That said, this is the likely culprit:

https://github.com/celluloid/celluloid-io/issues/56
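
To make "its own sleep that interacts with its internal timer system" a bit more concrete, here is a toy sketch using only Ruby's standard library. It is not Celluloid's implementation, and ToyScheduler and all of its methods are hypothetical names invented for illustration. The idea it models is that a cooperative sleep registers a timer and suspends the calling fiber instead of blocking the thread, so the sleeper only wakes up if the scheduler loop gets around to firing that timer.

```ruby
require 'fiber'

# Toy model only (hypothetical, NOT Celluloid's code): a fiber scheduler
# whose sleep registers a timer instead of blocking the whole thread.
class ToyScheduler
  def initialize
    @timers = [] # [wake_time, fiber] pairs waiting on a timer
    @ready  = [] # fibers that can run right now
  end

  def spawn(&block)
    @ready << Fiber.new(&block)
  end

  # Cooperative sleep: record a wake-up time, then hand control back
  # to the scheduler loop until the timer fires.
  def sleep(seconds)
    @timers << [now + seconds, Fiber.current]
    Fiber.yield
  end

  def run
    until @ready.empty? && @timers.empty?
      # Run every fiber that is ready before waiting on any timer.
      @ready.shift.resume until @ready.empty?
      next if @timers.empty?

      # Wait for the earliest timer, then mark its fiber as ready.
      @timers.sort_by!(&:first)
      wake_at, fiber = @timers.shift
      delay = wake_at - now
      Kernel.sleep(delay) if delay > 0
      @ready << fiber
    end
  end

  private

  def now
    Process.clock_gettime(Process::CLOCK_MONOTONIC)
  end
end

log = []
sched = ToyScheduler.new
sched.spawn do
  log << "a: going to sleep"
  sched.sleep(0.02)
  log << "a: woke up"
end
sched.spawn { log << "b: runs while a sleeps" }
sched.run
puts log
```

In this model, if the loop in run were stuck waiting on something else and never reached the timer step, fiber "a" would stay suspended forever after "going to sleep", which is the same shape as the hang in the report. Whether that is what actually happens in the linked issue is not something this sketch establishes.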

jhoblitt commented 10 years ago

Well, that's a bummer but a bit of a relief. I've been having flashbacks of debugging pthreads + signals for the last 4 hours.

tarcieri commented 10 years ago

It's something we'd like to get fixed really soon, but it requires some changes to the design of the Mailbox API, which is why it's still not fixed.

@halorgium we should get this fixed really soon now! :soon: