mendicant-original / mentoring

9 stars 1 forks source link

Mentoring Request - Threadsafing #17

Closed mikbe closed 13 years ago

mikbe commented 13 years ago

Name: Mike Bethany
email: mikbe.tk (.-at-.) gmail.com
Availability: 1pm - 9pm EST Mon-Fri, 8am - 10pm EST Sat-Sun
Issue: Threadsafing a mixin module without using self.new or initialize.
Project: Eventable - https://github.com/mikbe/eventable Background: https://github.com/mikbe/eventable/issues/4

Problem Description: I need to threadsafe access to an instance variable, an array. To do so I've wrapped all methods that access the array in a mutex.synchronize block. The problem is I'm creating the mutex in each function using @eventable_mutex ||= Mutex.new which could cause collisions if two threads try to use a method that creates the mutex simultaneously.

I've verified this could happen: https://gist.github.com/1031308

Since this is a mixin module I can't use initialize to create the mutex instance variable and I don't want to use trapself.new because if the class uses this method itself then the instance variable will never be created. My aversion to using self.new to create the instance variable is this: if I redefine a method to fix an issue it's not unreasonable for someone else to do the same thing. If their doing exactly what I did breaks what I did then it's a bad fix. (see https://github.com/mikbe/eventable/pull/3)

So, my current idea for a solution is to create a class level mutex for the inheriting class that locks on the creation of the mutex, for instance: https://gist.github.com/1031419. This seems ugly though and I'm fairly new to Ruby so I'm wondering if I've missed, or simply don't know about, something that would be a better solution.

semmons99 commented 13 years ago

@mikbe have you posted anything about this on the ruby-talk mailing list? I'll poll our alumni and see if they have any insights.

mikbe commented 13 years ago

Nope, didn't ask in Ruby-Talk, didn't even know about it. Is this the one you're talking about? http://www.ruby-lang.org/en/community/mailing-lists/

Just signed up, I'll post there too.

semmons99 commented 13 years ago

Yes. You can also try posting to the RbMU mailing list here.

phiggins commented 13 years ago

In your module, you can define a new #initialize method and alias it so that it gets called instead of the original one. I don't know if this is necessarily the only/best way to do this, just threw something together as a quick example:

module M
  def self.included base
    base.class_eval do
      alias module_m_original_initialize initialize

      def module_m_new_initialize *args, &block
        module_m_original_initialize *args, &block
        @__variable = "foo"
      end

      alias initialize module_m_new_initialize
    end

    def variable
      @__variable
    end
  end
end

class C
  include M
end

p C.new.variable # => "foo"
mikbe commented 13 years ago

Thanks for the try but that doesn't work. When self.included is called nothing else in the including class has happened so the alias doesn't really do anything. The only way I could get it to work is if I could get people to do non-standard behavior and put the include at the bottom of the class. Which of course is not going to happen.

module M
  def self.included base
    puts "module M included"
    base.class_eval do
      alias module_m_original_initialize initialize

      def module_m_new_initialize *args, &block
        module_m_original_initialize *args, &block
        @__variable = "foo"
      end

      alias initialize module_m_new_initialize
    end

    def variable
      @__variable
    end
  end
end

class C
  include M

  def initialize
    puts "class C initialized"
  end

end

p C.new.variable # => nil
afcapel commented 13 years ago

If you need to store the state of a module, remember your module is also an object and you can define attributes on it. For example:

module ThreadSafe

  class << self
    attr_accessor :mutex
  end

  def self.included(base)
    puts "ThreadSafe included"
    ThreadSafe.mutex ||= 'this is a mutex'
  end

  def thread_safe_operation
    puts "calling thread safe operation"

   # Do whatever you want with the mutex
    puts ThreadSafe.mutex
  end
end

class Example
  include ThreadSafe
end

Example.new.thread_safe_operation
mikbe commented 13 years ago

Thanks for the reply. I don't need to store the state of a module I need to store some state information in an inheriting class's instance and I want to make access to that data threadsafe.

I especially don't want to block all instances of all classes that mixin the module using one mutex which unfortunately what your example would do. Expanding on your example.

class Example2
  include ThreadSafe
end

e = Example.new
e2 = Example2.new

e.thread_safe_operation
e2.thread_safe_operation

# => ThreadSafe included
# => ThreadSafe included
# => calling thread safe operation
# => #<Mutex:0x00000100861630>
# => calling thread safe operation
# => #<Mutex:0x00000100861630> <= Same mutex for different instances of different classes

I don't even like what I'm thinking about doing, blocking on the creation of an instance mutex per class. Thanks for the feedback though.

phiggins commented 13 years ago

Dalli, the memcache client, solves this problem in a similar way to @afcapel's solution above: https://github.com/mperham/dalli/blob/d332464e55a84d63fa6edbaa1c6935125ed3cce6/lib/dalli/options.rb

However, that requires the user to perform an extra step, something like:

dalli = Dalli::Server.new
dalli.extend(Dalli::Threadsafe)

@mikbe: I've run into this problem myself in the past and unfortunately I don't think there is a perfect solution. I suggest posting on ruby-talk and see what the experts have to say. I look forward to seeing the discussion!

mikbe commented 13 years ago

@phiggins Cool, thanks for the link, I'll check out their implementation.

For me an ugly behind-the-scenes implementation wins every time over making the developer use an extra step. Especially if that extra step could mean a critical failure if they forget it. Since events are pretty much pointless without threading it's imperative they be threadsafe so I'm not keen on forcing the user-developer to use an extra step every time they use the library.

I greatly appreciate your feedback though, thanks.

mikbe commented 13 years ago

Robert Klemme showed me you can use initialize in a module, I'm still learning, the inheriting class just has to call super if it has its own init method. I have no problem with the extra step here because if the user-developer doesn't use the extra step then there will be an error raised; they can't forget it and have a heisenbug, it just won't work.

 module Foo

  def initialize(*args, &block)
    super
    @mutex = Mutex.new
  end

  def do_stuff
    @mutex.synchronize { puts "you've got mail! yay!"}
  end

end

class Bar
  include Foo

  def initialize
    super
  end

end

class Baz
  include Foo

  def initialize
    puts "I hate mail"
  end

end

b = Bar.new
b.do_stuff

z = Baz.new
z.do_stuff # <= raises error