neo4jrb / activegraph

An active model wrapper for the Neo4j Graph Database for Ruby.
http://neo4jrb.io
MIT License
1.4k stars 276 forks source link

Rules are not thread-safe #136

Closed dnagir closed 12 years ago

dnagir commented 12 years ago

I just got a failing spec:

  1) Neo4j::Node#rule rule node created from concurrent threads
     Failure/Error: Reader.all.to_a.should =~ readers
       expected collection contained:  [#<Reader:0x197a8fa @_java_node=#<#<Class:0x3a7d81>:0xdbb6e9>>]
       actual collection contained:    [#<Reader:0x61cf73 @_java_node=#<#<Class:0x3a7d81>:0xdae687>>, #<Reader:0x1bc1e8d @_java_node=#<#<Class:0x3a7d81>:0x154d7aa>>]
       the extra elements were:        [#<Reader:0x61cf73 @_java_node=#<#<Class:0x3a7d81>:0xdae687>>]
     # ./spec/rule/rule_spec.rb:94:in `(root)'

It hasn't happened before and it's not happening anymore. This is a clear sign of a race condition based on non-thread-safe implementation.

/cc @andreasronge

dnagir commented 12 years ago

I think it makes it easier to reproduce by increasing the number of threads (~50) in the spec.

dnagir commented 12 years ago

It actually gets even worse with higher concurency (hundreds):

ERROR in before commit hook undefined local variable or method `age' for #<#<Class:0xd20ad8>:0x661428>
/Users/dnagir/proj/neo4j/spec/rule/rule_spec.rb:8:in `Reader'
org/jruby/RubyBasicObject.java:1720:in `instance_eval'
/Users/dnagir/proj/neo4j/spec/../lib/neo4j/rule/rule.rb:56:in `execute_filter'
/Users/dnagir/proj/neo4j/spec/../lib/neo4j/rule/rule_node.rb:145:in `execute_rule'
/Users/dnagir/proj/neo4j/spec/../lib/neo4j/rule/rule_node.rb:133:in `execute_rules'
org/jruby/RubyArray.java:1612:in `each'
/Users/dnagir/proj/neo4j/spec/../lib/neo4j/rule/rule_node.rb:132:in `execute_rules'
/Users/dnagir/proj/neo4j/spec/../lib/neo4j/rule/rule.rb:120:in `trigger_rules'
/Users/dnagir/proj/neo4j/spec/../lib/neo4j/rule/event_listener.rb:17:in `on_property_changed'
/Users/dnagir/proj/neo4j/spec/../lib/neo4j/event_handler.rb:209:in `property_changed'
org/jruby/RubyArray.java:1612:in `each'
/Users/dnagir/proj/neo4j/spec/../lib/neo4j/event_handler.rb:209:in `property_changed'
/Users/dnagir/proj/neo4j/spec/../lib/neo4j/event_handler.rb:113:in `beforeCommit'
org/jruby/RubyProc.java:258:in `call'
/Users/dnagir/.rvm/rubies/jruby-1.6.5.1/lib/ruby/site_ruby/shared/builtin/java/java.util.rb:7:in `each'
/Users/dnagir/proj/neo4j/spec/../lib/neo4j/event_handler.rb:113:in `beforeCommit'
/Users/dnagir/proj/neo4j/spec/spec_helper.rb:36:in `finish_tx'
/Users/dnagir/proj/neo4j/spec/rule/rule_spec.rb:90:in `(root)'
Jan 17, 2012 10:01:09 AM org.neo4j.kernel.impl.transaction.xaframework.XaLogicalLog close
INFO: Close invoked with 91 running transaction(s). 
Jan 17, 2012 10:01:09 AM org.neo4j.kernel.impl.transaction.xaframework.XaLogicalLog close
INFO: Dirty log: /private/var/folders/yk/79fg_v953v16m0hk_hzfjfk00000gn/T/neo4j-rspec-tests/nioneo_logical.log.1 now closed. Recovery will be started automatically next time it is opened.
Jan 17, 2012 10:01:09 AM org.neo4j.kernel.impl.transaction.TxManager commit
SEVERE: Commit failed, status=STATUS_COMMITING, errorCode=0
javax.transaction.xa.XAException: Logical log unable to mark 1P-commit [48] 
    at org.neo4j.kernel.impl.transaction.xaframework.XaLogicalLog.commitOnePhase(XaLogicalLog.java:351)
    at org.neo4j.kernel.impl.transaction.xaframework.XaResourceManager.commit(XaResourceManager.java:435)
    at org.neo4j.kernel.impl.transaction.xaframework.XaResourceHelpImpl.commit(XaResourceHelpImpl.java:64)
    at org.neo4j.kernel.impl.transaction.TransactionImpl.doCommit(TransactionImpl.java:541)
    at org.neo4j.kernel.impl.transaction.TxManager.commit(TxManager.java:740)
    at org.neo4j.kernel.impl.transaction.TxManager.commit(TxManager.java:702)
    at org.neo4j.kernel.impl.transaction.TransactionImpl.commit(TransactionImpl.java:117)
    at org.neo4j.kernel.TopLevelTransaction.finish(TopLevelTransaction.java:118)
    at sun.reflect.GeneratedMethodAccessor14.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.jruby.javasupport.JavaMethod.invokeDirectWithExceptionHandling(JavaMethod.java:508)
    at org.jruby.javasupport.JavaMethod.invokeDirect(JavaMethod.java:368)
    at org.jruby.java.invokers.InstanceMethodInvoker.call(InstanceMethodInvoker.java:50)
    at org.jruby.runtime.callsite.CachingCallSite.call(CachingCallSite.java:133)
    at rubyjit.finish_tx_A5AAF15A7D09A388AA04107C4781783E6BDDD50C.__file__(/Users/dnagir/proj/neo4j/spec/spec_helper.rb:36)
    at rubyjit.finish_tx_A5AAF15A7D09A388AA04107C4781783E6BDDD50C.__file__(/Users/dnagir/proj/neo4j/spec/spec_helper.rb)
    at org.jruby.internal.runtime.methods.JittedMethod.call(JittedMethod.java:127)
    at org.jruby.runtime.callsite.CachingCallSite.call(CachingCallSite.java:133)
    at org.jruby.ast.VCallNode.interpret(VCallNode.java:86)
    at org.jruby.ast.NewlineNode.interpret(NewlineNode.java:104)
    at org.jruby.ast.BlockNode.interpret(BlockNode.java:71)
    at org.jruby.evaluator.ASTInterpreter.INTERPRET_BLOCK(ASTInterpreter.java:112)
    at org.jruby.runtime.Interpreted19Block.evalBlockBody(Interpreted19Block.java:203)
    at org.jruby.runtime.Interpreted19Block.yield(Interpreted19Block.java:191)
    at org.jruby.runtime.Interpreted19Block.call(Interpreted19Block.java:122)
    at org.jruby.runtime.Block.call(Block.java:89)
    at org.jruby.RubyProc.call(RubyProc.java:270)
    at org.jruby.RubyProc.call(RubyProc.java:224)
    at org.jruby.internal.runtime.RubyRunnable.run(RubyRunnable.java:95)
    at java.lang.Thread.run(Thread.java:680)
Caused by: java.nio.channels.ClosedChannelException
    at sun.nio.ch.FileChannelImpl.ensureOpen(FileChannelImpl.java:88)
    at sun.nio.ch.FileChannelImpl.write(FileChannelImpl.java:629)
    at org.neo4j.kernel.impl.transaction.xaframework.DirectMappedLogBuffer.writeOut(DirectMappedLogBuffer.java:154)
    at org.neo4j.kernel.impl.transaction.xaframework.DirectMappedLogBuffer.force(DirectMappedLogBuffer.java:161)
    at org.neo4j.kernel.impl.transaction.xaframework.ForceMode$1.force(ForceMode.java:31)
    at org.neo4j.kernel.impl.transaction.xaframework.XaLogicalLog.commitOnePhase(XaLogicalLog.java:346)
    ... 30 more
Jan 17, 2012 10:01:09 AM org.neo4j.kernel.impl.transaction.TxManager commit
SEVERE: Unable to rollback transaction. Some resources may be commited others not. Neo4j kernel should be SHUTDOWN for resource maintance and transaction recovery ---->
java.lang.IllegalStateException: Generator closed /private/var/folders/yk/79fg_v953v16m0hk_hzfjfk00000gn/T/neo4j-rspec-tests/neostore.propertystore.db.index.keys.id
    at org.neo4j.kernel.impl.nioneo.store.IdGeneratorImpl.freeId(IdGeneratorImpl.java:285)
    at org.neo4j.kernel.impl.nioneo.store.CommonAbstractStore.freeId(CommonAbstractStore.java:402)
    at org.neo4j.kernel.impl.nioneo.store.AbstractDynamicStore.freeBlockId(AbstractDynamicStore.java:223)
    at org.neo4j.kernel.impl.nioneo.store.AbstractNameStore.freeBlockId(AbstractNameStore.java:99)
    at org.neo4j.kernel.impl.nioneo.xa.WriteTransaction.doRollback(WriteTransaction.java:331)
    at org.neo4j.kernel.impl.transaction.xaframework.XaTransaction.rollback(XaTransaction.java:278)
    at org.neo4j.kernel.impl.transaction.xaframework.XaResourceManager.rollback(XaResourceManager.java:513)
    at org.neo4j.kernel.impl.transaction.xaframework.XaResourceHelpImpl.rollback(XaResourceHelpImpl.java:111)
    at org.neo4j.kernel.impl.transaction.TransactionImpl.doRollback(TransactionImpl.java:558)
    at org.neo4j.kernel.impl.transaction.TxManager.commit(TxManager.java:765)
    at org.neo4j.kernel.impl.transaction.TxManager.commit(TxManager.java:702)
    at org.neo4j.kernel.impl.transaction.TransactionImpl.commit(TransactionImpl.java:117)
    at org.neo4j.kernel.TopLevelTransaction.finish(TopLevelTransaction.java:118)
    at sun.reflect.GeneratedMethodAccessor14.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.jruby.javasupport.JavaMethod.invokeDirectWithExceptionHandling(JavaMethod.java:508)
    at org.jruby.javasupport.JavaMethod.invokeDirect(JavaMethod.java:368)
    at org.jruby.java.invokers.InstanceMethodInvoker.call(InstanceMethodInvoker.java:50)
    at org.jruby.runtime.callsite.CachingCallSite.call(CachingCallSite.java:133)
    at rubyjit.finish_tx_A5AAF15A7D09A388AA04107C4781783E6BDDD50C.__file__(/Users/dnagir/proj/neo4j/spec/spec_helper.rb:36)
    at rubyjit.finish_tx_A5AAF15A7D09A388AA04107C4781783E6BDDD50C.__file__(/Users/dnagir/proj/neo4j/spec/spec_helper.rb)
    at org.jruby.internal.runtime.methods.JittedMethod.call(JittedMethod.java:127)
    at org.jruby.runtime.callsite.CachingCallSite.call(CachingCallSite.java:133)
    at org.jruby.ast.VCallNode.interpret(VCallNode.java:86)
    at org.jruby.ast.NewlineNode.interpret(NewlineNode.java:104)
    at org.jruby.ast.BlockNode.interpret(BlockNode.java:71)
    at org.jruby.evaluator.ASTInterpreter.INTERPRET_BLOCK(ASTInterpreter.java:112)
    at org.jruby.runtime.Interpreted19Block.evalBlockBody(Interpreted19Block.java:203)
    at org.jruby.runtime.Interpreted19Block.yield(Interpreted19Block.java:191)
    at org.jruby.runtime.Interpreted19Block.call(Interpreted19Block.java:122)
    at org.jruby.runtime.Block.call(Block.java:89)
    at org.jruby.RubyProc.call(RubyProc.java:270)
    at org.jruby.RubyProc.call(RubyProc.java:224)
    at org.jruby.internal.runtime.RubyRunnable.run(RubyRunnable.java:95)
    at java.lang.Thread.run(Thread.java:680)
andreasronge commented 12 years ago

Interesting, have not seen that stack trace before. Actually the rspec (the shared state readers) itself is not thread safe, see http://www.engineyard.com/blog/2011/concurrency-in-jruby/ But there seems to be a concurrency problem with Rules, I will investigate

andreasronge commented 12 years ago

I've tried to reproduce this running it 1000 times, and it still works:

 1000.times do
      size = Reader.all.size
      threads = 50.times.collect do
        Thread.new do
          Neo4j.threadlocal_ref_node = nil
          tx = Neo4j::Transaction.new
          Reader.new(:age => 2)
          tx.success
          tx.finish
        end
      end
      threads.each(&:join)
      size += 50
      Reader.all.size.should == size
end