neo4jrb / activegraph

An active model wrapper for the Neo4j Graph Database for Ruby.
http://neo4jrb.io
MIT License

Possible issue with thread safety? #1620

Open mperice opened 3 years ago

mperice commented 3 years ago

We use a Sidekiq process with 3 threads to synchronize data between PostgreSQL and Neo4j. Until we migrated to v10 this worked flawlessly, but now, whenever we perform multiple concurrent 'merge' writes (to the same table?), the connection hangs indefinitely. If I run the same code with 3 Sidekiq processes, each with a single thread, it works as expected.

I have tried to put together a small code sample to help debug this issue. This is what I came up with:

Code example (inline, gist, or repo)

This works:

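require 'parallel' # parallel gem; JobOpening is the application's own node model

# 50 worker processes, each with its own driver and connection
Parallel.each(1..3000, in_processes: 50) do |_|
  puts JobOpening.find_or_create({psql_id: "fe1fb5be-69ca-47a9-813e-4496b6790035"}, {title: "Route Driver for Vending Company"})
end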

While this hangs most of the time:

# 50 threads inside a single process, all sharing one driver
Parallel.each(1..3000, in_threads: 50) do |_|
  puts JobOpening.find_or_create({psql_id: "fe1fb5be-69ca-47a9-813e-4496b6790035"}, {title: "Route Driver for Vending Company"})
end

As a side note, I had no problem running both code samples with neo4jrb v9.6 and Neo4j 3.5.19.

Runtime information:

Neo4j database version: 4.0.7
neo4j gem version: 10.0.1
neo4j-ruby-driver gem version: 1.7.0
seabolt: compiled from source, Ubuntu 20.04

If you need more information feel free to ask. Love your work by the way!

efivash commented 3 years ago

I ran into this as well on macOS. It appears to happen when Neo4j::Driver::DirectConnectionProvider#acquire_connection is called from multiple threads at the same time, since the threads share the FFI seabolt connector object.

Although ActiveGraph appears to keep explicit_session and tx in thread-specific variables, ActiveGraphTransactions#send_transaction calls driver.session, which grabs a memoized driver that is shared between threads. When the driver builds a session for each thread, it does so with the same session factory that was created in the shared driver, and that factory hands the same connection_provider to every session.

I found that the following two methods hang consistently. The second one follows the path ActiveGraphTransactions takes when sending a transaction if tx and explicit_session are both nil or not open (as they will be in a new thread).
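The sharing described above can be illustrated with a small standalone sketch. Everything in it is a hypothetical stand-in: FakeDriver, FakeSession, FakeConnectionProvider and FakeBase are invented for illustration and are not the real ActiveGraph or neo4j-ruby-driver classes. It only shows how a memoized driver ends up handing one provider object to sessions created in different threads; it does not reproduce the hang itself.

```ruby
# Hypothetical stand-ins, NOT the real ActiveGraph / neo4j-ruby-driver classes.
FakeConnectionProvider = Class.new
FakeSession = Struct.new(:connection_provider)

class FakeDriver
  def initialize
    # One provider is created per driver instance.
    @connection_provider = FakeConnectionProvider.new
  end

  # Every session built by this driver receives the same provider object.
  def session
    FakeSession.new(@connection_provider)
  end
end

module FakeBase
  @lock = Mutex.new

  # Memoized at module level, so all threads see the same driver.
  def self.driver
    @lock.synchronize { @driver ||= FakeDriver.new }
  end
end

# Each thread builds its own session from the shared, memoized driver.
sessions = Array.new(5) { Thread.new { FakeBase.driver.session } }.map(&:value)

distinct_sessions = sessions.map(&:object_id).uniq.size
distinct_providers = sessions.map { |s| s.connection_provider.object_id }.uniq.size
puts "distinct sessions: #{distinct_sessions}, distinct providers: #{distinct_providers}"
# prints "distinct sessions: 5, distinct providers: 1"
```

In this toy model the sessions are per-thread, but any state held by the single provider is reachable from every thread, which is the kind of sharing that would need to be thread-safe in the real driver.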

def hang1
  threads = (1..10).map do |_n|
    Thread.new do
      (1..100).each do |_m|
        ActiveGraph::Base.read_transaction {}
      end
    end
  end
  threads.each(&:join)
end

def hang2
  threads = (1..10).map do |_n|
    Thread.new do
      (1..100).each do |_m|
        ActiveGraph::Base.driver.session.send(:acquire_connection, Neo4j::Driver::AccessMode::READ)
      end
    end
  end
  threads.each(&:join)
end
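Because both reproducers block forever when the hang occurs, it can help to run them under a deadline so the failure is reported instead of freezing the console. The following is a minimal sketch, not part of ActiveGraph or the driver: hung_threads is a hypothetical helper name, and it relies only on Ruby's standard Thread#join(limit), which returns nil when the thread has not finished within the limit. One assumption to be aware of: Thread#kill may not be able to interrupt a thread that is blocked inside a native (FFI) call, so the cleanup step is best-effort.

```ruby
# Hypothetical helper: runs `work` in `thread_count` threads and returns how
# many of them were still running once `timeout` seconds had elapsed.
def hung_threads(thread_count:, timeout:, &work)
  threads = Array.new(thread_count) { Thread.new(&work) }
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout

  # Thread#join(limit) returns nil on timeout, so `reject` keeps exactly
  # the threads that did not finish before the shared deadline.
  hung = threads.reject do |thread|
    remaining = deadline - Process.clock_gettime(Process::CLOCK_MONOTONIC)
    thread.join([remaining, 0].max)
  end

  hung.each(&:kill) # best-effort cleanup; may not stop a thread stuck in native code
  hung.size
end

# Stand-in workload that finishes immediately, so nothing is reported as hung.
puts hung_threads(thread_count: 3, timeout: 1) { :done }
```

With the driver loaded, the body of hang1 could be passed as the block, for example `hung_threads(thread_count: 10, timeout: 30) { 100.times { ActiveGraph::Base.read_transaction {} } }`; a non-zero result would then indicate that some threads never returned.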

klobuczek commented 3 years ago

@efivash @mperice let's move the discussion to https://github.com/neo4jrb/neo4j-ruby-driver/pull/47