neo4jrb / activegraph

An active model wrapper for the Neo4j Graph Database for Ruby.
http://neo4jrb.io
MIT License

Too many connections error #555

Closed mattsnyder closed 8 years ago

mattsnyder commented 9 years ago

I'm seeing this error when using the neo4j gem to talk to our Neo4j cluster. It literally just started. Do we need to tune our cluster settings? Have you seen this before? The CPU and memory load on the servers is nominal.

screen shot 2014-11-06 at 12 57 01 pm

mattsnyder commented 9 years ago

We are using v3.0.0.rc.3

subvertallchris commented 9 years ago

Haven't seen this but definitely upgrade to the latest version. We improved the way connections are handled a while ago, it might help.

subvertallchris commented 9 years ago

Actually, I can see from that log that you're using neo4j-core 3.0.1 with net-http-persistent, which is the change to connection handling I mentioned. I'd still update the gem first; all bets are off with the RC.

mattsnyder commented 9 years ago

Good to know. The RC felt good enough to use :)

I'll do the update. As an aside, we also found that Neo4j was configured to use less memory than it should have been. On that note, is there a good resource on tuning Neo4j for production environments?

subvertallchris commented 9 years ago

The RCs were all stable and I think some folks did put some little things into production! So many features were added, adjusted, and stabilized since then that we can never be sure that things will work just right when people use them.

The Neo website has a big section on performance tuning that covers memory among other things. On my phone so I can't link you but it's easy to find. Make sure your servers are running 2.1.5 and you use the guide for that version, too!


subvertallchris commented 9 years ago

Did you have any luck with this?

mattsnyder commented 9 years ago

@subvertallchris Upgrading the gem, along with adjusting the Neo4j configuration, seems to have resolved this.

subvertallchris commented 9 years ago

Great! Enjoy!

mattsnyder commented 9 years ago

@subvertallchris Question: if I use index: :exact or constraint: :unique, do I need to add the index to Neo4j manually?

Also, how well does ha-cluster work with v3.0?

subvertallchris commented 9 years ago

A unique constraint also acts as an exact index, so you only need to specify the one. Keep in mind that if you change an index into a constraint within an existing database, you'll need to drop the old index from the web console.

As for clustering... not well, since it doesn't differentiate between read/write queries. Nothing on that linked page applies to the 3.0 release of the gem at all. I started a branch to provide HA support to the gem, even tried to have it ready for the 3.0 release, but didn't have enough time or a good test environment. Ultimately decided to focus on other features until someone requested it. So... is this something you'd say you need?
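To illustrate what "differentiating between read/write queries" could mean in practice, here is a minimal sketch in plain Ruby. It is not part of neo4j or neo4j-core and is not how the HA branch works; the class name, the URLs, and the keyword heuristic are all assumptions made for the example.

```ruby
# Hypothetical sketch: pick a server URL for a Cypher statement.
# Not part of neo4j or neo4j-core; names and URLs are illustrative only.
class QueryRouter
  # Clauses that modify the graph. A keyword scan is a crude heuristic:
  # a string literal containing one of these words is treated as a write,
  # so mistakes err toward the master rather than toward a replica.
  WRITE_CLAUSES = /\b(CREATE|MERGE|SET|DELETE|REMOVE|DROP|FOREACH)\b/i

  def initialize(master_url:, replica_urls:)
    @master_url = master_url
    @replica_urls = replica_urls
    @next_replica = 0
  end

  # True when the statement contains a clause that changes data or schema.
  def write?(cypher)
    WRITE_CLAUSES.match?(cypher)
  end

  # Writes always go to the master; reads rotate through the replicas.
  def url_for(cypher)
    return @master_url if write?(cypher) || @replica_urls.empty?

    url = @replica_urls[@next_replica % @replica_urls.size]
    @next_replica += 1
    url
  end
end
```

A real implementation would also have to handle transactions that mix reads and writes, and failover when the master changes, neither of which this sketch attempts.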

mattsnyder commented 9 years ago

When adding a new index or constraint, does the existing data need to be touched or will it automatically index the pre-existing data?

We are running an HA cluster in Production. So I'd say yes :)

As a side note, I'd love to pitch in on this repo wherever there is need. So far you guys have been rocking it, but if I can lend a hand I will.

subvertallchris commented 9 years ago

Nope, the data doesn't need to be touched. Neo4j handles it for you when the new index is created.

If you could help out with the HA stuff, that would be awesome. Honestly, the biggest thing I was lacking was a good test environment and someone who could actually run it through its paces. If you want, hop over to https://github.com/neo4jrb/neo4j-core/issues/96. I'm going to post the process of how I'm expecting it to work and the questions I still have.

subvertallchris commented 9 years ago

Oh yeah, one other thing. If you're using JRuby and Neo4j Embedded, I think there is good support for High Availability. I know that Volker made some changes in the past month or two that added this.

mattsnyder commented 9 years ago

@subvertallchris So even after upgrading the gems, this popped back up:

screen shot 2014-11-18 at 4 56 28 pm

It's not consistent and I'm having trouble associating it with any changes in load or number of queries

subvertallchris commented 9 years ago

This seems to be a sort-of-well-known net-http-persistent bug. It's discussed at length at https://github.com/sparklemotion/mechanize/issues/123, maybe something in there can help you pin it down.

If you can't get it pinned down there, I hate to say this but I think you should open an issue at https://github.com/drbrain/net-http-persistent. I'm happy to help with troubleshooting but without being able to reproduce, I think it'd be best for you to lead the charge.

subvertallchris commented 9 years ago

Also https://github.com/drbrain/net-http-persistent/issues/37

subvertallchris commented 9 years ago

In that thread, see the comment from mislav that starts with "I was bitten by the same issue as"

mattsnyder commented 9 years ago

:+1:

cheerfulstoic commented 8 years ago

This issue is almost a year old and three versions behind. I'm going to close it, but please comment / reopen if you're still having trouble with 5.x or the 6.x alpha

abuisman commented 8 years ago

I am using version 6.1 of Neo4jrb and ruby 2.1.8 and I get the same error when, for example, trying to create a node ('GraphQuantity.create(....)'):

Net::HTTP::Persistent::Error: too many connection resets (due to closed stream - IOError)

subvertallchris commented 8 years ago

We've had reports of serious performance issues with Ruby 2.1. Can you upgrade to 2.2.1 or greater?


abuisman commented 8 years ago

@subvertallchris Yes, I figured that would be an issue. In theory we could upgrade, but seeing as we'd integrate neo4j in a pretty big existing project it would take some more work to do so. I was hoping this issue would've been fixed for 2.1.x. No such luck I guess? ;)

subvertallchris commented 8 years ago

Unfortunately not. All signs indicate an issue in a dependency, and my money is on Net::HTTP::Persistent or Faraday, so it's out of our hands.


abuisman commented 8 years ago

That is unlucky for me :(. I just checked with ruby 2.2.2 and it doesn't work either. I will try 2.3.x, or would that be futile?

subvertallchris commented 8 years ago

Interesting. What's the query that you're trying to execute? Also, can you tell me a bit about your environment? Is the server local and what OS are you running?

subvertallchris commented 8 years ago

And no, if 2.2.2 isn't behaving then I don't expect 2.3.x will.

abuisman commented 8 years ago

Haha yes you are probably right.

I am running Neo4j 2.3.2 (Community edition) on OS X, using the .app. My Rails version is '4.2.3'.

The query:

GraphQuantity.all

My model:

class GraphQuantity
  include Neo4j::ActiveNode

  property :name, type: String

  has_many :in, :children, type: :child_quantity_of, model_class: :GraphQuantity
  has_one :out, :parent, origin: :children
end

subvertallchris commented 8 years ago

Seems straightforward enough. How many nodes are you expecting to have returned from the db?

abuisman commented 8 years ago

Right now there are 0 in the database. I created a new one just to play around with.

subvertallchris commented 8 years ago

Ah, I see. In your code or even in the Rails console, one time, try calling GraphQuantity.first before doing anything else. That might make some schema changes that are known to cause problems when run within transactions.

abuisman commented 8 years ago

Sadly that doesn't work either:

[3] pry(main)> GraphQuantity.first
Net::HTTP::Persistent::Error: too many connection resets (due to closed stream - IOError) after 0 requests on 2237191000, last used 1.440335 seconds ago
from /Users/achilleas/.rbenv/versions/2.3.0/lib/ruby/2.3.0/net/protocol.rb:211:in `write'
[4] pry(main)> GraphQuantity.all
Net::HTTP::Persistent::Error: too many connection resets (due to closed stream - IOError) after 0 requests on 2237191000, last used 4.27705 seconds ago
from /Users/achilleas/.rbenv/versions/2.3.0/lib/ruby/2.3.0/net/protocol.rb:211:in `write'

subvertallchris commented 8 years ago

That is really weird... Is the Neo4j web console working?


abuisman commented 8 years ago

Yes both the web console and the neo4j-shell are working fine. I just created a node and it works.

subvertallchris commented 8 years ago

Do you have auth enabled on the Neo4j server? Has it ever worked in the past with the Neo4j gem or is this your first time?

You can also try going through bundle exec irb:

require 'neo4j-core'
Neo4j::Session.open(:server_db, 'http://localhost:7474')
Neo4j::Session.current.query.match(:n).limit(1).pluck(:n).first

If that works, we might be able to troubleshoot something more specific.
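Another way to narrow things down is to take the gems out of the picture and send one Cypher statement with nothing but Ruby's standard library. The following is a minimal sketch, assuming a Neo4j 2.x server that exposes the transactional HTTP endpoint at /db/data/transaction/commit; the helper names and the credentials in the usage comment are placeholders, not anything from neo4j-core.

```ruby
require 'json'
require 'net/http'
require 'uri'

# Builds a POST request for Neo4j's transactional Cypher endpoint.
# Returns the target URI together with the request so callers can inspect both.
def build_cypher_request(base_url, statement, username: nil, password: nil)
  uri = URI.join(base_url, '/db/data/transaction/commit')
  request = Net::HTTP::Post.new(uri)
  request['Content-Type'] = 'application/json'
  request['Accept'] = 'application/json'
  request.basic_auth(username, password) if username
  request.body = JSON.generate(statements: [{ statement: statement }])
  [uri, request]
end

# Sends the statement over a fresh, non-persistent connection and
# returns the parsed JSON response.
def run_cypher(base_url, statement, **auth)
  uri, request = build_cypher_request(base_url, statement, **auth)
  response = Net::HTTP.start(uri.host, uri.port) { |http| http.request(request) }
  JSON.parse(response.body)
end

# Usage (needs a running server; the password is a placeholder):
#   run_cypher('http://localhost:7474', 'MATCH (n) RETURN n LIMIT 1',
#              username: 'neo4j', password: 'secret')
```

If this succeeds in plain irb but fails inside the application's console, that points at something loaded by the application rather than at the server.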


abuisman commented 8 years ago

I have another (older) project that I generated with Neo4j from the get go that works fine with neo4j 5.2.7, so it is either an issue with the version or a conflict with another gem.

I have auth enabled so I ran:

Neo4j::Session.open(:server_db, 'http://localhost:7474', basic_auth: { username: 'neo4j', password: '....'})
=> Neo4j::Server::CypherSession url: 'http://localhost:7474/db/data/' version: '2.3.2'

[4] pry(main)> Neo4j::Session.current.query.match(:n).limit(1).pluck(:n).first
NoMethodError: undefined method `fetch' for nil:NilClass

abuisman commented 8 years ago

A conflicting gem seems unlikely, come to think of it, because I ran the above in a 'clean' pry.

subvertallchris commented 8 years ago

I think there is a bug somewhere in the sample query I gave you or the code when parsing its response... We just need to see if we're able to successfully perform a basic Cypher query through neo4j-core. If we can, then we assume that your session is ok and there's a problem with another operation in the more Rails-oriented gem.

A little background for you: neo4j-core is a dependency of the neo4j gem; it handles all communication with the server, among other things. The modification of indexes and constraints as defined in the models begins after a session is established and a model is loaded, and this process can sometimes cause problems. This isn't one of the problems it normally causes, but it might be possible if there is a corrupt index or a significant number of nodes, requiring a brutal lock to scan and verify uniqueness.

Can you also make sure you are using the most recent patch release of the Neo4j gem if you haven't already?


abuisman commented 8 years ago

I have this version in my Gemfile: gem 'neo4j', '~> 6.1', '>= 6.1.3'. I added it today, so it should be the newest.

abuisman commented 8 years ago

I just ran the query you told me earlier and now it looks like it worked:

[2] pry(main)> Neo4j::Session.open(:server_db, 'http://localhost:7474', basic_auth: { username: 'neo4j', password: '...'})
=> Neo4j::Server::CypherSession url: 'http://localhost:7474/db/data/' version: '2.3.2'
[4] pry(main)> Neo4j::Session.current.query.match(:n).limit(1).pluck(:n).first
=> CypherNode 0 (70146770242820)

abuisman commented 8 years ago

When I run the above in my rails console I get an error again:

[4] pry(main)> Neo4j::Session.current.query.match(:n).limit(1).pluck(:n).first
 CYPHER 6ms MATCH n RETURN n LIMIT {limit_1} | {:limit_1=>1}
Net::HTTP::Persistent::Error: too many connection resets (due to closed stream - IOError) after 0 requests on 2281254360, last used 1453405996.86403 seconds ago
from /Users/achilleas/.rbenv/versions/2.3.0/lib/ruby/2.3.0/net/protocol.rb:211:in `write'

So it seems like a conflicting gem or configuration. What do you think?

abuisman commented 8 years ago

Ok so I disabled a whole lot of gems and now it works. Time to trial-and-error the gems one by one ;) I will keep you updated. Thanks a lot so far for your time and patience :+1:
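Checking gems one by one gets slow with a large Gemfile; halving the list each round needs far fewer restarts. Below is a minimal sketch of that idea in plain Ruby. The function name is made up for illustration, and it assumes there is exactly one culprit that fails on its own, regardless of which other gems are enabled. The block stands in for whatever manual check you run (for example, enabling only those gems, restarting the console, and trying the query).

```ruby
# Narrows a list of suspects down to the one that breaks things.
# The block receives the suspects to enable and must return true when
# everything works with only those enabled.
# Assumption: exactly one suspect is responsible, and it fails on its own.
def find_culprit(suspects)
  candidates = suspects.dup
  while candidates.size > 1
    half = candidates.first(candidates.size / 2)
    # If this half works, the culprit must be among the remaining candidates.
    candidates = yield(half) ? candidates - half : half
  end
  candidates.first
end
```

An intermittent failure will mislead this search just as it misleads a manual one, so it is worth repeating each check a few times before trusting a "works" result.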

abuisman commented 8 years ago

It seems I have found the culprit:

gem 'newrelic_rpm', '~> 3.13.2.302'

When I comment this out and rebundle neo4jrb works again :).

I will try updating it, and I will try the other ruby versions as well to check if they work without newrelic or with the updated version.

abuisman commented 8 years ago

gem 'newrelic_rpm', '~> 3.14', '>= 3.14.1.311'

Works with Neo4jrb :) Now for Ruby 2.1.x ;).

abuisman commented 8 years ago

"The original ruby version I tried using (2.1.2) works once I update the newrelic_rpm version as above :). Thanks a lot Chris @subvertallchris, for the patience and brainstorming."

Was what I had written until I noticed that running GraphQuantity.first didn't work again, but when I ran bin/spring stop && bin/rails c again it worked again... When I was debugging gems I suddenly noticed the same 'flickering' when I was checking if it really was newrelic or not. It seems as if sometimes the connection works and sometimes it doesn't, but stopping spring and then opening the rails console often helps! I will investigate this some more tomorrow with the same older version of newrelic to see if it really is that, or if there is a flickering caused by something else.

subvertallchris commented 8 years ago

That's really interesting, please keep us posted!

cheerfulstoic commented 8 years ago

Awesome debugging, thanks! Hopefully this will help somebody else who comes along!

abuisman commented 8 years ago

Good morning guys :). Welcome @cheerfulstoic thanks for joining in.

Apparently newrelic wasn't really it, I created a screencapture showing what is going on now: https://achilleas.nl/neo4j-why.mp4

Apparently the connection is very unreliable. I will try with Ruby 2.3, but first I will check if running neo4j in a Docker container is more stable than using the .app version.

Update:

Running in Docker does not fix anything. So I will update ruby once again.

Update 2:

Using Ruby 2.3 does not improve things and I already had newrelic disabled. I will start disabling gems again to see if they give us any issues.

I also updated OS X to 10.11.3 (latest), but that hasn't changed anything ;).

abuisman commented 8 years ago

Ok so I believe I really solved it this time ;).

The flickering threw me off and made me think it was newrelic. Now I disabled gem 'fakeweb', '~> 1.3.0' and it turns out this was really the issue. I will start playing around with neo4jrb today and will discover if this really solved the issue, but I am pretty confident. Pretty logical if you think of it :).

If it turns out Fakeweb is the issue here I will open an issue over there to see if Thilo and Wes can fix it.
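FakeWeb does its job by replacing parts of Net::HTTP, the same layer net-http-persistent builds on, so one quick check for this class of problem is to ask Ruby where Net::HTTP's methods are currently defined. This is a minimal sketch using only the standard library; the helper names are made up, and the choice of `request` as the method to inspect is an assumption rather than a complete list.

```ruby
require 'net/http'

# Returns a hash of the given instance methods whose definition does not
# live in a file ending with expected_suffix, mapped to where they were found.
def patched_methods(klass, names, expected_suffix)
  names.each_with_object({}) do |name, found|
    file, line = klass.instance_method(name).source_location
    next if file && file.end_with?(expected_suffix)

    found[name] = file ? "#{file}:#{line}" : '(native code)'
  end
end

# Modules placed ahead of the class itself via Module#prepend,
# another common way for a library to intercept calls.
def prepended_modules(klass)
  klass.ancestors.take_while { |mod| mod != klass }
end

# Usage, e.g. in a Rails console:
#   patched_methods(Net::HTTP, [:request], 'net/http.rb')
#   prepended_modules(Net::HTTP)
```

Anything reported by either helper is a library worth disabling first when the HTTP layer misbehaves only inside the application.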

cheerfulstoic commented 8 years ago

That's why I always use real webz

http://www.smh.com.au/content/dam/images/1/u/j/b/6/image.related.articleLeadwide.620x349.1ujov.png/1331095890303.jpg

abuisman commented 8 years ago

@cheerfulstoic The real question is why fakeweb was in the :development group in the Gemfile. There is no real issue with using fakeweb in tests.

Also, it seems we don't use it anymore anyway, so I can remove the gem. Still makes for some 'wasted' hours of debugging though ;).