Closed: subvertallchris closed this issue 10 years ago
I deployed my app to Heroku and got access to a Graphene S1 instance, 512MB RAM. With both servers hosted in EC2, I thought I might see better performance and I did.
irb(main):037:0> puts Benchmark.measure { Student.limit(900).each { |s| } }
CYPHER 326ms MATCH (result:`Student`) RETURN result LIMIT 900
0.370000 0.030000 0.400000 ( 0.604920)
=> nil
irb(main):038:0> puts Benchmark.measure { Student.limit(900).each { |s| } }
CYPHER 96ms MATCH (result:`Student`) RETURN result LIMIT 900
0.460000 0.020000 0.480000 ( 0.522264)
=> nil
irb(main):039:0> puts Benchmark.measure { Student.limit(900).each { |s| } }
CYPHER 98ms MATCH (result:`Student`) RETURN result LIMIT 900
0.310000 0.030000 0.340000 ( 0.376899)
Across 10...
irb(main):040:0> puts Benchmark.measure { 10.times { Student.limit(900).each { |s| } } }
CYPHER 131ms MATCH (result:`Student`) RETURN result LIMIT 900
CYPHER 165ms MATCH (result:`Student`) RETURN result LIMIT 900
CYPHER 160ms MATCH (result:`Student`) RETURN result LIMIT 900
CYPHER 87ms MATCH (result:`Student`) RETURN result LIMIT 900
CYPHER 289ms MATCH (result:`Student`) RETURN result LIMIT 900
CYPHER 91ms MATCH (result:`Student`) RETURN result LIMIT 900
CYPHER 256ms MATCH (result:`Student`) RETURN result LIMIT 900
CYPHER 138ms MATCH (result:`Student`) RETURN result LIMIT 900
CYPHER 146ms MATCH (result:`Student`) RETURN result LIMIT 900
CYPHER 179ms MATCH (result:`Student`) RETURN result LIMIT 900
3.520000 0.200000 3.720000 ( 4.571002)
irb(main):041:0> puts Benchmark.measure { 10.times { Student.limit(900).each { |s| } } }
CYPHER 163ms MATCH (result:`Student`) RETURN result LIMIT 900
CYPHER 270ms MATCH (result:`Student`) RETURN result LIMIT 900
CYPHER 89ms MATCH (result:`Student`) RETURN result LIMIT 900
CYPHER 193ms MATCH (result:`Student`) RETURN result LIMIT 900
CYPHER 112ms MATCH (result:`Student`) RETURN result LIMIT 900
CYPHER 186ms MATCH (result:`Student`) RETURN result LIMIT 900
CYPHER 168ms MATCH (result:`Student`) RETURN result LIMIT 900
CYPHER 147ms MATCH (result:`Student`) RETURN result LIMIT 900
CYPHER 169ms MATCH (result:`Student`) RETURN result LIMIT 900
CYPHER 96ms MATCH (result:`Student`) RETURN result LIMIT 900
3.550000 0.230000 3.780000 ( 4.475665)
We're down within shooting range of JRuby/embedded. When you factor in the added cost of dealing with a JRuby server in production, that extra time might not matter all that much. Ruby MRI keeps getting faster, but JRuby...?
My conclusion? Use MRI/Server for everything that isn't a massive write operation that requires high performance with no return values. Any performance gains offered by embedded's superior response times are lost to JRuby after data is returned.
Very interesting. I'm a bit surprised that JRuby was slower than MRI. Maybe JRuby would be faster with some performance-tuning JVM/JRuby flags.
I think the main benefit of using embedded is access to the Java Neo4j traversal API. Sometimes Cypher queries don't give you all the power you need, which is why I'm very interested in the https://github.com/pangloss/pacer gem.
I also suspect that our JRuby neo4j-core could be optimised by not using Cypher and instead using the Java traversal API, as in the sketch below.
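To make that concrete, here is a rough JRuby-only sketch of what a traversal through the Java API might look like, bypassing Cypher entirely. The store path, relationship type, start node and depth are all made-up placeholders, and it assumes the Neo4j 2.x jars are already on the classpath (for example via the neo4j-community gem):

require 'java'

java_import 'org.neo4j.graphdb.factory.GraphDatabaseFactory'
java_import 'org.neo4j.graphdb.DynamicRelationshipType'
java_import 'org.neo4j.graphdb.Direction'
java_import 'org.neo4j.graphdb.traversal.Evaluators'

db = GraphDatabaseFactory.new.new_embedded_database('/tmp/graph.db')

tx = db.begin_tx
begin
  start_node = db.get_node_by_id(0) # placeholder start node
  # walk outgoing FRIENDS relationships up to two hops deep
  friends = db.traversal_description.
               relationships(DynamicRelationshipType.with_name('FRIENDS'), Direction::OUTGOING).
               evaluator(Evaluators.to_depth(2)).
               traverse(start_node).
               nodes
  friends.each { |node| puts node.get_id }
  tx.success
ensure
  tx.close
end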
Yeah, I'd like to see if we can pinpoint some of the problem areas. I'm willing to bet that anything in the code that speeds up JRuby will also speed up MRI. That's a very good point about JRuby's main benefit being access to the full Java API!
Neat stuff! The only reason I'm still thinking about MRI for development and JRuby in production is the use of transactions. I only played with it a bit, but could not get transactions working for MRI because of the current limitations (https://github.com/neo4jrb/neo4j-core/wiki/Transaction#limitations), I suspect.
As a side note, here is a talk on a similar scenario (MRI for dev and JRuby for prod): http://vimeo.com/45719570
As of neo4j-core 3.0.1, transactions in MRI with Neo4j server work perfectly. Give it a shot! It just occurred to me that I forgot to update the documentation after this was done, I'll take care of that tomorrow.
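A minimal sketch of what that looks like, assuming the neo4j gem's ActiveNode models and an already open server session (the Student model and its property are placeholders):

require 'neo4j'

Neo4j::Transaction.run do
  student = Student.create(name: 'Ada')
  student.update(name: 'Ada L.')
  # raising an exception inside the block rolls the whole transaction back
end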
Hey, look at that! It does appear to be working now :smile_cat: Just need to sort out some deadlocking issues... Thanks!
Awesome! Let us know if you need a hand!
Did you try the benchmarks with the transactional endpoint?
I did not; this was before support was implemented. Writes should be faster, right? Do you think we'd see any other differences?
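For anyone curious, the transactional Cypher endpoint batches statements into a single HTTP request and commits them together; here is a bare Net::HTTP sketch of the kind of call involved (the localhost URL and the query are placeholders):

require 'net/http'
require 'json'
require 'benchmark'

uri = URI('http://localhost:7474/db/data/transaction/commit')
payload = { statements: [{ statement: 'MATCH (result:Student) RETURN result LIMIT 900' }] }.to_json

puts Benchmark.measure {
  Net::HTTP.start(uri.host, uri.port) do |http|
    request = Net::HTTP::Post.new(uri, 'Content-Type' => 'application/json', 'Accept' => 'application/json')
    request.body = payload
    JSON.parse(http.request(request).body)
  end
}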
Inspired by http://stackoverflow.com/questions/25976190/performance-of-java-api-versus-python-with-cypher-for-neo4j, I just ran some very basic benchmarks with the gem, comparing MRI/Server to JRuby/Embedded. The MRI requests all use the net-http-persistent gem with Faraday.
The tl;dr version is that while Neo4j Embedded is faster, JRuby itself is so much slower that MRI still wins when working locally.
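For reference, the two stacks can be opened with neo4j-core 3.x roughly like this (the URL and store path are placeholders):

require 'neo4j-core'

# MRI or JRuby, talking to Neo4j server over HTTP (Faraday underneath)
Neo4j::Session.open(:server_db, 'http://localhost:7474')

# JRuby only: embedded database running inside the same JVM process
session = Neo4j::Session.open(:embedded_db, '/tmp/graph.db')
session.start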
I created 5000 nodes and ran some queries against them.
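The setup and the query being timed look roughly like this, assuming the neo4j gem's ActiveNode and a placeholder name property:

require 'neo4j'
require 'benchmark'

class Student
  include Neo4j::ActiveNode
  property :name, type: String
end

# seed the test data
5000.times { |i| Student.create(name: "student-#{i}") }

# time fetching and instantiating 900 of the nodes
puts Benchmark.measure { Student.limit(900).each { |s| } }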
MRI/Server:
JRuby:
10 times with MRI:
10 times with JRuby:
This demonstrates that while JRuby would be faster for bulk imports, if you want to return data and actually use it, MRI is still faster.
What if we're in production, comparing an MRI web server connecting to a remote Neo4j server vs a JRuby server running TorqueBox with Neo4j embedded? I tested with a Graphene instance (192MB RAM) on an EC2 server in Virginia; I'm in Brooklyn, about 6 hours away. Unfortunately, I could only create 900 nodes on it, but we can still get some good figures from that.
MRI:
JRuby:
10 times... MRI:
JRuby:
JRuby clearly wins there because of the Cypher response time bottleneck. In production, I think we'd assume that the web servers and database servers would be within the same network, so while we wouldn't quite get local speed I think we would get something close to it. We also would have a more powerful database server.
For now, I think it's clear that MRI outperforms JRuby locally when you are returning nodes, though JRuby/embedded is still a good option if you need to do a bulk import and aren't concerned with returning nodes. It also suggests that we may be able to clear up bottlenecks that occur after a Cypher response is received.