An interesting benchmark

subvertallchris commented 10 years ago

Inspired by http://stackoverflow.com/questions/25976190/performance-of-java-api-versus-python-with-cypher-for-neo4j, I just ran some very basic benchmarks with the gem, comparing MRI/Server to JRuby/Embedded. The MRI requests all use the net-http-persistent gem with Faraday.

The tl;dr version is that while Neo4j Embedded is faster, JRuby itself is so much slower that MRI still wins when working locally.

I created 5000 nodes and ran some queries against them.

MRI/Server:

2.1.2 :014 > puts Benchmark.measure { Student.limit(5000).each { |s| } } 
 CYPHER 307ms MATCH (result:`Student`) RETURN result LIMIT 5000
  1.110000   0.030000   1.140000 (  1.216936)

JRuby:

jruby-1.7.12 :013 > puts Benchmark.measure { Student.limit(5000).each { |s| } } 
 CYPHER 2ms MATCH (result:`Student`) RETURN result LIMIT 5000
  2.750000   0.070000   2.820000 (  1.560000)
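For anyone reading these figures: the four columns printed by `Benchmark.measure` are user CPU, system CPU, total CPU, and (in parentheses) wall-clock seconds. Below is a minimal sketch of pulling those fields apart. The `fake_load` workload is a hypothetical stand-in, not the `Student` query, so the numbers it prints will not match the ones above.

```ruby
require 'benchmark'

# Hypothetical stand-in workload (NOT the Student query from this thread):
# allocate 5000 small hashes so there is something to time.
def fake_load(n = 5000)
  Array.new(n) { |i| { id: i, name: "student-#{i}" } }
end

# Benchmark.measure returns a Benchmark::Tms object.
tms = Benchmark.measure { fake_load }

cpu  = tms.utime + tms.stime # CPU seconds charged to this process
wall = tms.real              # elapsed wall-clock seconds

puts format('user=%.6f system=%.6f total=%.6f real=%.6f',
            tms.utime, tms.stime, tms.total, tms.real)

# When CPU time exceeds wall time, more than one thread was consuming CPU.
puts 'CPU time exceeds wall time: work ran on more than one thread' if cpu > wall
```

In the JRuby line above, total CPU (2.82) is larger than wall time (1.56), which is what you would expect if JVM background threads (JIT compilation, garbage collection) were busy alongside the Ruby code. That is an inference from the numbers, not something measured here.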

10 times with MRI:

2.1.2 :005 > puts Benchmark.measure { 10.times { Student.limit(5000).to_a } }
 CYPHER 238ms MATCH (result:`Student`) RETURN result LIMIT 5000
 CYPHER 349ms MATCH (result:`Student`) RETURN result LIMIT 5000
 CYPHER 336ms MATCH (result:`Student`) RETURN result LIMIT 5000
 CYPHER 256ms MATCH (result:`Student`) RETURN result LIMIT 5000
 CYPHER 385ms MATCH (result:`Student`) RETURN result LIMIT 5000
 CYPHER 547ms MATCH (result:`Student`) RETURN result LIMIT 5000
 CYPHER 355ms MATCH (result:`Student`) RETURN result LIMIT 5000
 CYPHER 208ms MATCH (result:`Student`) RETURN result LIMIT 5000
 CYPHER 535ms MATCH (result:`Student`) RETURN result LIMIT 5000
 CYPHER 310ms MATCH (result:`Student`) RETURN result LIMIT 5000
 11.570000   0.210000  11.780000 ( 13.066349)

10 times with JRuby:

jruby-1.7.12 :012 > puts Benchmark.measure { 10.times { Student.limit(5000).to_a } }
 CYPHER 2ms MATCH (result:`Student`) RETURN result LIMIT 5000
 CYPHER 4ms MATCH (result:`Student`) RETURN result LIMIT 5000
 CYPHER 2ms MATCH (result:`Student`) RETURN result LIMIT 5000
 CYPHER 2ms MATCH (result:`Student`) RETURN result LIMIT 5000
 CYPHER 1ms MATCH (result:`Student`) RETURN result LIMIT 5000
 CYPHER 1ms MATCH (result:`Student`) RETURN result LIMIT 5000
 CYPHER 1ms MATCH (result:`Student`) RETURN result LIMIT 5000
 CYPHER 1ms MATCH (result:`Student`) RETURN result LIMIT 5000
 CYPHER 3ms MATCH (result:`Student`) RETURN result LIMIT 5000
 CYPHER 1ms MATCH (result:`Student`) RETURN result LIMIT 5000
 29.210000   0.600000  29.810000 ( 15.779000)

This demonstrates that while JRuby would be faster for bulk imports, if you wanted to return data and actually use it, MRI is still faster.
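To make that split concrete, here is a small sketch that subtracts the logged Cypher time from the wall-clock time of each 10-iteration run. The figures are copied from the output above; the `RUNS` hash and the `breakdown` helper are names made up for this illustration.

```ruby
# Figures copied from the two local 10-iteration runs above.
# cypher_ms: the per-query times from the CYPHER log lines
# real_s:    the wall-clock figure in parentheses from Benchmark.measure
RUNS = {
  'MRI/Server'     => { cypher_ms: [238, 349, 336, 256, 385, 547, 355, 208, 535, 310],
                        real_s: 13.066349 },
  'JRuby/Embedded' => { cypher_ms: [2, 4, 2, 2, 1, 1, 1, 1, 3, 1],
                        real_s: 15.779 }
}

# Hypothetical helper: split wall time into "waiting on Cypher" and
# "everything that happens after the response arrives".
def breakdown(run)
  cypher_s = run[:cypher_ms].sum / 1000.0
  { cypher_s: cypher_s, after_s: (run[:real_s] - cypher_s).round(3) }
end

RUNS.each do |name, run|
  b = breakdown(run)
  puts format('%-15s cypher %.3fs, after response %.3fs', name, b[:cypher_s], b[:after_s])
end
```

By this arithmetic, MRI spends about 3.5 s of its 13.1 s in Cypher and roughly 9.5 s afterwards, while JRuby spends 0.018 s in Cypher and about 15.8 s afterwards. It assumes the logged Cypher times cover the whole request, which the log does not state.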

What if we're in production, comparing an MRI web server connecting to a remote Neo4j Server vs. a JRuby server running Torquebox with Neo4j embedded? I tested with a Graphene instance, 192MB RAM, on a server on EC2 in Virginia. I'm in Brooklyn, about 6 hours away. Unfortunately, I could only create 900 nodes on it, but we can still get some good figures from that.

MRI:

2.1.2 :004 > puts Benchmark.measure { Student.limit(900).each { |s| } }
 CYPHER 662ms MATCH (result:`Student`) RETURN result LIMIT 900
  0.220000   0.020000   0.240000 (  0.808158)

JRuby:

jruby-1.7.12 :004 > puts Benchmark.measure { Student.limit(900).each { |s| } }
 CYPHER 6ms MATCH (result:`Student`) RETURN result LIMIT 900
  1.000000   0.020000   1.020000 (  0.581000)

10 times... MRI:

2.1.2 :019 > puts Benchmark.measure { 10.times { Student.limit(900).each { |s| } } }
 CYPHER 538ms MATCH (result:`Student`) RETURN result LIMIT 900
 CYPHER 398ms MATCH (result:`Student`) RETURN result LIMIT 900
 CYPHER 329ms MATCH (result:`Student`) RETURN result LIMIT 900
 CYPHER 370ms MATCH (result:`Student`) RETURN result LIMIT 900
 CYPHER 370ms MATCH (result:`Student`) RETURN result LIMIT 900
 CYPHER 530ms MATCH (result:`Student`) RETURN result LIMIT 900
 CYPHER 473ms MATCH (result:`Student`) RETURN result LIMIT 900
 CYPHER 440ms MATCH (result:`Student`) RETURN result LIMIT 900
 CYPHER 429ms MATCH (result:`Student`) RETURN result LIMIT 900
 CYPHER 470ms MATCH (result:`Student`) RETURN result LIMIT 900
  2.530000   0.160000   2.690000 (  6.041452)

JRuby:

jruby-1.7.12 :006 >   puts Benchmark.measure { 10.times { Student.limit(900).each { |s| } } }
 CYPHER 5ms MATCH (result:`Student`) RETURN result LIMIT 900
 CYPHER 3ms MATCH (result:`Student`) RETURN result LIMIT 900
 CYPHER 2ms MATCH (result:`Student`) RETURN result LIMIT 900
 CYPHER 3ms MATCH (result:`Student`) RETURN result LIMIT 900
 CYPHER 2ms MATCH (result:`Student`) RETURN result LIMIT 900
 CYPHER 2ms MATCH (result:`Student`) RETURN result LIMIT 900
 CYPHER 3ms MATCH (result:`Student`) RETURN result LIMIT 900
 CYPHER 3ms MATCH (result:`Student`) RETURN result LIMIT 900
 CYPHER 2ms MATCH (result:`Student`) RETURN result LIMIT 900
 CYPHER 2ms MATCH (result:`Student`) RETURN result LIMIT 900
  7.600000   0.170000   7.770000 (  3.478000)

JRuby clearly wins there because of the Cypher response time bottleneck. In production, I think we'd assume that the web servers and database servers would be within the same network, so while we wouldn't quite get local speed I think we would get something close to it. We also would have a more powerful database server.

For now, I think it's clear that MRI outperforms JRuby locally when you are returning nodes, though JRuby/embedded is still a good option if you need to do a bulk import and aren't concerned with returning nodes. It also suggests that we may be able to clear up bottlenecks that occur after a Cypher response is received.

subvertallchris commented 10 years ago

I deployed my app to Heroku and got access to a Graphene S1 instance, 512MB RAM. With both servers hosted in EC2, I thought I might see better performance and I did.

irb(main):037:0> puts Benchmark.measure { Student.limit(900).each { |s| } }
 CYPHER 326ms MATCH (result:`Student`) RETURN result LIMIT 900
  0.370000   0.030000   0.400000 (  0.604920)
=> nil
irb(main):038:0> puts Benchmark.measure { Student.limit(900).each { |s| } }
 CYPHER 96ms MATCH (result:`Student`) RETURN result LIMIT 900
  0.460000   0.020000   0.480000 (  0.522264)
=> nil
irb(main):039:0> puts Benchmark.measure { Student.limit(900).each { |s| } }
 CYPHER 98ms MATCH (result:`Student`) RETURN result LIMIT 900
  0.310000   0.030000   0.340000 (  0.376899)

Across 10...

irb(main):040:0> puts Benchmark.measure { 10.times { Student.limit(900).each { |s| } } }
 CYPHER 131ms MATCH (result:`Student`) RETURN result LIMIT 900
 CYPHER 165ms MATCH (result:`Student`) RETURN result LIMIT 900
 CYPHER 160ms MATCH (result:`Student`) RETURN result LIMIT 900
 CYPHER 87ms MATCH (result:`Student`) RETURN result LIMIT 900
 CYPHER 289ms MATCH (result:`Student`) RETURN result LIMIT 900
 CYPHER 91ms MATCH (result:`Student`) RETURN result LIMIT 900
 CYPHER 256ms MATCH (result:`Student`) RETURN result LIMIT 900
 CYPHER 138ms MATCH (result:`Student`) RETURN result LIMIT 900
 CYPHER 146ms MATCH (result:`Student`) RETURN result LIMIT 900
 CYPHER 179ms MATCH (result:`Student`) RETURN result LIMIT 900
  3.520000   0.200000   3.720000 (  4.571002)

irb(main):041:0> puts Benchmark.measure { 10.times { Student.limit(900).each { |s| } } }
 CYPHER 163ms MATCH (result:`Student`) RETURN result LIMIT 900
 CYPHER 270ms MATCH (result:`Student`) RETURN result LIMIT 900
 CYPHER 89ms MATCH (result:`Student`) RETURN result LIMIT 900
 CYPHER 193ms MATCH (result:`Student`) RETURN result LIMIT 900
 CYPHER 112ms MATCH (result:`Student`) RETURN result LIMIT 900
 CYPHER 186ms MATCH (result:`Student`) RETURN result LIMIT 900
 CYPHER 168ms MATCH (result:`Student`) RETURN result LIMIT 900
 CYPHER 147ms MATCH (result:`Student`) RETURN result LIMIT 900
 CYPHER 169ms MATCH (result:`Student`) RETURN result LIMIT 900
 CYPHER 96ms MATCH (result:`Student`) RETURN result LIMIT 900
  3.550000   0.230000   3.780000 (  4.475665)

We're down within shooting range of JRuby/embedded. When you factor in the added cost of dealing with a JRuby server in production, that extra time might not matter all that much. Ruby MRI keeps getting faster, but JRuby...?

My conclusion? Use MRI/Server for everything that isn't a massive write operation that requires high performance with no return values. Any performance gains offered by embedded's superior response times are lost to JRuby after data is returned.

andreasronge commented 10 years ago

Very interesting. I'm a bit surprised that JRuby was slower than MRI. Maybe JRuby would be faster with some more performance tuning of the JVM/JRuby flags.
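One thing worth ruling out before tuning flags is warm-up: JRuby relies on the JVM's JIT compiler, so the first iterations of a block can run slower than later ones. Here is a minimal sketch of timing only after a few unmeasured passes. `measure_warm` is a hypothetical helper and the workload is a stand-in, not the `Student` query.

```ruby
require 'benchmark'

# Hypothetical helper: run the block `warmup` times without timing it,
# then return the average wall-clock seconds over `runs` timed passes.
def measure_warm(warmup: 5, runs: 10)
  warmup.times { yield }
  Benchmark.realtime { runs.times { yield } } / runs
end

# Stand-in workload (an assumption for illustration only).
avg = measure_warm { Array.new(5000) { |i| "student-#{i}" } }
puts format('average per run after warm-up: %.6fs', avg)
```

The standard library's `Benchmark.bmbm` does something similar with a rehearsal pass. Whether warm-up explains any of the gap in this thread is untested.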

I think the main benefit of using embedded is access to the Java Neo4j traversal API. Sometimes Cypher queries do not give you all the power you need. This is why I'm very interested in the https://github.com/pangloss/pacer gem.

I also guess that our JRuby neo4j-core could be optimised by not using cypher but instead using the Java traversal api.

subvertallchris commented 10 years ago

Yeah, I'd like to see if we can pinpoint some of the problem areas. I'm willing to bet that anything in the code that speeds up JRuby will also speed up MRI. That's a very good point about JRuby's main benefit being access to the full Java API!

buildc0de commented 10 years ago

Neat stuff! The only reason I'm still thinking about MRI for development and JRuby in production is the use of transactions. I only played with it a bit, but could not get transactions working for MRI because of the current limitations (https://github.com/neo4jrb/neo4j-core/wiki/Transaction#limitations), I suspect.

As a side note, here is a talk on a similar scenario (MRI for dev and JRuby for prod): http://vimeo.com/45719570

subvertallchris commented 10 years ago

As of neo4j-core 3.0.1, transactions in MRI with Neo4j server work perfectly. Give it a shot! It just occurred to me that I forgot to update the documentation after this was done; I'll take care of that tomorrow.

buildc0de commented 10 years ago

Hey, look at that! It does appear to be working now :smile_cat: Just need to sort out some deadlocking issues... Thanks!

subvertallchris commented 10 years ago

Awesome! Let us know if you need a hand!

jexp commented 10 years ago

Did you try the benchmarks with the transactional endpoint?

subvertallchris commented 10 years ago

I did not; this was before support was implemented. Writes should be faster, right? Do you think we'd see any other differences?

neo4jrb / activegraph

An interesting benchmark #482