TryGhost / Ghost

Independent technology for modern publishing, memberships, subscriptions and newsletters.
https://ghost.org
MIT License

Fully updated - gives poor performance with Raspberry PI 2 #6258

Closed alexellis closed 8 years ago

alexellis commented 8 years ago

I've noticed a huge slow-down with the latest version of Ghost and Node 4.x

A fully updated Ghost blog running on node-v4.2.3-linux-armv7l with ghost-0.7.3 took almost an hour to process npm install --production. After this, Apache Bench gave me quite poor throughput compared with Node 0.12 and the previous version of Ghost. Details below.

Both systems are otherwise fully updated running on the same spec SD cards.

What could have caused this performance hit?

Updated and Node 4.x

ab -n 100 -c 1 http://192.168.0.x:2368/
This is ApacheBench, Version 2.3 <$Revision: 1663405 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking 192.168.0.x (be patient).....done

Server Software:        
Server Hostname:        192.168.0.x
Server Port:            2368

Document Path:          /

Concurrency Level:      1
Time taken for tests:   13.655 seconds
Complete requests:      100
Failed requests:        0
Total transferred:      459000 bytes
HTML transferred:       433400 bytes
Requests per second:    7.32 [#/sec] (mean)
Time per request:       136.549 [ms] (mean)
Time per request:       136.549 [ms] (mean, across all concurrent requests)
Transfer rate:          32.83 [Kbytes/sec] received

Node 0.12 and ghost 0.6.2

ab -n 100 -c 1 http://192.168.0.y:2368/
This is ApacheBench, Version 2.3 <$Revision: 1663405 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking 192.168.0.y (be patient).....done

Server Software:        
Server Hostname:        192.168.0.y
Server Port:            2368

Document Path:          /

Concurrency Level:      1
Time taken for tests:   9.981 seconds
Complete requests:      100
Failed requests:        0
Total transferred:      835300 bytes
HTML transferred:       810000 bytes
Requests per second:    10.02 [#/sec] (mean)
Time per request:       99.810 [ms] (mean)
Time per request:       99.810 [ms] (mean, across all concurrent requests)
Transfer rate:          81.73 [Kbytes/sec] received

acburdine commented 8 years ago

@alexellis as to the npm install, that can depend on what version of npm you are using. npm has taken a larger and larger amount of memory with each subsequent version. As to the node/ghost version difference, it might be helpful to have results of testing Ghost 0.7.3 with node 0.12, just to see if it's a Ghost issue or a node issue :smile:

There's also issue #6211 which seems to show something along those lines.

ErisDS commented 8 years ago

Just wanted to reiterate what @acburdine says here. npm install being slow is an issue to raise with npm, rather than Ghost - we have zero control over how npm works.

As for the change in performance you're seeing, we'd ideally need to see a benchmark from 0.6.2 and node v0.12 compared with Ghost 0.7.3 on node v0.12 and then again with Ghost 0.7.3 and node v4.2 upgraded separately.

Testing both together means there's no way to know whether the problem comes from changes in Ghost or in Node. It's also worth repeating the tests, to ensure the change is reproducible.
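The comparison described above can be sketched as a small test plan. This is only an illustration: `upgrade_path` and `summarise` are hypothetical helper names, not part of Ghost or ApacheBench, and the version strings are the ones mentioned in this thread.

```python
from statistics import mean, stdev


def upgrade_path(baseline, target, order):
    """Yield configurations that change one component per step."""
    current = dict(baseline)
    yield dict(current)
    for component in order:
        current[component] = target[component]
        yield dict(current)


def summarise(requests_per_second):
    """Return (mean, sample standard deviation) across repeated runs."""
    values = list(requests_per_second)
    spread = stdev(values) if len(values) > 1 else 0.0
    return mean(values), spread


# Versions taken from the thread; upgrade Ghost first, then Node.
baseline = {"ghost": "0.6.2", "node": "0.12"}
target = {"ghost": "0.7.3", "node": "4.2"}
steps = list(upgrade_path(baseline, target, order=["ghost", "node"]))

for step in steps:
    print(f"Benchmark Ghost {step['ghost']} on Node {step['node']}")
```

Each configuration would be benchmarked several times and the runs passed to `summarise`, so that a difference between two steps can be compared against the run-to-run spread.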

alexellis commented 8 years ago

NPM aside (being a one-shot process) I still want to understand how/why performance has taken a hit.

These two sets of results are taken from the same 'production server' running Arch Linux. In this example the first system has a populated SQLite DB and the second is an 'empty' database on a fresh installation.

On a separate, fully updated Arch Linux Raspberry PI 2 I am getting worse performance with Node 4.2.3 and with Node 0.12.7. The binaries are not provided by node's dist folder and I have had to rely on an old version of the binary rather than being able to re-build from source.

I mentioned before that V8 apparently no longer supports VFP2 as of about version 3 of io.js - this sounds like it could be a candidate for the drop in performance on ARM. Can anyone speak to that?

0.12.7 and Ghost 0.6.2 Run #1

Server Port:            2368

Document Path:          /

Concurrency Level:      1
Time taken for tests:   10.504 seconds
Complete requests:      100
Failed requests:        0
Total transferred:      835300 bytes
HTML transferred:       810000 bytes
Requests per second:    9.52 [#/sec] (mean)
Time per request:       105.043 [ms] (mean)
Time per request:       105.043 [ms] (mean, across all concurrent requests)
Transfer rate:          77.66 [Kbytes/sec] received

0.12.7 and Ghost 0.6.2 Run #2

Server Port:            2368

Document Path:          /

Concurrency Level:      1
Time taken for tests:   9.593 seconds
Complete requests:      100
Failed requests:        0
Total transferred:      835300 bytes
HTML transferred:       810000 bytes
Requests per second:    10.42 [#/sec] (mean)
Time per request:       95.928 [ms] (mean)
Time per request:       95.928 [ms] (mean, across all concurrent requests)
Transfer rate:          85.04 [Kbytes/sec] received

Ghost 0.7.3 Node 0.12.7

Server Port:            2370

Document Path:          /

Concurrency Level:      1
Time taken for tests:   10.498 seconds
Complete requests:      100
Failed requests:        0
Total transferred:      459000 bytes
HTML transferred:       433400 bytes
Requests per second:    9.53 [#/sec] (mean)
Time per request:       104.979 [ms] (mean)
Time per request:       104.979 [ms] (mean, across all concurrent requests)
Transfer rate:          42.70 [Kbytes/sec] received

Ghost 0.7.3 with Node 4.2.3 (After several warm-up runs)

Server Port:            2370

Document Path:          /

Concurrency Level:      1
Time taken for tests:   6.778 seconds
Complete requests:      100
Failed requests:        0
Total transferred:      459000 bytes
HTML transferred:       433400 bytes
Requests per second:    14.75 [#/sec] (mean)
Time per request:       67.777 [ms] (mean)
Time per request:       67.777 [ms] (mean, across all concurrent requests)
Transfer rate:          66.13 [Kbytes/sec] received

ErisDS commented 8 years ago

@alexellis The formatting of the results in that last reply is a bit confusing. It would be great if you could take a look and clarify it a little so it's possible to tell for sure what the results are saying.

From what you've said, it sounds like the results showed a performance decrease between node v0.12 and node v4.2, but it's not 100% clear to me. If the performance issue is happening when you're changing node, rather than when changing Ghost versions that suggests there's little we can do to make any difference here with the Ghost source. Perhaps it's worth looking for similar issues in the node repo, or posting your results there as the audience on that repo is more likely to have ideas :)

If you do post on the node repo, drop a link in this thread so we can follow along.

alexellis commented 8 years ago

@ErisDS This is the standard output from ApacheBench, which gives a good estimate of performance in the Requests per second line and, to a lesser extent, the Transfer rate line. Please focus on this part of the result. I have also trimmed the rest of the output to help you read it.
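For anyone comparing many runs, the two figures mentioned above can be pulled out of the ApacheBench text with a short script. This is a minimal sketch: `parse_ab` is a hypothetical helper, and the sample text is copied from the first result in this thread.

```python
import re

# Patterns match the summary lines that ab prints at the end of a run.
AB_RPS = re.compile(r"^Requests per second:\s+([\d.]+)", re.MULTILINE)
AB_RATE = re.compile(r"^Transfer rate:\s+([\d.]+)", re.MULTILINE)


def parse_ab(output):
    """Extract requests/sec and transfer rate (Kbytes/sec) from ab output."""
    rps = AB_RPS.search(output)
    rate = AB_RATE.search(output)
    return {
        "requests_per_second": float(rps.group(1)) if rps else None,
        "transfer_rate_kb_s": float(rate.group(1)) if rate else None,
    }


sample = """\
Concurrency Level:      1
Time taken for tests:   13.655 seconds
Complete requests:      100
Requests per second:    7.32 [#/sec] (mean)
Transfer rate:          32.83 [Kbytes/sec] received
"""

result = parse_ab(sample)
print(result)
```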

As a recap on results:

Node 0.12.7 with Ghost 0.7.3 vs Ghost 0.6.x - results are equivalent, 0.7.3 may even be quicker.
Node 4.x with Ghost 0.7.3 - results are poor.

I think the Ghost blog clearly has a very strong use-case for supporting ARM as a platform given its growing popularity in the data-center/cloud. The LTS version of node is now 4.x and initial testing has shown a notable drop in performance.

In order to provide a good experience on ARM it is worth investigating further. I know there are issues about using JS-based crypto in the login process, but these tests I have done are hitting the root/home-page.

I think most people use Ghost because it's lightweight, but 7 responses per second from a quad-core machine with 1GB RAM for essentially a roll-up page from SQLite seems slow. While Ghost may be equivalent in speed or faster in 0.7.x, it does obviously have some sort of performance issue with Node 4.x that needs looking into.

ErisDS commented 8 years ago

My confusion was largely around the sectioning, and the fundamental meaning of the results not being 100% clear. E.g. I think this is meant to say "Ghost 0.7.3 on Node 4.2.3":

0.12 with Ghost 4.2.3


When it comes to trying to debug the performance drop, there are a couple of relatively straightforward tests I would recommend:

halfdan commented 8 years ago

@alexellis I think the confusing part was that you wrote 0.12 with Ghost 4.2.3, and that you are changing both of the variables you are testing at once: the Ghost version and the node version. Where are you running Apache Bench from (on the RPi or a separate computer)? It is also confusing that the response length drops from 8100 bytes to almost half, 4334 bytes.
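The response lengths quoted above follow from the "HTML transferred" and "Complete requests" lines in the posted results. A quick check of the arithmetic (the function name is just for illustration):

```python
def per_request_bytes(html_transferred, complete_requests):
    """Average HTML bytes per response, from two lines of ab output."""
    return html_transferred / complete_requests


old = per_request_bytes(810000, 100)  # Ghost 0.6.2 runs in this thread
new = per_request_bytes(433400, 100)  # Ghost 0.7.3 runs in this thread
ratio = new / old

print(old, new, round(ratio, 3))  # prints: 8100.0 4334.0 0.535
```

Since the Transfer rate figure scales with page size, it is probably not comparable between the two Ghost versions here; Requests per second looks like the safer number to compare.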

To get somewhere with this discussion, what we need is:


I think the Ghost blog clearly has a very strong use-case for supporting ARM as a platform given its growing popularity in the data-center/cloud

Source?

but 7 responses per second from a quad-core machine with 1GB RAM

"quad-core" has very different performance implications on different architectures. A 900MHz quad-core ARM Cortex-A7 (which the RPi 2 Model B comes with) cannot really be compared to an i5/i7 quad-core (which also has a much higher clock rate) given that there's hyperthreading.

alexellis commented 8 years ago

@halfdan

https://news.ycombinator.com/item?id=9309459

http://techcrunch.com/2014/11/13/online-labs-designed-its-own-arm-servers-to-take-on-aws-digitalocean/

http://www.datacenterknowledge.com/archives/2014/11/19/french-web-host-builds-bare-metal-arm-server-cloud/

http://www.datacenterknowledge.com/archives/2015/04/29/paypal-deploys-arm-servers-in-data-centers/

and the popularity of Raspberry PI.

With reference to the quad-core processor - I am not comparing this to an x86_64 machine, but it should certainly be able to do better than the previous generation single-core ARMv6 PI 1.

I'll update the entry to say Node 0.12 with Ghost instead of 0.12 with Ghost etc to make this clearer.

I can try a tiny express app on node 0.12.9 and 4.x and let you know how this performs.

punkeel commented 8 years ago

:+1: :-/

ErisDS commented 8 years ago

Going to close this as this issue is pretty old now, and there was never any debugging done around this. Would be interested to see numbers with 0.11.0. We're going to be reworking Ghost really heavily over the next few months and we will focus on optimisation for MySQL + Ubuntu.