iRail / stations

A list of all the Belgian stations and their properties used within the iRail project
http://irail.be/stations/NMBS
29 stars 20 forks source link

Use APC as in-memory store for extreme performance gains. #111 #112

Closed Bertware closed 6 years ago

Bertware commented 7 years ago

As tested in #111, in-memory caching provides serious performance gains.

This PR includes APC support for getStations and getStationById. Results of both functions are cached. This results in extreme performance gains up to 99%. This might have serious (positive) consequences for projects depending on this library.

See issue #111 for an in-depth review of multiple possible solutions and why APC was chosen.

This PR will require APC available on the server. APC keys have a prefix to prevent key collisions.

Testing requires the argument '-d apc.enable_cli=1', or apc.enable_cli should be set in php.ini.

Bertware commented 7 years ago

Can someone assist me with this error? Seems a build issue. Tested and working on Ubuntu.

composer --version
Composer version 1.3.1 2017-01-07 18:08:51

php -v
PHP 5.6.15-1+deb.sury.org~trusty+1 (cli) 
Copyright (c) 1997-2015 The PHP Group
Zend Engine v2.6.0, Copyright (c) 1998-2015 Zend Technologies
    with Zend OPcache v7.0.6-dev, Copyright (c) 1999-2015, by Zend Technologies
    with Xdebug v2.4.0, Copyright (c) 2002-2016, by Derick Rethans
    with blackfire v1.6.0, https://blackfire.io, by Blackfireio Inc.

phpunit --v
PHPUnit 5.7.5 by Sebastian Bergmann and contributors.
Bertware commented 7 years ago

@pietercolpaert can you test on a local machine? Of it works local for you too, it could be caused by a bug on the ci side, and restarting the job might resolve it. Should travis be updated for apc support? We also need to make sure all servers running this package have apc enabled to get the advantages.

Please see issue #88 for more required server changes, for even better performance gains.

pietercolpaert commented 7 years ago

@Bertware right now it looks like it won’t work when APC is not installed. Could we make it work when APC is not enabled as well? Otherwise I’m afraid the effort it would take to contribute something to the project would increase...

Bertware commented 7 years ago

We could use an If structure around all APC calls? If (extension_loaded('apc')) {...} Since PHP would never reach the code inside when APC is not installed, it wouldn't cause problems without APC. (Or at least, that's what I expect)

Bertware commented 7 years ago

Any chance the failing builds are caused by dependencies? According to the badge in the readme they are outdated.

pietercolpaert commented 7 years ago

PHPUnit failure on travis-ci

If you now merge with our master branch this should work again

Caching re-write needed

Let’s use a PSR-6 compliant tool so that we are able to abstract our caching systems. E.g., our server does not have APC enabled, but does have memcached. Using one of the packages listed on packagist we should be able to plug and play different caches when using different server set-ups. This one looks good: http://www.php-cache.com/en/latest/... What do you think?

Tests need to work also without cache

We should also allow, for development purposes, the tests to be ran on a system without cache at hand. Using PSR-6, this should be easy to implement.

Bertware commented 7 years ago

Caching is now implemented using PSR-6 (http://www.php-cache.com/en/latest/), using APC by default, but works without APC as well. Changing the cache is a one-line change. I will add some logic to choose the best cache. It's hard for me to test without APC, according to the CI tests, the library doesn't check if APC is available. I will add some manual tests for that. As noted in #111 I won't set memcached as a default. We should think in the long term, and should not settle for "any performance gain" when using APC is only a small module install. While memcached might be good "for now", APC will fulfill the needs way better and is more feature-proof, as it can handle quite some more requests every second than memcached can. Using APC over memcached will also improve customer satisfaction, as this part of the code uses quite some processing time.

Bertware commented 7 years ago

A fallback to arraycache was added. The jsonld file is now also cached. This way we can get higher reading speeds, even on stations that are requested for the first time.

pietercolpaert commented 7 years ago

Requested APC to be installed on our iRail servers to Skyscrapers

pietercolpaert commented 7 years ago

@Bertware APC is now unmaintained? http://php.net/manual/en/intro.apc.php → any idea?

MattiasGees commented 7 years ago

@pietercolpaert In this context you are talking about APCu and not APC.

Opcache is used to cache the php files itself. APCu is used to cache user actions like you are doing now.

MattiasGees commented 7 years ago

@pietercolpaert @Bertware APCu is installed on the servers. Can you guys test this?

pietercolpaert commented 7 years ago

Oh wow, thanks for the swift reply! I’ll test!

pietercolpaert commented 7 years ago

Something strange is going on:

On my laptop, without APCu

iRail/stations$ php vendor/bin/phpunit 
PHPUnit 5.7.9 by Sebastian Bergmann and contributors.

.........                                                           9 / 9 (100%)Testing 100 liveboards took an average of 1.4356398582458 ms for 1 liveboard, using Cache\Adapter\PHPArray\ArrayCachePool
Note: When using APC, use '-d apc.enable_cli=1' in the phpunit argument to test with APC enabled.

Time: 277 ms, Memory: 10.00MB

OK (9 tests, 26 assertions)

On the server, with APCu

stations $ php vendor/bin/phpunit -d apc.enable_cli=1
PHPUnit 4.8.34 by Sebastian Bergmann and contributors.

.........Testing 100 liveboards took an average of 1218.4139490128 ms for 1 liveboard, using Cache\Adapter\Apc\ApcCachePool
Note: When using APC, use '-d apc.enable_cli=1' in the phpunit argument to test with APC enabled.

Time: 2.04 minutes, Memory: 10.75MB

OK (9 tests, 26 assertions)

So I can confirm APCu is installed, but it certainly does not bring the expected performance gain... On the server right now, it must be noted, we are at this moment running PHP 5.5.9.

Bertware commented 7 years ago

Hmmm, I get serious improvements on either PHP version... Do you get the performance improvements at home? You could try adding debug prints for cache hit/miss to check if its hitting. Any chance the server is using swap disk space instead of ram?

I added the entire library. This doesn't cause any overhead (only a few kB disk space) but makes all adapters plug and play, instead of users having to add their own adapter each time.

pietercolpaert commented 7 years ago

Nope, not using swap

$ free -h
             total       used       free     shared    buffers     cached
Mem:          3.9G       2.8G       1.1G        52M       244M       1.3G
-/+ buffers/cache:       1.2G       2.6G
Swap:         511M         0B       511M
Bertware commented 7 years ago

Hmmmm, can you test the server with and without apcu? Leaving out -d apc.enable_cli=1 will let all cache requests fail (without errors), or you could set it to arrayadapter in the createCachePool method. APC, or any caching, should provide a speed boost in relative numbers (so it should halve the original time, or better), and it shouldn't be hardware related. It's cutting out a whole lot of calculations, so if the caching works, it can't but reduce execution time. Seems config related for php?

Maybe related: http://serverfault.com/questions/360345/apc-strange-long-responses

pietercolpaert commented 7 years ago

It takes as long without the caching code (copied your perf test to the master branch):

Time: 2.06 minutes, Memory: 10.75MB

Because of the old PHP version I had to use PHPUnit4 instead of PHPUnit 5. Testing this again on my laptop with an older phpunit however does not result in a slower execution time.

Maybe just the old PHP version in general is to blame?

@MattiasGees Should we update our PHP versions? Would this be difficult?

Bertware commented 7 years ago

@pietercolpaert look at this: http://stackoverflow.com/questions/12887160/apc-serialization-slow I believe it's some sort of configuration issue or bug, but if storing takes this long (15 sec in the question), it would explain the high averages. Can you try adding the serialize/deserialize function from php? (2 lines should change for this, 1 in setCache and 1 in getCache in stations.php)

Edit: The performance gain of serialize/deserialize seems to be platform dependent

Edit2: Not sure how I missed it, but to me, same performance means all cache misses. Can you try adding prints for cache miss and cache hit? Lines 58 and 60 in stations.php

pietercolpaert commented 7 years ago

@Bertware But I now think it’s something else that makes this slow. It gives the same times both with and without APC...

Bertware commented 7 years ago

See my second edit, I believe it all misses, causing it to be as slow as it was without caching. A PHP upgrade would have a positive impact, maybe a little when upgrading to 5.6, and quite some when upgrading to 7.0. Some other causes of slow code could be concurrency/locking, or disk I/O (though that would only be in case the disk usage goes anywhere near 90+%).

Bertware commented 6 years ago

Something seems wrong with the CI as local installation works fine? (Local install loads from cache)

pietercolpaert commented 6 years ago

Congrats! Merging this PR after it has been open for almost 9 months! ;-) Happy “not a father”’s day!