The goal is to provide a benchmark suite that tests something representative of real-world situations. The suite also includes some unrealistic microbenchmarks - comparing their results is fairly pointless, but they can still be useful to profile, to find optimization opportunities that may carry over to a real site.
This script configures and runs nginx, siege, and PHP5/PHP7/HHVM over FastCGI on a TCP socket. The configurations are kept as close to identical as possible.
The script will run 300 warmup requests, then as many requests as possible in 1 minute. Statistics are collected only for the second, timed set of requests.
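For a sense of what the timed phase resembles, here is a hedged sketch of a siege run - perf.php generates the real invocation, URL file, and concurrency settings itself, so the flags and values below are only illustrative:
siege --benchmark --concurrent=200 --time=1M --file=urls.txt # illustrative only; not the exact options perf.php passes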
We don't have anything to share yet - we want to standardize and document how the interpreters are built/installed first.
As a regular user:
hhvm composer.phar install # see https://getcomposer.org/download/
hhvm perf.php --wordpress --php5=/path/to/bin/php-cgi # also works with php7
hhvm perf.php --wordpress --php=/path/to/bin/php-fpm # also works with php7
hhvm perf.php --wordpress --hhvm=/path/to/hhvm
Running with --hhvm gives some additional server-side statistics. It is usual for HHVM to report more requests than siege does - some frameworks make call-back requests to the current webserver.
:heavy_exclamation_mark: If you run with a debug build you may hit timeouts and other issues.
If you want to run multiple combinations:
hhvm composer.phar install # see https://getcomposer.org/download
hhvm batch-run.php < batch-run.json > batch-run-out.json
See batch-run.json.example to get an idea of how to create batch-run.json.
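The output redirected to batch-run-out.json is JSON; a quick, schema-agnostic way to start skimming it (assuming jq is installed) is:
jq 'keys' batch-run-out.json # list the top-level keys, then drill in from there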
On Ubuntu you can run scripts/setup.sh. This should provision your machine with everything you need to begin running the benchmarks.
This installs:
I've been using the current versions available from yum on CentOS 6.3. HHVM is required, as the suite itself is written in Hack.
Siege 3.0.x is not supported; as of writing, all 3.0.x releases have bugs that make them unable to make the benchmark requests correctly. Siege 4.0.0, 4.0.1, and 4.0.2 all automatically request resources on pages, and should not be used for benchmarking either.
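To confirm which siege release you have before running the suite:
siege --version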
Unrealistic microbenchmarks. We do not care about these results - they're only here to give a simple, quick target to test that the script works correctly.
'hello, world' is useful for profiling request handling.
DISABLE_WP_CRON is set to true to disable the auto-update and other requests to rpc.pingomatic.com and wordpress.org. WP_CRON increases noise, as it makes the benchmark results include the time taken by external sites; if needed, WP_CRON jobs can instead be triggered via cron or similar.
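For reference, this is the standard WordPress constant involved; the target's WordPress install already sets it, so the wp-config.php sketch below is purely illustrative:
<?php
// Disabling WP-Cron stops WordPress from firing loopback and external requests
// (auto-update checks, rpc.pingomatic.com pings) during benchmarked requests.
define('DISABLE_WP_CRON', true);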
The URLs file is based on traffic to hhvm.com - request ratios are:
- 100: even spread over long tail of posts
- 50: WP front page. This number is an estimate - we get ~90 to /, ~1 to /blog/. Assuming most WordPress sites don't have our magic front page, so taking a value roughly in the middle.
- 40: RSS feed
- 5: Some popular post
- 5: Some other popular post
- 3: Some other not quite as popular post
The long tail was generated with:
<?php
for ($i = 0; $i <= 52; ++$i) {
printf("http://localhost:__HTTP_PORT__/?p=%d\n", mt_rand(1,52));
}
Ordering of the URLs file is courtesy of the unix 'shuf' command.
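Putting those pieces together, regenerating such a file could look roughly like this; the file names are hypothetical, and the repository already contains the final URLs file:
php long-tail.php > urls.txt # the PHP snippet above, saved under a hypothetical name
cat weighted-urls.txt >> urls.txt # hypothetical file repeating the front page, feed, and popular posts per the ratios above
shuf urls.txt -o urls.txt # GNU shuf reads all input before writing, so shuffling in place with -o is safe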
Aims to be realistic. Demo data is from devel-generate, provided by the devel module. Default values were used, except:
As well as the database dump, the static files generated by the above process (user images, images embedded in content) have also been included.
As above, aims to be realistic. Demo data is from the devel_generate module, and default values were used, except:
The structure is similar to the Drupal 7 target, except for the settings.php, services.yml, and setup.php files. The setup.php file is used to pre-populate the Twig template cache so that Repo Authoritative mode can be used. The upstream installation script provides an option to create demonstration data - this was used to create the database dump included here.
There are two unrealistic microbenchmarks:
Unrealistic microbenchmark: just the 'You have arrived' page from an empty installation.
Laravel 4 and 5 are both available.
The main page is the Barack Obama page from Wikipedia; this is based on the Wikimedia Foundation using it as a benchmark, and finding it fairly representative of Wikipedia. A few other pages (HHVM, talk, edit) are also loaded to provide a slightly more rounded workload.
perf.php can keep the suite running indefinitely:
hhvm perf.php --i-am-not-benchmarking --no-time-limit --wordpress --hhvm=$HHVM_BIN
You can then attach 'perf' or another profiler to the running HHVM or php-cgi process, once the 'benchmarking' phase has started.
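For example, attaching perf by hand might look like this; the PID lookup and 25-second duration are illustrative and assume a single hhvm process:
sudo perf record -g -p "$(pgrep -n hhvm)" -- sleep 25 # record call graphs from the running hhvm process until 'sleep 25' exits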
There is also a script to run perf for you at the appropriate moment:
hhvm perf.php --i-am-not-benchmarking --wordpress --hhvm=$HHVM_BIN --exec-after-warmup="./scripts/perf.sh -e cycles"
This will record 25 seconds of samples. To see where most time is spent you can dive into the data using perf, or use the perf rollup script as follows:
sudo perf script -F comm,ip,sym | hhvm -vEval.EnableHipHopSyntax=true <HHVM SRC>/hphp/tools/perf-rollup.php
In order to have all the symbols from the translation cache, you may need to change the owner of /tmp/perf-<PID of hhvm>.map to root.
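For example (the PID lookup is illustrative; adjust it if several hhvm processes are running):
sudo chown root "/tmp/perf-$(pgrep -n hhvm).map"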
TC-print uses data from perf to determine the hottest functions and translations. It supports a number of built-in perf counters. To capture all relevant counters, run the benchmark as follows. NOTE: perf.sh uses sudo, so look for the password prompt, or disable the prompt.
# Just cycles
hhvm perf.php --i-am-not-benchmarking --mediawiki --hhvm=$HHVM_BIN --exec-after-warmup="./scripts/perf.sh -e cycles" --tcprint
# All supported perf event types (Intel)
hhvm perf.php --i-am-not-benchmarking --mediawiki --hhvm=$HHVM_BIN --exec-after-warmup="./scripts/perf.sh -e cycles,branch-misses,L1-icache-misses,L1-dcache-misses,cache-misses,LLC-store-misses,iTLB-misses,dTLB-misses" --tcprint
# All supported perf event types (ARM doesn't have LLC-store-misses)
hhvm perf.php --i-am-not-benchmarking --mediawiki --hhvm=$HHVM_BIN --exec-after-warmup="./scripts/perf.sh -e cycles,branch-misses,L1-icache-misses,L1-dcache-misses,cache-misses,iTLB-misses,dTLB-misses" --tcprint
In order to have all the symbols from the translation cache, you may need to change the owner of /tmp/perf-<PID of hhvm>.map to root.
We process the perf data before passing it along to tc-print:
sudo perf script -f -F hw:comm,event,ip,sym | <HHVM SRC>/hphp/tools/perf-script-raw.py > processed-perf.data
If perf script is displaying additional fields, then re-run with -F <-field>,... to remove them:
sudo perf script -f -F -tid,-pid,-time,-cpu,-period -F hw:comm,event,ip,sym | <HHVM SRC>/hphp/tools/perf-script-raw.py > processed-perf.data
tc-print is only built if the appropriate disassembly tools are available. On x86 this is LibXed. Consider building hhvm using:
cmake . -DLibXed_INCLUDE_DIR=<path to xed include> -DLibXed_LIBRARY=<path to libxed.a>
Use tc-print with the generated perf.data:
<HHVM SRC>/hphp/tools/tc-print/tc-print -c /tmp/<TMP DIR FOR BENCHMARK RUN>/conf.hdf -p processed-perf.data
Please see CONTRIBUTING.md for details.