phacility / xhprof

XHProf is a function-level hierarchical profiler for PHP and has a simple HTML based user interface.
http://pecl.php.net/package/xhprof
Apache License 2.0

Too long TSC calibration #57

Open validname opened 9 years ago

validname commented 9 years ago

XHProf measures how many TSC ticks fit into a 5 ms period, and it does this for each core. That takes 120 ms on our 24-core server (24 cores × 5 ms), much longer than PHP spends executing the script! Is there any way to run this calibration only once per physical CPU instead of once per core?
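A rough sketch of the idea raised above: measure the tick rate over one 5 ms window and reuse the result, rather than repeating the window for every core. This is not XHProf's actual calibration code. The names `now_ns`, `read_ticks`, `calibrate_ticks_per_usec` and `calibration_cost_ms` are hypothetical, and sharing one result across cores assumes an invariant TSC that runs at the same rate on every core, which is common on modern x86 but not guaranteed. On non-x86 targets the sketch falls back to a nanosecond clock so it still compiles.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L
#endif
#include <stdint.h>
#include <time.h>

/* Monotonic wall clock in nanoseconds, used as the reference time base. */
static uint64_t now_ns(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

#if defined(__x86_64__) || defined(__i386__)
#include <x86intrin.h>
/* Read the CPU time stamp counter. */
static uint64_t read_ticks(void) { return __rdtsc(); }
#else
/* Assumption: no TSC available, so use nanoseconds as "ticks". */
static uint64_t read_ticks(void) { return now_ns(); }
#endif

/* Busy-wait for window_us microseconds and return ticks per microsecond. */
static double calibrate_ticks_per_usec(uint64_t window_us) {
    uint64_t t0 = now_ns();
    uint64_t c0 = read_ticks();
    uint64_t t1;
    do {
        t1 = now_ns();
    } while (t1 - t0 < window_us * 1000ull);
    uint64_t c1 = read_ticks();
    return (double)(c1 - c0) / ((double)(t1 - t0) / 1000.0);
}

/* Wall time spent calibrating when the window is repeated `runs` times. */
static unsigned calibration_cost_ms(unsigned runs, unsigned window_ms) {
    return runs * window_ms;
}
```

With these helpers, `calibration_cost_ms(24, 5)` reproduces the 120 ms reported for 24 cores, while a single shared call to `calibrate_ticks_per_usec(5000)` costs one 5 ms window. Calibrating once per physical CPU would sit in between; on a hypothetical two-socket machine it would take about 10 ms.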

beberlei commented 9 years ago

@validname See this fork of XHProf, where I implemented this: https://github.com/QafooLabs/php-profiler-extension

validname commented 9 years ago

@beberlei Thank you! Actually, I've already written several dirty patches to improve XHProf performance on Linux. But it still eats a lot of CPU when enabled, so I'm trying to find another profiler. Do you have any measurements of your profiler's overhead?

beberlei commented 9 years ago

@validname The overhead is comparable to XHProf because it is a fork, but we implemented a bunch of ways to improve it:

  1. Use 'functions' => array('PDO::exec', ...) as a whitelist of methods to profile.
  2. Use qafoolabs_layers_enable(array('PDO::exec' => 'db', 'curl_exec' => 'http')), which profiles only the listed functions and aggregates them by layer.
  3. Use the QAFOOPROFILER_FLAGS_NO_USERLAND or QAFOOPROFILER_FLAGS_NO_BUILTINS flags.

We are working on a slightly different approach to profiling that is much faster, but it is not nearly production ready yet.

Would you care to share your patches? I would be interested in the improvement ideas you had.

validname commented 9 years ago

@beberlei Sorry for the late reply. My patches are in my fork: https://github.com/validname/xhprof/commit/0ede447b38c9d8f842b90426b225768e7248f678 and https://github.com/validname/xhprof/commit/c50540d7337fdb3e7bdf015679701ceac350ae64. Both concern the performance of the PHP extension, since improving it was my primary goal. I'm not an experienced developer and haven't dealt with compiling cross-platform code, so my patches will probably work only with my particular combination of Linux/glibc and kernel. Still, we are using the patched XHProf extension at our company now and have no problems other than overhead: the profiler accounts for about 8-10% of average CPU usage out of 50% total CPU usage, which gives about 20% overhead. The original extension runs with 20-25% overhead.
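The overhead arithmetic above can be checked with a small calculation. This sketch assumes "overhead" means the profiler's share of all CPU time consumed, which is one reading of the figures; the function name `profiler_share_pct` is hypothetical.

```c
/* Profiler CPU usage as a percentage of total CPU usage.
 * Assumption: both inputs are percentages of the same machine capacity. */
static double profiler_share_pct(double profiler_cpu_pct, double total_cpu_pct) {
    return 100.0 * profiler_cpu_pct / total_cpu_pct;
}
```

Under that reading, 8% and 10% of CPU out of 50% total give 16% and 20% respectively, which matches the "about 20%" quoted for the patched extension.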