overtrue / phplint

:bug: A tool that can speed up linting of php files by running several lint processes at once.
MIT License
984 stars 118 forks source link

faster process linter #197

Closed staabm closed 9 months ago

staabm commented 10 months ago

New Feature

Summary

since php 8.3 the php -l commandline supports passing multiple file paths at once - see https://github.com/php/php-src/pull/10710. this is way faster as linting a project one file at a time has a huge process creation/shutdown overhead.

would it be possible to somehow make use of this capability so the process linting in phplint could get more efficient?

llaville commented 10 months ago

Hello @staabm

Of course it's possible to implement it ! v9.1 that will drop support of PHP 8.0, and is still under development may beneficit of this new behavior.

I'm still working on another project https://github.com/llaville/php-compatinfo-db about PHP 8.3, but I'll work on this feature ASAP

llaville commented 10 months ago

Release 9.1.0 will come with this feature and is scheduled for December 9, 2023 !

llaville commented 10 months ago

Coding phase is finished (not yet available on repo) because I want to polish it again, and add some tests.

But performance are there :

I've just found a regression on display results with progress meter with PHP 8.1 and 8.2 related to this new code. BTW, I'm confident to publish a release tomorrow !

staabm commented 10 months ago

that sounds perfect. no need to rush. thanks for working on it.

llaville commented 10 months ago

Code is now available on repo (branch 9.1) with commit 63fdaecd40738456bb3646bc86580a305480f4fc

Sorry to said that finally the official stable release 9.1.0 won't be available today : I want to play with https://github.com/phpbench/phpbench project (that I've discovered today)

My first tries between PHP 8.1.26 an PHP 8.3.0 gave something like :

overview-9

We have a memory gain, and a little execution time with only the default options (--jobs=5)

Native run : time bin/phplint --no-cache --no-configuration vendor-bin/phpunit/vendor/phpunit/phpunit/src/ -j5 Gave me on PHP 8.3.0 platform with xdebug extension 3.3.0

Time: 8 secs, Memory: 10.0 MiB, Cache: 0 hit, 873 misses, Processes: 175

 [OK] 873 files

real    0m8.356s
user    0m12.096s
sys     0m4.067s

NOTE : Calc of Processes used is : total files / jobs ( 873 / 5 = 175 )

Now if you increase number of jobs in PHP 8.3 to run lint files all in once --jobs=873

Time: 1 sec, Memory: 10.0 MiB, Cache: 0 hit, 873 misses, Process: 1

 [OK] 873 files

real    0m1.129s
user    0m1.342s
sys     0m0.150s

The PHPBench overview report (chunk of it as screenshot) was generated with command :
phpbench run tests/Benchmark/ --report=overview --report=aggregate --ref=php81 --tag=php83

Code is not yet on repo ; Here are code (not yet official)

phpbench.json

 {
  "runner.bootstrap": "vendor/autoload.php"
}

tests/Benchmarch/LintBench.php

<?php

declare(strict_types=1);

/*
 * This file is part of the overtrue/phplint package
 *
 * (c) overtrue
 *
 * This source file is subject to the MIT license that is bundled
 * with this source code in the file LICENSE.
 */

namespace Overtrue\PHPLint\Tests\Benchmark;

use Overtrue\PHPLint\Command\LintCommand;
use Overtrue\PHPLint\Configuration\ConsoleOptionsResolver;
use Overtrue\PHPLint\Event\EventDispatcher;
use Overtrue\PHPLint\Finder;
use Overtrue\PHPLint\Linter;
use PhpBench\Attributes as Bench;
use Symfony\Component\Console\Input\ArrayInput;
use Throwable;

/**
 * @author Laurent Laville
 * @since Release 9.1.0
 */

class LintBench
{
    /**
     * @throws Throwable
     */
    #[Bench\OutputTimeUnit('milliseconds')]
    #[Bench\Iterations(5)]
    public function benchLintCommand(): void
    {
        $dispatcher = new EventDispatcher([]);

        $arguments = [
            'path' => [\dirname(__DIR__, 2) . '/vendor-bin/phpunit/vendor/phpunit/phpunit/src'],
            '--no-configuration' => true,
            '--no-cache' => true,
        ];
        $command = new LintCommand($dispatcher);
        $input = new ArrayInput($arguments, $command->getDefinition());
        $configResolver = new ConsoleOptionsResolver($input);

        $finder = new Finder($configResolver);
        $linter = new Linter($configResolver, $dispatcher);

        $linter->lintFiles($finder->getFiles());
    }
}
staabm commented 10 months ago

first let me say thank you for working on it.

Calc of Processes used is : total files / jobs ( 873 / 5 = 175 )

I think we could game this number a bit. 10 files per job should boost the perf a lot more

staabm commented 10 months ago

what do you think of https://gist.github.com/staabm/ec764d9476bbefe55fcda7744951ef62 ?

with this impl I get PHP 8.3

php bin/phplint vendor/ src examples --no-cache  2.24s user 1.33s system 553% cpu 0.646 total

with this impl I get PHP 8.2

php bin/phplint vendor/ src examples --no-cache  22.99s user 12.46s system 753% cpu 4.704 total

which looks like a factor of 7-8x :-)

llaville commented 10 months ago

I've not really explains how its works, and I understand why you tried with : https://gist.github.com/staabm/ec764d9476bbefe55fcda7744951ef62#file-linter-php-L124

We continue to use the --jobs option (default = 5) for all PHP versions.

While PHP 8.1 and 8.2 are not able to lint more than a file at same time, by using --jobs=10 (i.e) we authorize phplint to run 10 processes at same time with each one file to lint.

With PHP 8.3, it's a bit different, now we cant lint more than one file at same time. So if you specify --jobs=10 PHPLint will use only one Process with 10 files.

That is the reason why it runs faster if you know number of files to analyse, and you apply the right value with --jobs option.

On my use case the source code contains 873 files to analyse. Then with PHP 8.3 if I specify --jobs=873, I will analyse all files with one process.

See by example :

  1. time bin/phplint --no-cache --no-configuration --no-progress --jobs=10 vendor-bin/phpunit/vendor/phpunit/phpunit/src/
phplint 9.1.0-dev by overtrue and contributors.

Runtime       : PHP 8.3.0
Configuration : No config file loaded

Time: 4 secs, Memory: 10.0 MiB, Cache: 0 hit, 873 misses, Processes: 88

 [OK] 873 files

real    0m4.169s
user    0m5.983s
sys     0m2.070s
  1. time bin/phplint --no-cache --no-configuration --no-progress --jobs=873 vendor-bin/phpunit/vendor/phpunit/phpunit/src/
phplint 9.1.0-dev by overtrue and contributors.

Runtime       : PHP 8.3.0
Configuration : No config file loaded

Time: < 1 sec, Memory: 10.0 MiB, Cache: 0 hit, 873 misses, Process: 1

 [OK] 873 files

real    0m1.051s
user    0m1.295s
sys     0m0.131s
staabm commented 10 months ago

Thanks for the details.

With PHP 8.3, it's a bit different, now we cant lint more than one file at same time. So if you specify --jobs=10 PHPLint will use only one Process with 10 files.

I wonder why you only want a single process on PHP 8.3.

In my implementation we still use parallel processes and lint 10 files at once, which seems to be faster in comparison to the Fiber approach..?

llaville commented 10 months ago

Thanks for your code review !

I'll think again about it, and especially because I want to play again with PHPBench framework.

9.1.0 is still in dev and I really want a better version, so I'll take time to think and change code again if necessary.

staabm commented 10 months ago

Take your time.

Would it help you if I open a PR with my suggested change above?

llaville commented 10 months ago

Take your time.

Would it help you if I open a PR with my suggested change above?

Thanks but not really. Your gist file is enough !

I'm currently working on the benchmark tests implementation before to change the linter code

llaville commented 9 months ago

I've reached today, I hope and I think, a good milestone in new faster linter implementation.

After I first approach (see commit https://github.com/overtrue/phplint/commit/63fdaecd40738456bb3646bc86580a305480f4fc), and contribution by @staabm, I'm able to propose now a new implementation.

Results on my laptop :

I was able to produce PHPBench following reports, with two commands :

phplint_ phpbench_time html

phplint_ phpbench_memory html

phplint_ phpbench_table html

llaville commented 9 months ago

Code is available on repo (branch 9.1).

Verbose level 2 and 3 allow to debug process launched and ran with the new ProcessHelper component !

llaville commented 9 months ago

Additional Info: using --jobs option still try to run X jobs in parallel, but as a little difference with PHP 8.3

For example :

That means on PHP 8.3 we can lint 25 files, 100 files if all processes may be launched in parallel, while on PHP 8.1, we continue to lint 5 files, 10 files if all processes may be launched in parallel.

llaville commented 9 months ago

(@staabm, /cc @overtrue) Feedbacks about this second implementation are welcome ! Especially before an official release of version 9.1.0

staabm commented 9 months ago

I don't know all the internals and can't oversee the commits (having a PR would be easier for review).

A quick glance looked good so far.

the perf numbers looks sensible and like a solid improvement. I am looking forward to this since opening the initial issue on php-src :)

thank you!

llaville commented 9 months ago

Have a good new !

I was able to write an run a GitHub Workflow that generated PHPBench reports.

Here is a preview on my private repo :

Benchmark-GA

And new results when xdebug is not loaded are better, especially on PHP 8.3

PHPBench-Report_GA html

@overtrue As the Benchmark GitHub Workflow take a long time to be generated, I want your agreement first to add it to this repo. Or not if you don't think it's acceptable !

overtrue commented 9 months ago

@llaville Great work! just do it!

llaville commented 9 months ago

@overtrue Available on repo with commit https://github.com/overtrue/phplint/commit/9937b727d65a88909cae5b3ea5b8749813f7fa63

llaville commented 9 months ago

With a little clean-up, final version is : https://github.com/overtrue/phplint/blob/9.1/.github/workflows/benchmark.yml

We can run it, once branch 9.1 will be merged to main (workflow_dispatch limitation).

llaville commented 9 months ago

@overtrue FYI: I've put the benchmark workflow on main branch and run it to produce results.

Go to https://github.com/overtrue/phplint/actions/runs/7218594019 and download the PHPBench-Report. It's a zip file that contains only one file (index.html) HTML PHPBench report as you've seen previous contains only in screenshots

PS: it ran faster than in my private repo (13 min vs 23 min)