`SlevomatCodingStandard\Helpers\TokenHelper#findPrevious()` is very expensive, and chokes on files with many `use` statements

Ocramius commented 5 months ago

I just analyzed a project where a single file with tons of `use` statements makes PHPCS slow down quadratically.

Specifically, `SlevomatCodingStandard\Helpers\TokenHelper#findPrevious()` (itself relatively lightweight) is used in:

UseFromSameNamespaceSniff
MultipleUsesPerLineSniff
UseDoesNotStartWithBackslashSniff

Along with other code locations, such as the SniffLocalCache and UseStatementHelper.

It seems like `PHP_CodeSniffer\Files\File#findPrevious()` is extremely slow and complex, and iterates over all the tokens of a file starting from the end. This is obviously problematic as the source file grows and we are analyzing many `use` statements.
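To make the quadratic growth concrete, here is a rough, language-neutral model in Python. It is only a sketch, not PHPCS code: the `find_previous` function, the token names, and the three-tokens-per-`use` layout are all assumptions made for illustration. It assumes that each `use` token triggers a backward scan for a token that never appears, so every scan walks back to the start of the file.

```python
def find_previous(tokens, wanted, start):
    """Scan backward from `start` for a token in `wanted`.

    Returns (index or None, number of tokens inspected).
    Hypothetical stand-in for a findPrevious()-style lookup.
    """
    steps = 0
    for i in range(start, -1, -1):
        steps += 1
        if tokens[i] in wanted:
            return i, steps
    return None, steps


def total_steps(use_count):
    """Count tokens inspected when every `use` triggers one backward scan."""
    # Hypothetical file: an open tag followed by `use_count` use statements,
    # each modelled as three tokens.
    tokens = ["T_OPEN_TAG"] + ["T_USE", "T_STRING", "T_SEMICOLON"] * use_count
    total = 0
    for pointer, token in enumerate(tokens):
        if token == "T_USE":
            # The wanted tokens are absent, so the scan reaches index 0.
            _, steps = find_previous(tokens, {"T_CLASS", "T_TRAIT"}, pointer - 1)
            total += steps
    return total


if __name__ == "__main__":
    for n in (100, 200, 400):
        print(n, total_steps(n))
```

In this model, doubling the number of `use` statements roughly quadruples the number of tokens inspected, which matches the shape of the slowdown described above.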

I'm wondering whether we should try to optimize the TokenHelper in this project, by minimizing #findPrevious() calls, or PHP_CodeSniffer itself, by optimizing the main loop there :thinking:
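One way to picture "minimizing #findPrevious() calls" is a single forward pass that records, for every token, the nearest earlier token of interest, so later lookups need no scan at all. The Python sketch below is purely illustrative and is not taken from this project or from PHP_CodeSniffer: `last_seen_before` and the token names are hypothetical.

```python
def last_seen_before(tokens, wanted):
    """For each index, return the index of the nearest earlier token
    whose type is in `wanted`, or None if there is none.

    One forward pass over the tokens replaces one backward scan per lookup.
    """
    result = []
    last = None
    for i, token in enumerate(tokens):
        result.append(last)
        if token in wanted:
            last = i
    return result


if __name__ == "__main__":
    tokens = ["T_OPEN_TAG", "T_USE", "T_CLASS", "T_USE"]
    print(last_seen_before(tokens, {"T_CLASS", "T_TRAIT"}))
```

The trade-off is one extra list per file in exchange for constant-time lookups, which only pays off when many sniffs ask the same question about the same tokens.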

/cc @MatteoBiagini

Here's a rough profile screenshot, to make this a bit more visible:

[Image: phpcs-xdebug-profile-sorted-by-self-execution-time]

kukulich commented 5 months ago

It's probably possible to find some optimization here for isTraitUse(). However, I think the best solution would be to split the T_USE token into three tokens in PHPCS...

Ocramius commented 5 months ago

That would be a major-major BC break, no? :D

kukulich commented 4 months ago

I've just tried to optimize it in https://github.com/slevomat/coding-standard/commit/5cac9915560ebfc39b0f92272de591ad01bbc5cc

Ocramius commented 4 months ago

Awesome! Happy to report once I get a renovate downstream update 💪

slevomat / coding-standard

`SlevomatCodingStandard\Helpers\TokenHelper#findPrevious()` is very expensive, and chokes on files with many `use` statements #1657