PHPCSStandards / PHP_CodeSniffer

PHP_CodeSniffer tokenizes PHP files and detects violations of a defined set of coding standards.
BSD 3-Clause "New" or "Revised" License
805 stars 47 forks source link

Array destructuring in foreach - conflict between SpaceBeforeComma & NoSpaceAfterComma #527

Open VladaHejda opened 1 month ago

VladaHejda commented 1 month ago

Describe the bug

Destructuring an array for foreach - foreach ($data as [, , $value]) { - violates the Squiz.Arrays.ArrayDeclaration.SpaceBeforeComma ("Expected 0 spaces between "," and comma; 1 found") But when "fixed" - foreach ($data as [,, $value]) { - the Squiz.Arrays.ArrayDeclaration.NoSpaceAfterComma is violated ("Expected 1 space between comma and ","; 0 found") So this code cannot be written while using both sniffs together.

Code sample

class Foo
{
    /**
     * @param list<array{int, int, int}> $data
     */
    public function bar(array $data): int
    {
        $sum = 0;
        foreach ($data as [, , $value]) {
            $sum += $value;
        }

        return $sum;
    }
}

Versions

Operating System Windows 11
PHP version 8.3
PHP_CodeSniffer version 3.10.1
Install type Composer
jrfnl commented 1 month ago

@VladaHejda Thank you for reporting this. This is due to the sniff incorrectly acting on short lists.

The Squiz.Arrays.ArrayDeclaration sniff is known to be problematic and has been for years.

I've been building up a (highly configurable) NormalizedArrays standard in PHPCSExtra to replace it, but that's not complete yet. Would be lovely if I could find some time to finish that at some point.

This specific case does sound like a relatively simple fix though, so if someone would submit a PR with a small/simple fix for this specific issue, I will accept it, but other than that, this sniff is out of bounds for large fixes/rewrites as it is just not worth the time.

rodrigoprimo commented 3 weeks ago

This specific case does sound like a relatively simple fix though, so if someone would submit a PR with a small/simple fix for this specific issue, I will accept it, but other than that, this sniff is out of bounds for large fixes/rewrites as it is just not worth the time.

@jrfnl, I spent some time checking what is causing this issue and how to fix it. Initially, I thought the idea was to bail early when the T_OPEN_SHORT_ARRAY token represents a short list and not an array. But it seems that doing that without #12 or IsShortArrayOrListWithCache::isShortArray() from PHPCSUtils is not trivial.

If the above is correct and I'm not missing something, the other solution that occurred to me is to bail early when the T_OPEN_SHORT_ARRAY token is inside a foreach statement. I believe this means that the token represents a short list, as I cannot think of a scenario where it would be an array. I can think of two approaches to verify this:

Is this more or less what you had in mind?

Thanks!

jrfnl commented 3 weeks ago

Initially, I thought the idea was to bail early when the T_OPEN_SHORT_ARRAY token represents a short list and not an array. But it seems that doing that without #12 or IsShortArrayOrListWithCache::isShortArray() from PHPCSUtils is not trivial.

Correct. Filtering out short lists completely for this sniff is not on the table due to the complexity.

... the other solution that occurred to me is to bail early when the T_OPEN_SHORT_ARRAY token is inside a foreach statement. I believe this means that the token represents a short list, as I cannot think of a scenario where it would be an array.

That was the idea yes, but there is a scenario when a T_OPEN_SHORT_ARRAY could still be an array:

foreach ([$a, $b] as $c) {}

I can think of two approaches to verify this:

  • Check if the non-empty token before the T_OPEN_SHORT_ARRAY is T_AS.
  • Check if the T_OPEN_SHORT_ARRAY has the nested_parenthesis array and then check if the parenthesis owner of the last parenthesis is a T_FOREACH token (I still have to check if it is the last or the first element of the nested_parenthesis array that should be used in this case).

Sort of... The second check should come first. The first check after that, but is incorrect/not precise enough as outlined above. There may still be a key between the short list and the as keyword, i.e. https://3v4l.org/b5qm7, so you need to find the stack position of the as keyword between the open/close parentheses of the foreach and then you'll need to compare the position of the T_OPEN_SHORT_ARRAY token with the position of the T_AS.

Note: even with this additional check, this can still lead to false positives/negatives (short list as a value in an array before the as / short array as a list key in a short list after the as), but it should cover the most commonly used code pattern sufficiently.

Does this help ?