vimeo / psalm

A static analysis tool for finding errors in PHP applications
https://psalm.dev
MIT License
5.56k stars 660 forks source link

[Severe] psalm type depends on if function is assigned to variable or nested, e.g. array_filter #8905

Open kkmuffme opened 1 year ago

kkmuffme commented 1 year ago

https://psalm.dev/r/38951b0c35

The trace should be same in both cases (the 2nd one is correct, the first one is wrong).

I am not sure if this is limited to array_filter or a generic bug, but it seems other functions are affected too.

This is a pretty huge issue, since the types are completely wrong in very basic code.

psalm-github-bot[bot] commented 1 year ago

I found these snippets:

https://psalm.dev/r/38951b0c35 ```php $bar */ $strings = array_filter( array_keys( $bar ), 'is_string' ); /** @psalm-trace $strings */; $keys = array_keys( $bar ); $strings = array_filter( $keys, 'is_string' ); /** @psalm-trace $strings */; ``` ``` Psalm output (using commit ef02ded): INFO: Trace - 7:29 - $strings: array INFO: Trace - 11:29 - $strings: array ```
orklah commented 1 year ago

This is a known limitation of Psalm currently.

Psalm can only apply assertions (restrict types based on a constraint) on symbols that have a VarId, think variables, properties, constants, some pure functions (https://github.com/vimeo/psalm/blob/3a298d028e949516920cdd54204cf0d98066d14f/src/Psalm/Internal/Analyzer/Statements/Expression/ExpressionIdentifier.php#L101)

Right now, it's unable to do assertions on array_keys( $bar ), but it can on $keys.

In fact, this is working well: https://psalm.dev/r/3f6e548fbb

psalm-github-bot[bot] commented 1 year ago

I found these snippets:

https://psalm.dev/r/3f6e548fbb ```php $bar */ $strings = array_filter( $keys = array_keys( $bar ), 'is_string' ); /** @psalm-trace $strings */; ``` ``` Psalm output (using commit ef02ded): INFO: Trace - 7:29 - $strings: array ```
weirdan commented 1 year ago

@orklah that's not the problem @kkmuffme reported. The return type of array_filter() changes depending on whether we operate on a variable or an expression - and that's the problem.

orklah commented 1 year ago

Yeah, but the cause of the problem is that Psalm can't process the is_string on the expression because it doesn't have a VarId.

The $assertions here is empty only for the function call: https://github.com/vimeo/psalm/blob/d63da1f66e87ef3e1e8b54fa5bfc31427bcdad1c/src/Psalm/Internal/Provider/ReturnTypeProvider/ArrayFilterReturnTypeProvider.php#L193

and also the extendedVarId just after.

Psalm would need both to be able to calculate the result of array_filter

weirdan commented 1 year ago

Psalm would need both to be able to calculate the result of array_filter

This means we're mixing concerns. The assertion on the parameter cannot be generated when the parameter is an expression. It's ok. But the parameter type is known, so we should be able to calculate a more precise return type.

This is a pretty huge issue, since the types are completely wrong in very basic code.

The types are not completely wrong (they do not contradict any value that this code might generate at runtime), they are just not as tight as they could be.

orklah commented 1 year ago

This means we're mixing concerns. The assertion on the parameter cannot be generated when the parameter is an expression. It's ok. But the parameter type is known, so we should be able to calculate a more precise return type.

Yeah, but that's the same motor behind if you can't reconcile string|int with is_string with then you can't get rid of int in the return type of array_filter

I'm not sure if I'm clear, what I'm saying is that improving the assertion system will make this error go away pretty much naturally

danog commented 1 year ago

This issue is generally a lot worse than what is described here, the inability to make assertions about anything other than variables, array accesses and simple method calls (iff they're passed on parameters and they're pure or the memoize config option is enabled) means even basic literal paradoxes can't be detected: https://psalm.dev/r/1933835130

A fix would require adapting getExtendedVarId to support generating keys for all expressions, not just array access and pure method calls.

At the very least, it should be adapted to also support pure function calls.

psalm-github-bot[bot] commented 1 year ago

I found these snippets:

https://psalm.dev/r/1933835130 ```php
danog commented 1 year ago

Consequently, descendant assertions removal logic should also be adapted to something a bit more precise than the current preg_match on variable names of variables/the root of an array access

danog commented 1 year ago

I discussed this with @orklah a while ago, a possible alternative solution I proposed was storing types directly in the AST objects instead of vars_in_scope, keeping a (copy-on-write?) stack of types in each statement that mimics the current logic that clones the vars_in_scope array every time the context is cloned.

kkmuffme commented 1 year ago

A fix would require adapting getExtendedVarId to support generating keys for all expressions, not just array access and pure method calls.

Which would be the "simple" temporary fix, I guess - just calculating a hash of the array_keys( ... ) string (+ maybe some additional identifiers to not get false positives) and using that as variable. Which is essentially like doing this (which fixes the problem): https://psalm.dev/r/37d8944e8f

$strings = array_filter( $temp = array_keys( $bar ), 'is_string' );
psalm-github-bot[bot] commented 1 year ago

I found these snippets:

https://psalm.dev/r/37d8944e8f ```php $bar */ $strings = array_filter( array_keys( $bar ), 'is_string' ); /** @psalm-trace $strings */; $strings = array_filter( $temp = array_keys( $bar ), 'is_string' ); /** @psalm-trace $strings */; $keys = array_keys( $bar ); $strings = array_filter( $keys, 'is_string' ); /** @psalm-trace $strings */; ``` ``` Psalm output (using commit ef02ded): INFO: Trace - 7:29 - $strings: array INFO: Trace - 10:29 - $strings: array INFO: Trace - 14:29 - $strings: array ```