vimeo / psalm

A static analysis tool for finding errors in PHP applications
https://psalm.dev
MIT License
5.49k stars 655 forks source link

glob wrong return type #9820

Closed kkmuffme closed 1 year ago

kkmuffme commented 1 year ago

https://psalm.dev/r/726b2ba5b5

Result is not list<non-empty-string> when the argument is empty string and no flags are present. (but should be array<never, never>) Glob needs a return type provider added to ensure the "non-empty-string" is not lost when a valid pattern is passed.

Bug added in https://github.com/vimeo/psalm/pull/9814 @Hanmac

psalm-github-bot[bot] commented 1 year ago

I found these snippets:

https://psalm.dev/r/726b2ba5b5 ```php INFO: UnusedVariable - 3:1 - $result is never referenced or the value is not used ```
Hanmac commented 1 year ago

Wrong, i didn't touch the return type

the Bug was added by your PR first

YOU need to make a Type Provider that still allows for empty string pattern

kkmuffme commented 1 year ago

The return type should be array<never, never>|false when you pass in an empty string though.

VincentLanglet commented 1 year ago

There is nothing related to the PR you linked, since

It's kinda tiring to see you crying about bug on three different thread with wrong understanding of bug and feature when you took bad shortcuts though.

Feel free to add a returntype provider if you need one, but currently the return type is perfectly valid.

psalm-github-bot[bot] commented 1 year ago

I found these snippets:

https://psalm.dev/r/422a56fc89 ```php INFO: UnusedVariable - 3:1 - $result is never referenced or the value is not used ```
kkmuffme commented 1 year ago

the false return type does also exist with a non empty string https://psalm.dev/r/422a56fc89

The "false" is not the issue

an empty array is a list with no elements. It's just not a non-empty-list.

In case of an empty string glob we can infer that the array is always empty (if there is no GLOB_NOCHECK flag)

psalm-github-bot[bot] commented 1 year ago

I found these snippets:

https://psalm.dev/r/422a56fc89 ```php INFO: UnusedVariable - 3:1 - $result is never referenced or the value is not used ```
Hanmac commented 1 year ago

The return type should be array<never, never>|false when you pass in an empty string though.

are you 100% sure that glob("", flags) is even able to return false? you probably need to check the source for it when you are doing your type provider

and yeah, you need to check for GLOB_NOCHECK

kkmuffme commented 1 year ago

This is what I just said a minute before you comment https://github.com/vimeo/psalm/issues/9820#issuecomment-1562809399

Hanmac commented 1 year ago

a little bit sandboxing:

https://onlinephp.io/c/88e1a

VincentLanglet commented 1 year ago

an empty array is a list with no elements. It's just not a non-empty-list.

In case of an empty string glob we can infer that the array is always empty (if there is no GLOB_NOCHECK flag)

So it's an improvement => feature. There is no bug currently. Just something which can be improved with a return type provider.

Therefore

Bug added in #9814 @Hanmac

was just unnecessary false accusation.

kkmuffme commented 1 year ago

Yes, this should be handled in the return type provider if you want to take it exactly. Honestly, a simple return type provider without checking the GLOB_NOCHECK would be good enough though.

kkmuffme commented 1 year ago

Yes, it's a bug: https://psalm.dev/r/4e94df6823 This array can NEVER have non-empty-string values, but the trace has them.

psalm-github-bot[bot] commented 1 year ago

I found these snippets:

https://psalm.dev/r/4e94df6823 ```php |non-empty-string> ```
othercorey commented 1 year ago

The special-case return types are a simple limitation of the callmap system.

The most recent change seems to align with the input allowed by php.

The callmaps do sometimes ignore very unlikely output (and maybe input), but not sure this falls into that category.

orklah commented 1 year ago

Hey guys, calm down a bit!

It's an old debate whether it's better to create a few false positive to prevent a lot of errors.

However, if the false positives are enough to make user raise issue in the few days following a change, we might have gone a bit too far.

So I'd rather go back to the initial state where Psalm does not raise errors that will confuse people first. If we can improve on that, even better

Please don't fight about this (frankly, I don't have time to read pages of debate to extract the bits of relevant information)

ygottschalk commented 1 year ago

How about adding glob to psalm/stubs/CoreGenericFunctions.phpstub?

Considering PHP-Versions 8.2.0 - 8.2.6, 7.4.33, 7.0.0 - 7.0.33, 7.1.0 - 7.1.33, 7.2.0 - 7.2.34, 7.3.0 - 7.3.33, 7.4.0 - 7.4.32, 8.0.0 - 8.0.28, 8.1.0 - 8.1.19 I would propose: https://psalm.dev/r/258bab2644 (note that array<never, never> should be used instead of list<never>, but: #9827)

With that stub and the fix for the regression (see https://psalm.dev/r/1db5aa2914) the callmap should look like:

'glob' => ['false|list{0?:string, ...<non-empty-string>}', 'pattern'=>'string', 'flags='=>'int-mask<GLOB_MARK, GLOB_NOSORT, GLOB_NOCHECK, GLOB_NOESCAPE, GLOB_BRACE, GLOB_ONLYDIR, GLOB_ERR>']

Is that a definition everyone agrees on?

I personally must say, that I do not agree with the assumption, that there are no empty pathnames. On some exotic system there might be a glob implementation in libc that does return an empty string as a pathname. If there might be the slightest chance of having a system which considers an empty pathname as valid (maybe even as a special pathname idk) meaning that glob('') and glob('*') could return an array containing an empty string, we should use this stub: https://psalm.dev/r/ce62bf2569 and this callmap:

'glob' => ['false|list<string>', 'pattern'=>'string', 'flags='=>'int-mask<GLOB_MARK, GLOB_NOSORT, GLOB_NOCHECK, GLOB_NOESCAPE, GLOB_BRACE, GLOB_ONLYDIR, GLOB_ERR>']

Until we all agree on a better type, I would kindly ask @orklah to revert the PR #9752 (and the improvement PR #9814 which did fixes some issues with the newly introduced typing but not all).

@kkmuffme mind creating a PR with the stub typing suggested (given you agree with that typing). Best would bea PR with the second version of the stub, because that is less strict and IMO better less strict than wrong.

We could vote in that PR for which version to choose, either accepting there might be empty-strings or assuming that will never happen.

psalm-github-bot[bot] commented 1 year ago

I found these snippets:

https://psalm.dev/r/258bab2644 ```php * @psalm-param P $_pattern * @psalm-param F $_flags * @psalm-return ( * P is '' * ? (F is int-mask * ? false|list * : false|list{0:''} * ) * : (P is non-empty-string * ? (F is int-mask * ? false|list * : false|list{0:non-empty-string, ...} * ) * : (F is int-mask * ? false|list{0?:string, ...} * : false|list{0:string, ...} * ) * ) * ) * */ function glob_ (string $_pattern, int $_flags = 0): array|false { return false; } /** @var int-mask */ $_maybeFlag = 0; $_emptyNoFlag = glob_( '' ); $_emptyNoCheckFlag1 = glob_( '', GLOB_MARK ); $_emptyNoCheckFlag2 = glob_( '' , GLOB_NOSORT | GLOB_NOESCAPE); $_emptyNoCheckFlag3 = glob_( '' , GLOB_BRACE | GLOB_ONLYDIR | GLOB_ERR); $_emptyCheckFlag1 = glob_( '' , GLOB_NOCHECK); $_emptyCheckFlag2 = glob_( '' , GLOB_NOCHECK | GLOB_MARK); $_emptyCheckFlag3 = glob_( '' , GLOB_NOCHECK | GLOB_BRACE | GLOB_ONLYDIR | GLOB_ERR | GLOB_NOSORT | GLOB_NOESCAPE); $_emptyMaybeCheckFlag = glob_( '' , $_maybeFlag); $_nonEmptyNoFlag = glob_( 'a' ); $_nonEmptyNoCheckFlag1 = glob_( 'a' , GLOB_MARK ); $_nonEmptyNoCheckFlag2 = glob_( 'a' , GLOB_NOSORT | GLOB_NOESCAPE); $_nonEmptyNoCheckFlag3 = glob_( 'a' , GLOB_BRACE | GLOB_ONLYDIR | GLOB_ERR); $_nonEmptyCheckFlag1 = glob_( 'a', GLOB_NOCHECK); $_nonEmptyCheckFlag2 = glob_( 'a' , GLOB_NOCHECK | GLOB_MARK); $_nonEmptyCheckFlag3 = glob_( 'a' , GLOB_NOCHECK | GLOB_BRACE | GLOB_ONLYDIR | GLOB_ERR | GLOB_NOSORT | GLOB_NOESCAPE); $_nonEmptyMaybeCheckFlag = glob_( 'a' , $_maybeFlag); /** @var string */ $string = ''; $_stringNoFlag = glob_( $string ); $_stringNoCheckFlag1 = glob_( $string , GLOB_MARK ); $_stringNoCheckFlag2 = glob_( $string , GLOB_NOSORT | GLOB_NOESCAPE); $_stringNoCheckFlag3 = glob_( $string , GLOB_BRACE | GLOB_ONLYDIR | GLOB_ERR); $_stringCheckFlag1 = glob_( $string , GLOB_NOCHECK); $_stringCheckFlag2 = glob_( $string , GLOB_NOCHECK | GLOB_MARK); $_stringCheckFlag3 = glob_( $string , GLOB_NOCHECK | GLOB_BRACE | GLOB_ONLYDIR | GLOB_ERR | GLOB_NOSORT | GLOB_NOESCAPE); $_stringMaybeCheckFlag = glob_( $string , $_maybeFlag); /** @psalm-trace $_emptyNoFlag $_emptyNoCheckFlag1 $_emptyNoCheckFlag2 $_emptyNoCheckFlag3 $_emptyCheckFlag1 $_emptyCheckFlag2 $_emptyCheckFlag3 $_emptyMaybeCheckFlag $_nonEmptyNoFlag $_nonEmptyNoCheckFlag1 $_nonEmptyNoCheckFlag2 $_nonEmptyNoCheckFlag3 $_nonEmptyCheckFlag1 $_nonEmptyCheckFlag2 $_nonEmptyCheckFlag3 $_nonEmptyMaybeCheckFlag $_stringNoFlag $_stringNoCheckFlag1 $_stringNoCheckFlag2 $_stringNoCheckFlag3 $_stringCheckFlag1 $_stringCheckFlag2 $_stringCheckFlag3 $_stringMaybeCheckFlag */ ``` ``` Psalm output (using commit b99857c): INFO: Trace - 67:161 - $_emptyNoFlag: false|list INFO: Trace - 67:161 - $_emptyNoCheckFlag1: false|list INFO: Trace - 67:161 - $_emptyNoCheckFlag2: false|list INFO: Trace - 67:161 - $_emptyNoCheckFlag3: false|list INFO: Trace - 67:161 - $_emptyCheckFlag1: false|list{''} INFO: Trace - 67:161 - $_emptyCheckFlag2: false|list{''} INFO: Trace - 67:161 - $_emptyCheckFlag3: false|list{''} INFO: Trace - 67:161 - $_emptyMaybeCheckFlag: false|list{0?: '', ...} INFO: Trace - 67:161 - $_nonEmptyNoFlag: false|list INFO: Trace - 67:161 - $_nonEmptyNoCheckFlag1: false|list INFO: Trace - 67:161 - $_nonEmptyNoCheckFlag2: false|list INFO: Trace - 67:161 - $_nonEmptyNoCheckFlag3: false|list INFO: Trace - 67:161 - $_nonEmptyCheckFlag1: false|non-empty-list INFO: Trace - 67:161 - $_nonEmptyCheckFlag2: false|non-empty-list INFO: Trace - 67:161 - $_nonEmptyCheckFlag3: false|non-empty-list INFO: Trace - 67:161 - $_nonEmptyMaybeCheckFlag: false|list INFO: Trace - 67:161 - $_stringNoFlag: false|list{0?: string, ...} INFO: Trace - 67:161 - $_stringNoCheckFlag1: false|list{0?: string, ...} INFO: Trace - 67:161 - $_stringNoCheckFlag2: false|list{0?: string, ...} INFO: Trace - 67:161 - $_stringNoCheckFlag3: false|list{0?: string, ...} INFO: Trace - 67:161 - $_stringCheckFlag1: false|list{string, ...} INFO: Trace - 67:161 - $_stringCheckFlag2: false|list{string, ...} INFO: Trace - 67:161 - $_stringCheckFlag3: false|list{string, ...} INFO: Trace - 67:161 - $_stringMaybeCheckFlag: false|list{0?: string, ...} ```
https://psalm.dev/r/1db5aa2914 ```php , but 0 provided ```
https://psalm.dev/r/ce62bf2569 ```php * @psalm-param P $_pattern * @psalm-param F $_flags * @psalm-return ( * P is '' * ? (F is int-mask * ? false|list{0?:''} * : false|list{0:''} * ) * : (F is int-mask * ? false|list * : false|non-empty-list * ) * ) * */ function glob_ (string $_pattern, int $_flags = 0): array|false { return false; } /** @var int-mask */ $_maybeFlag = 0; $_emptyNoFlag = glob_( '' ); $_emptyNoCheckFlag1 = glob_( '', GLOB_MARK ); $_emptyNoCheckFlag2 = glob_( '' , GLOB_NOSORT | GLOB_NOESCAPE); $_emptyNoCheckFlag3 = glob_( '' , GLOB_BRACE | GLOB_ONLYDIR | GLOB_ERR); $_emptyCheckFlag1 = glob_( '' , GLOB_NOCHECK); $_emptyCheckFlag2 = glob_( '' , GLOB_NOCHECK | GLOB_MARK); $_emptyCheckFlag3 = glob_( '' , GLOB_NOCHECK | GLOB_BRACE | GLOB_ONLYDIR | GLOB_ERR | GLOB_NOSORT | GLOB_NOESCAPE); $_emptyMaybeCheckFlag = glob_( '' , $_maybeFlag); $_nonEmptyNoFlag = glob_( 'a' ); $_nonEmptyNoCheckFlag1 = glob_( 'a' , GLOB_MARK ); $_nonEmptyNoCheckFlag2 = glob_( 'a' , GLOB_NOSORT | GLOB_NOESCAPE); $_nonEmptyNoCheckFlag3 = glob_( 'a' , GLOB_BRACE | GLOB_ONLYDIR | GLOB_ERR); $_nonEmptyCheckFlag1 = glob_( 'a', GLOB_NOCHECK); $_nonEmptyCheckFlag2 = glob_( 'a' , GLOB_NOCHECK | GLOB_MARK); $_nonEmptyCheckFlag3 = glob_( 'a' , GLOB_NOCHECK | GLOB_BRACE | GLOB_ONLYDIR | GLOB_ERR | GLOB_NOSORT | GLOB_NOESCAPE); $_nonEmptyMaybeCheckFlag = glob_( 'a' , $_maybeFlag); /** @var string */ $string = ''; $_stringNoFlag = glob_( $string ); $_stringNoCheckFlag1 = glob_( $string , GLOB_MARK ); $_stringNoCheckFlag2 = glob_( $string , GLOB_NOSORT | GLOB_NOESCAPE); $_stringNoCheckFlag3 = glob_( $string , GLOB_BRACE | GLOB_ONLYDIR | GLOB_ERR); $_stringCheckFlag1 = glob_( $string , GLOB_NOCHECK); $_stringCheckFlag2 = glob_( $string , GLOB_NOCHECK | GLOB_MARK); $_stringCheckFlag3 = glob_( $string , GLOB_NOCHECK | GLOB_BRACE | GLOB_ONLYDIR | GLOB_ERR | GLOB_NOSORT | GLOB_NOESCAPE); $_stringMaybeCheckFlag = glob_( $string , $_maybeFlag); /** @psalm-trace $_emptyNoFlag $_emptyNoCheckFlag1 $_emptyNoCheckFlag2 $_emptyNoCheckFlag3 $_emptyCheckFlag1 $_emptyCheckFlag2 $_emptyCheckFlag3 $_emptyMaybeCheckFlag $_nonEmptyNoFlag $_nonEmptyNoCheckFlag1 $_nonEmptyNoCheckFlag2 $_nonEmptyNoCheckFlag3 $_nonEmptyCheckFlag1 $_nonEmptyCheckFlag2 $_nonEmptyCheckFlag3 $_nonEmptyMaybeCheckFlag $_stringNoFlag $_stringNoCheckFlag1 $_stringNoCheckFlag2 $_stringNoCheckFlag3 $_stringCheckFlag1 $_stringCheckFlag2 $_stringCheckFlag3 $_stringMaybeCheckFlag */ ``` ``` Psalm output (using commit b99857c): INFO: Trace - 61:161 - $_emptyNoFlag: false|list{0?: ''} INFO: Trace - 61:161 - $_emptyNoCheckFlag1: false|list{0?: ''} INFO: Trace - 61:161 - $_emptyNoCheckFlag2: false|list{0?: ''} INFO: Trace - 61:161 - $_emptyNoCheckFlag3: false|list{0?: ''} INFO: Trace - 61:161 - $_emptyCheckFlag1: false|list{''} INFO: Trace - 61:161 - $_emptyCheckFlag2: false|list{''} INFO: Trace - 61:161 - $_emptyCheckFlag3: false|list{''} INFO: Trace - 61:161 - $_emptyMaybeCheckFlag: false|list{0?: ''} INFO: Trace - 61:161 - $_nonEmptyNoFlag: false|list INFO: Trace - 61:161 - $_nonEmptyNoCheckFlag1: false|list INFO: Trace - 61:161 - $_nonEmptyNoCheckFlag2: false|list INFO: Trace - 61:161 - $_nonEmptyNoCheckFlag3: false|list INFO: Trace - 61:161 - $_nonEmptyCheckFlag1: false|non-empty-list INFO: Trace - 61:161 - $_nonEmptyCheckFlag2: false|non-empty-list INFO: Trace - 61:161 - $_nonEmptyCheckFlag3: false|non-empty-list INFO: Trace - 61:161 - $_nonEmptyMaybeCheckFlag: false|list INFO: Trace - 61:161 - $_stringNoFlag: false|list INFO: Trace - 61:161 - $_stringNoCheckFlag1: false|list INFO: Trace - 61:161 - $_stringNoCheckFlag2: false|list INFO: Trace - 61:161 - $_stringNoCheckFlag3: false|list INFO: Trace - 61:161 - $_stringCheckFlag1: false|non-empty-list INFO: Trace - 61:161 - $_stringCheckFlag2: false|non-empty-list INFO: Trace - 61:161 - $_stringCheckFlag3: false|non-empty-list INFO: Trace - 61:161 - $_stringMaybeCheckFlag: false|list ```
ygottschalk commented 1 year ago

Actually the assumption of non-empty-list's in my stubs might be wrong considering GLOB_ERR.

kkmuffme commented 1 year ago

@ygottschalk

Is that a definition everyone agrees on?

0?:string, is a pain for me :-( I think a custom return type provider (like for dirname) would be better and then in that provider use your suggestion in the callmap/as fallback, if the first arg is of type "string" (or empty string literal). Bc then it would would work fine for all "non-empty-string" cases and all "string" cases, which would make everyone happy I guess.

If there might be the slightest chance of having a system which considers an empty pathname as valid

There can't be, bc if you check php-src, it will immediately return if the pattern is empty. So even if there were a file system that had this, PHP glob wouldn't be able to find the empty files.

orklah commented 1 year ago

Sorry, I did not read all of that.

Thanks for the summary @ygottschalk

Could someone make a PR with whatever you agree on so we have at least the good things?

Then if someone is willing, another PR to get to the last mile to get optimal results is welcome :)

ygottschalk commented 1 year ago

@kkmuffme

There can't be, bc if you check php-src, it will immediately return if the pattern is empty. So even if there were a file system that had this, PHP glob wouldn't be able to find the empty files.

What about using the pattern '*'? It is non-empty but can match an empty string... But with the information about the early return on empty patterns we can at least narrow those return types. I think that also is an argument for php-src to consider empty pathnames invalid and therefore I think we can also assume that.

But as I said, idk if there exists that kind of weird system. I am fine with not considering this valid, but I wanted to point out this possibility, so someone with more knowledge on that topic than me could step in and share his/her wisdom.

I think a custom return type provider would be better

Why? The callmap type I provided is a fallback for the stub (see psalm.dev links), which checks for pattern beeing '' or non-empty-string and checks the flags for including GLOB_NOCHECK and by that determines the return type. Is there any advantage we need in this case to have a return type provider?

Hanmac commented 1 year ago
$_emptyCheckFlag1 = glob_( ''  , GLOB_NOCHECK);
$_emptyCheckFlag2 = glob_( '' , GLOB_NOCHECK | GLOB_MARK);
$_emptyCheckFlag3 = glob_( '' , GLOB_NOCHECK | GLOB_BRACE | GLOB_ONLYDIR | GLOB_ERR | GLOB_NOSORT | GLOB_NOESCAPE);
INFO: Trace - 67:161 - $_emptyCheckFlag1: false|list{''}
INFO: Trace - 67:161 - $_emptyCheckFlag2: false|list{''}
INFO: Trace - 67:161 - $_emptyCheckFlag3: false|list{''}

are you sure it can return list('') ? i tried to test it on the sandbox, it only ever returns false

some local docker tests:

php > $_emptyCheckFlag1 = glob( ''  , GLOB_NOCHECK);
php > $_emptyCheckFlag2 = glob( '' , GLOB_NOCHECK | GLOB_MARK);
php > $_emptyCheckFlag3 = glob( '' , GLOB_NOCHECK | GLOB_BRACE | GLOB_ONLYDIR | GLOB_ERR | GLOB_NOSORT | GLOB_NOESCAPE);
php > var_dump($_emptyCheckFlag1);
php shell code:1:
array(1) {
  [0] =>
  string(0) ""
}
php > var_dump($_emptyCheckFlag2);
php shell code:1:
array(1) {
  [0] =>
  string(0) ""
}
php > var_dump($_emptyCheckFlag3);
php shell code:1:
array(0) {
}
php >

i don't know what flag3 makes it different from the other ones to make it return empty array instead?

ygottschalk commented 1 year ago

GLOB_ERR

Would guess that is the one. GLOB_ERR - Stop on read errors (like unreadable directories), by default errors are ignored. (php.net)

Edit: Or it may be GLOB_ONLYDIR, needs testing...

Hanmac commented 1 year ago

GLOB_ERR

Would guess that is the one. GLOB_ERR - Stop on read errors (like unreadable directories), by default errors are ignored. (php.net)

Edit: Or it may be GLOB_ONLYDIR, needs testing...

my current testing shows exactly that:

ygottschalk commented 1 year ago

Current version in PR does assume that:

Current version does not respect GLOB_ERR which may lead to glob('/existing_dirname/', GLOB_NOCHECK | GLOB_ERR) to return an empty list on an read error, but psalm will type the return (current version of stub) as false|non-empty-list<non-empty-string>. Needs more information

Hanmac commented 1 year ago

i don't know if both flags collide: GLOB_NOCHECK | GLOB_ERR i tried to provoke an error that would return an empty list, but GLOB_NOCHECK doesn't let me

i even tried to read from a directory that doesn't exist (inside a broken symbolic link)

ygottschalk commented 1 year ago

@orklah If you can live with the made assumptions this is ready to merge, else please provide feedback so I can build up on that