Open mattacosta opened 7 years ago
Aha! Good catch. Now I understand why the built-in tokenizer produces ArrayCast/BoolCast/etc. Tokens.
We should be able to just remove the call to parseCastExpressionGranular (which I had initially kept around to remain backwards compatible with the old lexer, and potentially to provide some nicer error handling in the future, but now we can just remove it.) https://github.com/Microsoft/tolerant-php-parser/blob/master/src/Parser.php#L1459
Also, this would make a good first PR
Another option is to keep all these tokens, and parse like you're doing, but give a message stating that there can't be anything between the (
type
and )
.
Or, your parser can check the tokens as it is parsing, and only accept this production if there is no trivia between them.
--
Note C# has this for when it sees >>
. Those are two separate tokens (so they can be used in List<List<int>>
). But in an expression context it gets accepted as 'right shift' when there is no trivia between the tokens. i.e. we would not accept a >/**/> b
.
<?php
define('int', 'i am a string');
echo (/**/int); // prints "i am a string"
There can't be a message because the snippet above is valid code. So if the code were modified to add:
$a = ' really';
echo (/**/int)$a;
...then the error should be an unexpected variable, not an invalid cast error.
Since the parser uses PHP's own tokenizer, the second case would already return a cast token as well, so it's forced into the correct production.
Cast expressions may not contain anything except spaces and tabs between the parentheses.
In the following example, the parser generates a CastExpression while PHP sees it as a constant named 'int' followed by a variable.
Example:
<?php (/* hello */int)$a;
Actual result: No error.
Expected result: syntax error, unexpected $a (T_VARIABLE)