squizlabs / PHP_CodeSniffer

PHP_CodeSniffer tokenizes PHP files and detects violations of a defined set of coding standards.
BSD 3-Clause "New" or "Revised" License
10.66k stars 1.48k forks source link

PHP 8.1 | Tokenizer/PHP: hotfix for overeager explicit octal notation backfill #3579

Closed jrfnl closed 2 years ago

jrfnl commented 2 years ago

Follow up on #3481 and #3552. /cc @MarkBaker

While working on PHPCompatibility/PHPCSUtils, I found another instance where the explicit octal notation backfill is overeager.

PHP natively will tokenize invalid octals, like 0o91 as T_LNUMBER + T_STRING in all PHP versions, but with the backfill in place, this would no longer be the case and on PHP < 8.1, this would now be tokenized as T_LNUMBER, making tokenization across PHP versions unpredictable and inconsistent.

Fixed now. Including tests.

@gsherwood Similar to the previous fix, could this fix please still be merged into 3.7.0 to prevent this incorrect tokenization ending up in a released version ?

jrfnl commented 2 years ago

Darn.. more complicated than I hoped and difficult to test locally what with PHPUnit 7.x...

jrfnl commented 2 years ago

I think I've got it now. 🤞🏻

jrfnl commented 2 years ago

Okay, now I got it ;-)

I've now also added a second commit to improve the tests.

BackfillExplicitOctalNotationTest: refactor and improve the test

As the tokenizer may now split a T_STRING token which follows a T_LNUMBER into two tokens, it is important to also test that the next token is set correctly.

To that end, I've refactored the test to include testing this.

Includes:

jrfnl commented 2 years ago

And came up with another test case which needed accounting for. Fixed up in the existing commits.

gsherwood commented 2 years ago

Thanks so much for finding and fixing this. Especially the added changes to the tests to check for that next token.

jrfnl commented 2 years ago

Thanks. Glad I caught it before the release.