moodlehq / moodle-cs

Moodle Coding Style
https://github.com/moodlehq/moodle-cs
GNU General Public License v3.0
18 stars 16 forks source link

BoilerplateCommentSniff causes invalid code if first line has comment #174

Closed micaherne closed 4 months ago

micaherne commented 4 months ago

The fix in BoilerplateCommentSniff can cause invalid PHP if there is a comment in the first line.

This code:

<?php // Some comment on first line.

class test {}

becomes this:

<?php// This file is part of Moodle - https://moodle.org/
//
// Moodle is free software: you can redistribute it and/or modify
// it under the terms of the GNU General Public License as published by
// the Free Software Foundation, either version 3 of the License, or
// (at your option) any later version.
//
// Moodle is distributed in the hope that it will be useful,
// but WITHOUT ANY WARRANTY; without even the implied warranty of
// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
// GNU General Public License for more details.
//
// You should have received a copy of the GNU General Public License
// along with Moodle.  If not, see <https://www.gnu.org/licenses/>.
// Some comment on first line.

class test {
}

Because there is no whitespace after the php tag it is not recognised. Also clearly the boilerplate should be on the line after the opening tag.

micaherne commented 4 months ago

I'm not able to assign myself to this issue but I'll try and make a patch for it since it's quite likely to be my code that is causing it.

stronk7 commented 4 months ago

Thanks for reporting this @micaherne , I've assigned the issue o you, let us know if you need anything else.

Ciao :-)

micaherne commented 4 months ago

What seems to be happening here is that the insertBoilerplate() method is actually inserting it in a way that fails the sniff so the moveBoilerplate() method is being called on the second round of fixing, and it's moving it to the wrong place.

As far as I can see this is to do with how the PHP tokeniser deals with the opening PHP tag. I wasn't aware of this but it seems that it grabs the first whitespace character after the <?php bit.

$codes = ['<?php', '<?php ', "<?php\n", '<?php   '];

foreach ($codes as $code) {
    echo '|' . implode('|', array_map(fn($t) => $t[1], token_get_all($code))) . "|\n";
}

gives

|<?php|
|<?php |
|<?php
|
|<?php |  |

The current code is assuming that the PHP tag content is either '<?php' or '<?php\n' so we need to check the content for whitespace and replace it if necessary.