squizlabs / PHP_CodeSniffer

PHP_CodeSniffer tokenizes PHP files and detects violations of a defined set of coding standards.
BSD 3-Clause "New" or "Revised" License

Generic.Files.LineLength miscalculates the length of line containing multibyte characters #3923

Open julien-picalausa opened 1 year ago

julien-picalausa commented 1 year ago

Describe the bug

Generic.Files.LineLength reports a line as over the limit when the line is at or just under the limit and contains one or more multibyte characters; the reported length appears to be the byte count rather than the character count. This is easy to reproduce with UTF-8 strings in the code base.

Using

echo "Jeg liker blåbærsyltetøy";

Custom ruleset

<?xml version="1.0"?>
<ruleset name="My Custom Standard">
  <description>If you are using a custom ruleset, please enter it here.</description>
  <rule ref="Generic.Files.LineLength">
      <properties>
          <property name="lineLimit" value="33" />
          <property name="absoluteLineLimit" value="33" />
      </properties>
  </rule>
</ruleset>

To reproduce

Steps to reproduce the behavior:

  1. Create a file called test.php with the code sample above...
  2. Run phpcs test.php ...
  3. See error message displayed
    1  | ERROR | [ ] Line exceeds maximum limit of 33 characters; contains 35 characters

Expected behavior

No error. The line has only 32 characters, even though it is 35 bytes long.
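
To make the 32 versus 35 discrepancy concrete, here is a minimal sketch that counts the sample line both ways. It is not the sniff's actual code. `utf8CharCount()` is a hypothetical helper written for this illustration, and the script assumes the file is saved as UTF-8.

```php
<?php
// Hypothetical helper for illustration only: counts UTF-8 characters by
// skipping continuation bytes (bit pattern 10xxxxxx).
function utf8CharCount(string $s): int
{
    $count = 0;
    $bytes = strlen($s);
    for ($i = 0; $i < $bytes; $i++) {
        if ((ord($s[$i]) & 0xC0) !== 0x80) {
            $count++;
        }
    }
    return $count;
}

// The exact source line from the code sample (file assumed to be UTF-8).
$line = 'echo "Jeg liker blåbærsyltetøy";';

echo strlen($line), "\n";        // prints 35: å, æ and ø take two bytes each
echo utf8CharCount($line), "\n"; // prints 32: the length a reader would count
```

If the mbstring extension is installed, `mb_strlen($line, 'UTF-8')` should give the same character count as the helper.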

Versions (please complete the following information)

Operating System: FreeBSD 13.1
PHP version: 8.2.12
PHP_CodeSniffer version: 3.7.2 (stable)
Standard: custom
Install type: PHAR

Additional context

None I can think of.


oraxisart1 commented 1 year ago

Facing the same issue on Manjaro Linux.