yiisoft / yii2


Extracted translation strings must not be modified in any way #15566

Open PowerGamer1 opened 6 years ago

PowerGamer1 commented 6 years ago

Currently, MessageController.php unescapes C-style escape sequences in strings extracted from PHP source files. In some scenarios this causes Yii2 to miss translations that actually exist. For example:

Yii::t('x', "11\n11"); // string 1
Yii::t('x', "22\r\n22"); // string 2

The strings above are unescaped and produce the following messages file:

return [
    '11
11' => '',
    '22
22' => '',
];

The first string is unescaped with an LF line ending and the second with a CRLF line ending. When a translator edits this file to add translations, most text editors convert mixed line endings to a single form (either LF or CRLF). By default, line endings are also normalized when such a file is committed to and later checked out from git. Either way, one of the two strings ends up without a translation at runtime, because its key in the translated file contains a newline sequence different from the one passed to Yii::t().
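The mismatch can be reproduced in isolation. The following is a minimal sketch, assuming translations are looked up by exact array key; the variable names are illustrative and not taken from Yii2:

```php
<?php
// What the application passes to Yii::t() at runtime (CRLF inside the string).
$runtimeMessage = "22\r\n22";

// The same key after an editor or git rewrote the messages file with LF only.
$normalizedKey = str_replace("\r\n", "\n", $runtimeMessage);

// Simulated contents of the translated messages file.
$messages = [$normalizedKey => 'translated text'];

// Assumption: the lookup is an exact, byte-for-byte key comparison.
$found = array_key_exists($runtimeMessage, $messages);
var_dump($found);
```

The two keys differ by a single CR byte, which is enough for the lookup to fail.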

So MessageController.php must not modify the extracted translation strings in any way. In other words, it must produce the following messages file:

return [
    "11\n11" => "",
    "22\r\n22" => "",
];

To achieve that, the call to stripcslashes() must be removed from MessageController::extractMessagesFromTokens(), and each string must be enclosed in the same quote character (single or double) as in the original PHP source file from which it was extracted.
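As a rough illustration of why this is feasible: PHP's tokenizer already returns string literals verbatim, including their original quotes and escape sequences. The sketch below uses only token_get_all() and is not the actual code of extractMessagesFromTokens(); the sample source and the $literals variable are made up for the example:

```php
<?php
// Sample source code to scan (nowdoc, so backslashes are kept literally).
$source = <<<'CODE'
<?php
Yii::t('x', "11\n11");
Yii::t('x', 'a\tb');
CODE;

$literals = [];
foreach (token_get_all($source) as $token) {
    // String literals without interpolation arrive as T_CONSTANT_ENCAPSED_STRING,
    // with the token text still containing the quotes and raw escape sequences.
    if (is_array($token) && $token[0] === T_CONSTANT_ENCAPSED_STRING) {
        $literals[] = $token[1];
    }
}

// Each literal can be written to the messages file unchanged.
foreach ($literals as $literal) {
    echo $literal, PHP_EOL;
}
```

Because the token text is untouched source, writing it out as-is keeps both the quote style and the escape sequences, which is what the proposal above asks for.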

PowerGamer1 commented 6 years ago

While analyzing the code of MessageController::extractMessagesFromTokens() I found another, much simpler case in which an invalid translation string is produced:

Yii::t('x', 'a\tb'); // Notice the single quotes - PHP does not expand \t in such string!

This generates a messages file in which the \t character sequence is replaced with a TAB character:

return [
    'a    b' => '',
];

Of course, during runtime the translation for the original string 'a\tb' will not be found.
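The effect of stripcslashes() on a single-quoted string can be checked directly. This is a small sketch with illustrative variable names, assuming an exact-key lookup:

```php
<?php
// Single quotes: PHP keeps backslash and "t" as two separate characters.
$original = 'a\tb';

// What the extractor stores after unescaping: a, TAB, b.
$extracted = stripcslashes($original);

// Simulated messages file keyed by the unescaped string.
$messages = [$extracted => 'translated text'];

var_dump(strlen($original));           // 4 bytes
var_dump(strlen($extracted));          // 3 bytes
var_dump(isset($messages[$original])); // lookup with the runtime string fails
```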

PowerGamer1 commented 6 years ago

Also, although it does not directly result in non-working code, the following problem should be addressed as well.

Currently, the messages file generated by Yii2 always has LF line endings outside the extracted strings, regardless of environment. (The line endings inside extracted strings are the same as in the actual extracted string; otherwise the translation would not be found, so this part is correct.) This leads to the following problem.

Suppose a file test.php has the content:

$a = 123;
Yii::t('x', 'First line.
Second line.');

Depending on the git configuration and development environment, ALL line endings in test.php (including the one after $a = 123;) can be either LF or CRLF. The default/recommended git configuration is to convert line endings to OS-specific ones on checkout. That is, after checkout dev1 on Linux will have LF line endings in test.php, and dev2 on Windows will have CRLF.

When dev1 runs message extraction, Yii2 generates x.php with ALL line endings as LF. But when dev2 runs message extraction, Yii2 generates x.php with CRLF line endings inside the string and LF outside it. While this produces a working messages file (translations for dev2 will work), when dev2 later commits x.php git complains that the file has mixed line endings.

To work around this problem, the messages file should be generated with the same line endings outside the strings as those used in user code in the current environment. Which line endings to use can be determined easily and automatically by checking whether, for example, the message config file contains CRLF, like so:

$useCrLf = strpos(file_get_contents($configFile), "\r\n") !== false;

P.S. Note that such a check cannot be based on PHP_EOL or PHP_OS, since git can be configured to check out with LF on Windows or CRLF on Linux.
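Putting the detection and the file generation together might look roughly like this. Both detectEol() and renderMessages() are hypothetical helpers written for this example, not existing Yii2 methods, and the output layout is only an approximation of a messages file:

```php
<?php
/**
 * Hypothetical helper: pick the line ending used by an existing project file.
 */
function detectEol(string $content): string
{
    return strpos($content, "\r\n") !== false ? "\r\n" : "\n";
}

/**
 * Hypothetical helper: build a messages file whose line endings OUTSIDE the
 * strings use $eol. The strings themselves are exported byte-for-byte.
 */
function renderMessages(array $messages, string $eol): string
{
    $lines = ['<?php', 'return ['];
    foreach ($messages as $source => $translation) {
        // var_export() keeps raw newlines inside the quoted string unchanged.
        $lines[] = '    ' . var_export($source, true) . ' => ' . var_export($translation, true) . ',';
    }
    $lines[] = '];';

    return implode($eol, $lines) . $eol;
}

// Simulated config file content as checked out on a CRLF environment.
$configContent = "<?php\r\nreturn [];\r\n";
$eol = detectEol($configContent);

// The extracted string carries the same CRLF as the source file it came from.
$output = renderMessages(["First line.\r\nSecond line." => ''], $eol);
```

With this approach the line endings inside and outside the strings agree whenever the config file and the scanned source files were checked out with the same git settings, so the generated file no longer has mixed line endings.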