php-gettext / Gettext

PHP library to collect and manipulate gettext (.po, .mo, .php, .json, etc)
MIT License
687 stars 134 forks source link

Problem with new lines and PO files #218

Open vaites opened 5 years ago

vaites commented 5 years ago

We found a problem with new lines and PO files possibly related with #192. Here's an example code:

use Gettext\Translations;

$translations = new Translations;
$translations->setLanguage('en');

// add a translation with \n 
$translation1 = $translations->insert(null, 'test1');
$translation1->setTranslation("Lorem\nipsum");

// add a translation with \r\n 
$translation2 = $translations->insert(null, 'test2');
$translation2->setTranslation("Lorem\r\nipsum");

$translations->toPoFile('test.po');

If you open the po file with PoEdit the second translation is not processed and shows as empty (no error thrown) and other tools like PhpStorm shows an error (Msgid keyword expected). If you convert \r to its representation like \t or \n these tools works ok.

In the other hand, if you compile the file to mo with msgfmt command there's no error and this library parses it without problems, keeping the \r\n. This means that gettext doesn't care about the format of line breaks, but it does cause problems with certain tools.

I think \r must be added to Gettext\Generators\Po::convertString() to propertly scape it but don't know enough about gettext to understand the side effects of this change.

There are more options... Should this library unify the format of line breaks to avoid problems with certain tools or at least warn if a string is added using line breaks different from the existing ones? O maybe an option to save to po files?. I'm not sure, because I understand that this library (as gettext does) doesn't care about the contents...

To conclude, gettext uses only \n to detect new lines:

A string goes multi-line if and only if it has embedded newlines, that is, if it matches ‘[^\n]\n+[^\n]’

oscarotero commented 5 years ago

Hi, thanks for the detailed description.

I'm favor to include \r in Gettext\Generators\Po::convertString() and even normalize the line breaks to \n in order to be compatible with all tools.

What do you think? Do you like to work in a PR?

vaites commented 5 years ago

Well, I think the best solution is to include \r as we said, and then add an option (not enabled by default, but documented) to normalize de line breaks (I can work on it). You think that's reasonable?

oscarotero commented 5 years ago

Sounds good to me. Thanks!