wp-cli / i18n-command

Provides internationalization tools for WordPress projects.
MIT License
96 stars 52 forks source link

Issues parsing PO files without empty lines #393

Closed liedekef closed 2 months ago

liedekef commented 4 months ago

Bug Report

Describe the current, buggy behavior

When using the normal wp-includes/pomo to create a pot-file, and then msgmerge and msgfmt, all pot/po and mo files are generated/merged as expected. However, when using "wp-cli" there are a number of issues:

The source language files in question are from my plugin and can be found here: https://github.com/liedekef/events-made-easy/tree/main/langs So it seems that some header is missing/required by the i18n-commands that isn't marked. I get no errors when creating the po/mo/php files (just the usual "Success: ..." output). I tried copying the headers from the nl_BE po-file to the fr-counterpart but to no avail.

Describe what you would expect as the correct outcome

Correctly generated po/mo/php files, or at least some error indicating the issue with the po-files.

Let us know what environment you are running this on

the phar version of "wp-cli" doesn't seem to support "info" as a subcommand, but I'm using the phar file found here:
https://raw.githubusercontent.com/wp-cli/builds/gh-pages/phar/wp-cli.phar
swissspidy commented 4 months ago

HI there, thanks for reaching out!

From what I can see after a quick test, the issue is that your PO files do not contain any empty lines between the strings, just empty comments.

Example:

#
# File: eme-actions.php, line: 343
# File: eme-events.php, line: 9395
msgid "An error has occurred"
msgstr "Es ist ein Fehler aufgetreten"
#
# File: eme-actions.php, line: 344
msgid "Clear"
msgstr "Leeren"
#
# File: eme-actions.php, line: 345
# File: eme-events.php, line: 9396
msgid "Mailing preferences"
msgstr "Mailing Präferenzen"
#
# File: eme-actions.php, line: 346
# File: eme-events.php, line: 9397
msgid "Yes, I'm sure"
msgstr "Ja, ich bin mir sicher"

Where it should be:

#
# File: eme-actions.php, line: 343
# File: eme-events.php, line: 9395
msgid "An error has occurred"
msgstr "Es ist ein Fehler aufgetreten"

# File: eme-actions.php, line: 344
msgid "Clear"
msgstr "Leeren"

# File: eme-actions.php, line: 345
# File: eme-events.php, line: 9396
msgid "Mailing preferences"
msgstr "Mailing Präferenzen"

# File: eme-actions.php, line: 346
# File: eme-events.php, line: 9397
msgid "Yes, I'm sure"
msgstr "Ja, ich bin mir sicher"

That seems to confuse our PO parser, which then ends up not extracting any strings—or well, just one, with all other strings being added as comments.

So how did you generate those PO files? 🤔

The headers say Poedit, but Poedit properly adds spaces. So did you modify them manually or something?

A quick way to fix this would be:

  1. Open PO file in Poedit
  2. Save file without any changes — this will re-save the PO file with proper spaces
  3. Close Poedit
  4. Then run wp i18n make-php

the phar version of "wp-cli" doesn't seem to support "info" as a subcommand, but I'm using the phar file found here:

It's wp cli info, not wp info.

liedekef commented 4 months ago

In fact most po-files come from the wordpress translation site (before I moved to github), the french version is maintained by another person (I get github pull requests) but I presume it is poedit too. It's not like I maintain all files :-) I tried to detect where the empty-line removal comes from and it seems it is msgmerge doing this. To replicate:

So I'm assuming msgmerge knows what it does and the empty lines are not a requirement :-)

Concerning "wp cli info" ==> I got confused by the "wp cli" part, but it seems "cli" is also a subcommand :-) Obfuscated output:

OS: Linux 6.6.8-200.fc39.x86_64 #1 SMP PREEMPT_DYNAMIC Thu Dec 21 04:01:49 UTC 2023 x86_64
Shell:  /bin/bash
PHP binary: /usr/bin/php
PHP version:    8.2.14
php.ini used:   /etc/php.ini
MySQL binary:   /usr/bin/mysql
MySQL version:  mysql  Ver 15.1 Distrib 10.5.23-MariaDB, for Linux (x86_64) using  EditLine wrapper
SQL modes:  
WP-CLI root dir:    phar://wp-cli.phar/vendor/wp-cli/wp-cli
WP-CLI vendor dir:  phar://wp-cli.phar/vendor
WP_CLI phar path:   /xxxx/git/events-made-easy
WP-CLI packages dir:    
WP-CLI cache dir:   /xxxxx/.wp-cli/cache
WP-CLI global config:   
WP-CLI project config:  
WP-CLI version: 2.10.0
swissspidy commented 4 months ago

Hmm I see, interesting!

Well, I'll need to check whether the whitespace thing can be easily resolved in the third-party gettext library we're using, or if we can work around it somehow.

In the meantime, I suggest using the wp i18n update-po command instead of msgmerge. It does the same thing and should work fine without removing empty lines.

ernilambar commented 2 months ago

I dug little deep into the plugin provided by the OP then I found that the custom script to generate POT file https://github.com/liedekef/events-made-easy/blob/main/langs/gettextize.sh.orig is actually eating the blank line between the translation entry.

General standard for translation entry is following which expects one blank line between translation entry and this is been followed by all PO/POT generators.

white-space
#  translator-comments
#. extracted-comments
#: reference…
#, flag…
#| msgid previous-untranslated-string
msgid untranslated-string
msgstr translated-string

So I doubt upstream gettext library will update to parse the POT file which does not adhere the standard.

liedekef commented 2 months ago

Sorry, but this is not correct. I explained the steps to reproduce above (without my shell script), but will repeat them:

-     poedit events-made-easy-nl_BE.po and save ==> the empty lines are there
-     mv events-made-easy-nl_BE.po tmp.po
-     msgmerge --strict -o events-made-easy-nl_BE.po tmp.po events-made-easy.pot ==> in the newly merged events-made-easy-nl_BE.po the empty lines are gone

No use of any script, just msgmerge (binary provided by the gettext rpm on fedora). I can reproduce this every time. Also, from the doc at gnu: https://ftp.gnu.org/old-gnu/Manuals/gettext-0.11.2/html_node/gettext_9.html : "Entries begin with some optional white space" ==> so the white space is optional (and apparently msgmerge doesn't use it)

swissspidy commented 2 months ago

As per my previous comment, I wanted to look into it a bit more. This was a nice little push now :)

I just submitted a PR upstream to the Gettext library (https://github.com/php-gettext/Gettext/pull/296) now to address this. There is no guarantee that it is accepted though, as we're using an older version of the library because of PHP requirements.

wp i18n update-po is still a good workaround.

liedekef commented 2 months ago

Thanks, it sems you had more work with the test-files than the actual code change there :-) Also, I have another issue (php-files generated seem to include all strings, even those not translated, resulting in empty things on screen), but I'll report that in another ticket after some more testing.

swissspidy commented 2 months ago

Also, I have another issue (php-files generated seem to include all strings, even those not translated, resulting in empty things on screen), but I'll report that in another ticket after some more testing.

I thought we fixed this a while ago. Are you sure you're using the latest version of the command? Update to WP-CLI nightly just in case.

ernilambar commented 2 months ago
msgmerge -U events-made-easy-nl_BE.po new.pot

-U also seems to work. Using this did not ate blank line and entries were updated as expected.

liedekef commented 2 months ago

Also, I have another issue (php-files generated seem to include all strings, even those not translated, resulting in empty things on screen), but I'll report that in another ticket after some more testing.

I thought we fixed this a while ago. Are you sure you're using the latest version of the command? Update to WP-CLI nightly just in case.

I updated to nightly: that fixes this mentioned php-issue. So the released version still has the bug in it.

swissspidy commented 2 months ago

So the released version still has the bug in it.

Well, yes and no. A new version of this command was already tagged, so you can install it with wp package install git@github.com:wp-cli/i18n-command.git over what's bundled in WP-CLI. It's just that WP-CLI itself isn't updated that often. So it's either updating the single command or using the nightly.

-U also seems to work

That's strange. This option is "update def.po, do nothing if def.po already up to date". So maybe nothing was updated in your case?

ernilambar commented 2 months ago

That's strange. This option is "update def.po, do nothing if def.po already up to date". So maybe nothing was updated in your case?

PO file is being updated with new string from the main POT file. Example repo: https://github.com/ernilambar/langtest After command is run - https://github.com/ernilambar/langtest/pull/2

ernilambar commented 2 months ago

@liedekef I have created a POC pull request to use WPCLI i18n commands for updating PO files and regenerating mo files. You can take this as reference and modify according to your requirement. https://github.com/liedekef/events-made-easy/pull/553

liedekef commented 2 months ago

Well, I already have a script in the language subdir that does the same (without composer of course):

../wp-cli i18n make-pot . langs/events-made-easy.pot --skip-audit
../wp-cli i18n update-po langs/events-made-easy.pot
../wp-cli i18n make-mo langs/
#../wp-cli i18n make-php langs/

And since I currently need the nightly wp-cli build to fix a php-bug, I'll stay with my script for now :-)

swissspidy commented 2 months ago

Alright, so new versions of the Gettext library were released.

https://github.com/wp-cli/wp-cli-bundle/pull/639 updates the library for the bundle, which means after merge the next WP-CLI nightly build should have the new version.

Closing this as nothing needs to be done in this repo.