ximion / appstream

Tools and libraries to work with AppStream metadata
http://www.freedesktop.org/wiki/Distributions/AppStream/
GNU Lesser General Public License v2.1

Weblate: Set custom placeholder in the translation flags #605

Closed rffontenelle closed 6 months ago

rffontenelle commented 7 months ago

There are hundreds of occurrences of literal text in the translation strings, e.g. `content_attribute` or `<url/>`. Setting a placeholder for these occurrences will improve translation consistency: this text should not be translated, but the translator might not notice that. For example, I've spotted some pt_BR strings where it was incorrectly translated.

If a translator does translate text that was supposed to be kept as in the English original, a translation check error will be displayed. While the error can be dismissed, it draws the translator's attention to the wrongly translated text.
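To illustrate the idea, here is a rough sketch of such a check in shell. It only approximates the behaviour described above and is not Weblate's implementation; `check_placeholders` is a hypothetical helper name, the sample strings are made up, and it assumes GNU grep built with `-P` support.

```shell
# Sketch only: approximates the idea of the check, not Weblate's code.
# check_placeholders is a hypothetical helper name.
check_placeholders() {
  # $1 = source string, $2 = translation
  missing=$(printf '%s\n' "$1" \
    | grep -oP '`[\w@_\-:=/<>\.%]*`' \
    | while IFS= read -r lit; do
        case "$2" in
          (*"$lit"*) ;;                   # literal copied verbatim
          (*) printf '%s\n' "$lit" ;;     # literal altered or dropped
        esac
      done)
  if [ -n "$missing" ]; then
    printf 'missing literal: %s\n' "$missing" >&2
    return 1
  fi
}

# Made-up sample strings: the first translation keeps the literal, the second translates it.
check_placeholders 'Unknown `content_attribute` value.' 'Valor `content_attribute` desconhecido.' && echo "ok"
check_placeholders 'Unknown `content_attribute` value.' 'Valor `atributo` desconhecido.' || echo "check failed"
```

Because the backticks are part of the match, a translation that keeps the word but drops or replaces the backticks would fail this sketch as well.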

I was working on matching every occurrence of these texts, and I came up with the following regexp:

placeholders:r"`[\w@_\-:=/<>\.%]*`"

To list strings with `literal`:

# Download the English strings from Weblate
wget https://hosted.weblate.org/download/appstream/translations/en/ -O appstream.pot
# Unwrap long lines so that each msgid fits on a single line
msgcat --no-wrap appstream.pot > tmp; mv tmp appstream.pot
# Keep only the msgid contents, dropping empty ones
grep '^msgid ' appstream.pot | sed 's|^msgid "||;s|"$||;/^$/d' > strings.txt
# Print source strings (msgid) containing literal texts
grep -P '`[\w@_\-:=/<>\.%]*`' strings.txt
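As a possible follow-up to the commands above, the matched literals themselves can be extracted and counted, which helps to spot anything the expression misses or matches by accident. This is only a sketch: `count_literals` is a hypothetical helper name, the sample input is made up, and GNU grep with `-P` support is assumed.

```shell
# Hypothetical helper: read strings on stdin, print each distinct
# backtick-delimited literal with its number of occurrences, most frequent first.
count_literals() {
  grep -oP '`[\w@_\-:=/<>\.%]*`' | sort | uniq -c | sort -rn
}

# With the real data this would be: count_literals < strings.txt
# Made-up sample input for illustration:
printf '%s\n' \
  'Set `id` first.' \
  'The `id` must be unique.' \
  'Use `<url/>` here.' | count_literals
```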

How to set:

  1. Go to https://hosted.weblate.org/settings/appstream/translations/#translation
  2. In Translation Flags, set the above string.

NOTE: Weblate already applies placeholders automatically for C format strings, e.g. %s, and for tags such as <id/>, so IMHO only these literals are worth setting placeholders for.

ximion commented 6 months ago

I've applied it, and we now have a lot of failing checks because many translators didn't copy the backticks verbatim... Which I think is a fair thing to do... However I also think validating these literals is a very good idea, and I wasn't aware that this feature existed!

I'm leaving this applied for a few days, I wonder if this will cause any issues... Since the check can be dismissed, I guess it should probably be okay.

rffontenelle commented 6 months ago

I was personally able to fix some strings where the backticks had been replaced, and some where the literal had been translated when it shouldn't have been, probably because machine translation was used and not reviewed. So yes, this placeholder is as useful as expected.

ximion commented 6 months ago

Backticks are fortunately also used exclusively for literals in all texts in AppStream, which makes this change fairly easy :-)

Thank you for bringing attention to this and even crafting an expression that works well!

rffontenelle commented 6 months ago

> Thank you for bringing attention to this and even crafting an expression that works well!

You're welcome. I'd be more than happy to hear of any improvement to this expression. Sadly, a generic `.*` is greedy and could match everything from one literal through to the next one in the same string, which is the reason for this ugly regexp.
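The difference can be seen on a made-up sample string with two literals. This is a sketch assuming GNU grep with `-P` support; the third pattern is only one possible alternative to try and has not been tested against Weblate or the real strings.

```shell
# Made-up sample string containing two literals.
sample='Set `id` before `name`.'

# Greedy .* runs from the first backtick to the last one: a single match.
greedy=$(printf '%s\n' "$sample" | grep -oP '`.*`')
printf '%s\n' "$greedy"      # prints: `id` before `name`

# The character class from this issue stops at the space: two matches.
strict=$(printf '%s\n' "$sample" | grep -oP '`[\w@_\-:=/<>\.%]*`')
printf '%s\n' "$strict"      # prints `id` and `name` on separate lines

# Untested alternative (assumption): "anything but a backtick".
# Shorter, but it would also accept spaces and other text between backticks.
loose=$(printf '%s\n' "$sample" | grep -oP '`[^`]*`')
printf '%s\n' "$loose"       # prints `id` and `name` on separate lines
```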