WeblateOrg / weblate

Web based localization tool with tight version control integration.
https://weblate.org/
GNU General Public License v3.0
4.63k stars 1.02k forks source link

XLIFF wrapped ICU MessageFormat checking not supported #9206

Open Chocobozzz opened 1 year ago

Chocobozzz commented 1 year ago

Describe the issue

The check throws an error because of an id attribute in tag

I already tried

Steps to reproduce the behavior

  1. Enable icu-message-format, icu-flags:xml flags
  2. Create We couldn't find any resource tied to the URL <x id="INTERPOLATION" equiv-text="{{ pathname }}"/> you were looking for. source string
  3. Translate it using Nepodařilo se nám najít žádný zdroj vázaný na hledanou URL <x id="INTERPOLATION" equiv-text="{{ pathname }}"/>.

The check throws an error where it shouldn't as the translations are valid.

Expected behavior

No check error

Screenshots

screen_2023-05-05-15:31:58

Exception traceback

No response

How do you run Weblate?

Other

Weblate versions

4.17

Weblate deploy checks

No response

Additional context

No response

nijel commented 1 year ago

Is ICU MessageFormat really what you use? As mentioned in https://github.com/Chocobozzz/PeerTube/issues/5781, the same string does not work in other tool. I'd suggest removing icu-flags:xml, icu-message-format from flags.

Chocobozzz commented 1 year ago

Is ICU MessageFormat really what you use?

Yes we do using Angular i18n extractor. The string from the linked issue might be an error we should fix, it's the reason why I chose another string that should be valid. If the string is not in ICU MessageFormat, the checker should not throw an error.

I think the root cause is the checker doesn't support placeables with attributes in the string, and so many other ICU tools.

You can find a valid ICU message on https://weblate.framasoft.org/translate/peertube/angular/fr_FR/?checksum=1434c4e7ec5ffccd If <x .../> is removed the check passes. So does if we just remove the id attribute.

May be related to the way the dependency parses the placeholder, where tag attributes are not supported? https://github.com/SirStendec/pyicumessageformat/blob/770de136b9b526c0baaf889d023c115999962430/pyicumessageformat/parser.py#L281

nijel commented 1 year ago

If the string is not in ICU MessageFormat, the checker should not throw an error.

The checker is there to throw an error if the string is not valid. If you don't want it, don't enable it.

I think the root cause is the checker doesn't support placeables with attributes in the string, and so many other ICU tools.

Yes, attributes do not seem to be supported on XML tags by https://pypi.org/project/pyicumessageformat/ (what Weblate uses for handling this). I have no clue whether it is valid to use them or not.

Why are you actually using XML placeholders in addition to the ones in ICU MessageFormat?

Chocobozzz commented 1 year ago

The checker is there to throw an error if the string is not valid. If you don't want it, don't enable it.

So if I understand well, {count, plural, =1 {Blocked <x id="videoName"/>.} other {Blocked <x id="count"/> videos.}} is not a valid ICU string?

Why are you actually using XML placeholders in addition to the ones in ICU MessageFormat?

We extract strings using Angular i18n tools so I'm not 100% sure I understand your question correctly, but yes I think. You can find an example on such extracted strings in the official guide https://angular.io/guide/i18n-example in the src/locale/messages.fr.xlf

...
<trans-unit id="5a134dee893586d02bffc9611056b9cadf9abfad" datatype="html">
  <source>{VAR_PLURAL, plural, =0 {just now} =1 {one minute ago} other {<x id="INTERPOLATION" equiv-text="{{minutes}}"/> minutes ago} }</source>
  <target>{VAR_PLURAL, plural, =0 {à l'instant} =1 {il y a une minute} other {il y a <x id="INTERPOLATION" equiv-text="{{minutes}}"/> minutes} }</target>
</trans-unit>
...

See also https://angular.io/guide/i18n-common-translation-files#minute-plural-example

github-actions[bot] commented 1 year ago

This issue looks more like a support question than an issue. We strive to answer these reasonably fast, but purchasing the support subscription is not only more responsible and faster for your business but also makes Weblate stronger.

In case your question is already answered, making a donation is the right way to say thank you!

nijel commented 1 year ago

So if I understand well, {count, plural, =1 {Blocked <x id="videoName"/>.} other {Blocked <x id="count"/> videos.}} is not a valid ICU string?

I have no clue, but pyicumessageformat tells it is not – the problem it reports are attributes on the XML tag.

nijel commented 1 year ago

Thinking more about that, Angular i18n extractor wraps ICU placeholder in XLIFF placeholder, so that editing the file as XLIFF works better with native tool.

That causes problem for validating ICU MessageFormat in Weblate because the string is valid ICU MessageFormat only after stripping XLIFF markup (and replacing it with equiv-text attribute).

github-actions[bot] commented 1 year ago

This issue has been added to the backlog. It is not scheduled on the Weblate roadmap, but it eventually might be implemented.

In case you need this feature soon, please consider helping or push it by funding the development.