mathjazz / pontoon

In-place localization tool
https://pontoon.mozilla.org/
BSD 3-Clause "New" or "Revised" License

[pontoon] Unable to prefix translations with whitespace on .properties #1068

Open mathjazz opened 7 years ago

mathjazz commented 7 years ago

This issue was created automatically by a script.

Bug 1348336

Bug Reporter: @tomer CC: @ItielMaN, @mathjazz

Created attachment 8848568 Screenshot

Due to the .properties file format, it is impossible to submit strings that contain space(s) before or after the string, as these are omitted while submitting the translation.

Use case:

  1. The strings 'h'/'m'/'s' are used to mark the remaining time for an operation. In some languages, it is more suitable to translate them as 'hours'/'minutes'/'seconds' instead of a one-letter abbreviation.
  2. The original string appears in the format '%d%s', for example '1h'/'2m'/'3s'. Because we want to use the full word instead of a one-character abbreviation, we should translate these strings as ' hour'/' minute'/' second' (please note the leading space).
  3. The .properties file format ignores spaces before and after the string and omits them, so the translations ' hour' and 'hour' end up identical, which makes it impossible to translate these strings well.

Actual result: Currently it is possible to enter a translation with a space before the actual text. Pontoon may show a warning, which translators can accept if that is indeed what they intend to do, and that's all.

Behind the scenes, Pontoon will submit the string and then import the file back into Pontoon. It will find that the imported string is not the one it submitted, so the string will reappear in Pontoon as 'hour', marked as imported.

Expected result: Such strings should not appear as imported in Pontoon. I think the best option would be to replace space with U+00A0 NBSP if the target file is .properties and the space is before or after the actual translation.

mathjazz commented 7 years ago

Comment Author: @tomer

(In reply to Tomer Cohen :tomer from comment #0)

> Such strings should not appear as imported in Pontoon. I think the best option would be to replace space with U+00A0 NBSP if the target file is .properties and the space is before or after the actual translation.

An even better approach: On export, replace spaces before and after the content with \u0020 (the escape sequence for a regular space), which behaves the same as a regular space but is valid at the edges of a value in .properties files. On import, replace \u0020 with the regular space character, which won't cause an imported translation record to be created in Pontoon.

mathjazz commented 7 years ago

Comment Author: @tomer

The following JavaScript snippet should demonstrate my approach:

" Hello World ".replace(/^(\s+)/g, '\u0020').replace(/\s+$/g, '\u0020'); result: \u0020Hello World\u0020

mathjazz commented 7 years ago

Comment Author: @mathjazz

Thanks for reporting!

The trailing spaces seem to be working fine. If you add them, they will be saved in Pontoon, written to the file and no new translation (without the trailing spaces) will be imported.

I was able to reproduce the issue with leading spaces exactly as you described. If you want to fix the string you used as an example, you can go ahead and simply prepend \u0020.
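
The difference between leading and trailing spaces is consistent with Java-style .properties semantics, where whitespace after the separator is skipped but the rest of the line, including trailing spaces, belongs to the value. The following is a simplified model of that rule, not the parser Pontoon uses; `parsePropertyLine` is a hypothetical name, and the sketch assumes a single-line entry with an "=" separator.

```javascript
// Simplified model of reading one "key=value" line, assuming Java-style
// .properties semantics. Not Pontoon's parser: comments, ":" separators
// and line continuations are ignored, and the line must contain "=".
function parsePropertyLine(line) {
  const separator = line.indexOf('=');
  const key = line.slice(0, separator).trim();
  const value = line
    .slice(separator + 1)
    .replace(/^[ \t\f]+/, '') // whitespace after "=" is skipped
    .replace(/\\u([0-9a-fA-F]{4})/g, (match, hex) =>
      String.fromCharCode(parseInt(hex, 16))); // decode \uXXXX escapes
  return { key, value };
}

console.log(JSON.stringify(parsePropertyLine('unit= hour').value));       // "hour"
console.log(JSON.stringify(parsePropertyLine('unit=hour ').value));       // "hour "
console.log(JSON.stringify(parsePropertyLine('unit=\\u0020hour').value)); // " hour"
```

Under this model a literal leading space is lost on read, a trailing space survives, and a leading \u0020 escape survives because it is decoded only after the leading whitespace has been skipped.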

As for the actual bugfix (adding the ability to use leading whitespace and preventing superfluous string imports), I suggest we simply convert each leading whitespace character to \u0020 after the translation is saved. There are several \u escape sequences used in our .properties files (both en-US and translations) and we display them unchanged in the Pontoon UI. Let's stick to that habit.

This should be a pretty simple fix, as opposed to converting trailing whitespace to an escape sequence and back upon file import/export, which would involve making changes to the unmaintained library we could get rid of in the L20n world.

Would that be sufficient?

mathjazz commented 7 years ago

Comment Author: @ItielMaN

Sounds good to me, as long as this process would be transparent to the translator.

mathjazz commented 7 years ago

Comment Author: @tomer

(In reply to Matjaz Horvat [:mathjazz] from comment #3)

> If you want to fix the string you used as an example, you can go ahead and simply prepend \u0020.

Pontoon jumps in and replaces \u0020 with an actual space, so it doesn't seem to work. It might be possible to work around this with double escaping, but that is a hack that doesn't sound good for regular use.

Workaround: Adding ZWSP (U+200B) at the beginning of a string should be enough to work around this issue. /^\u0020/\u200B\u0020/ See https://hg.mozilla.org/l10n-central/he/rev/42e6782201a7
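
As a minimal sketch of that substitution, assuming it is applied to the translation text before it is written to the file: the helper name `guardLeadingSpace` is hypothetical and not part of Pontoon.

```javascript
// Hypothetical helper illustrating the workaround; not part of Pontoon.
const ZWSP = '\u200B'; // ZERO WIDTH SPACE

function guardLeadingSpace(value) {
  // Put ZWSP in front of a leading space so that the space is no longer
  // the first character of the value.
  return value.startsWith(' ') ? ZWSP + value : value;
}

const guarded = guardLeadingSpace(' hour');
console.log(guarded.length);                     // 6
console.log(guarded.charCodeAt(0).toString(16)); // 200b
```

The trade-off is that the stored string now carries an extra invisible character, so it no longer compares equal to ' hour'.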