The fake bidi method can produce output that even more closely resembles real
RTL text by adding an RLM (U+200F RIGHT-TO-LEFT MARK) before each RLO (U+202E
RIGHT-TO-LEFT OVERRIDE) and after each PDF (U+202C POP DIRECTIONAL FORMATTING).
For example,
where currently for "hello world" it produces "\u202Ehello\u202C
\u202Eworld\u202C", it would now produce "\u200F\u202Ehello\u202C\u200F
\u200F\u202Eworld\u202C\u200F".
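To make the proposed change concrete, here is a minimal Python sketch of the transformation. It is not the actual fake bidi implementation: the function name `fake_bidi`, the `add_rlm` flag, and the choice to wrap each whitespace-delimited word are assumptions made for illustration, chosen only to reproduce the "hello world" example above.

```python
import re

RLM = "\u200F"  # RIGHT-TO-LEFT MARK
RLO = "\u202E"  # RIGHT-TO-LEFT OVERRIDE
PDF = "\u202C"  # POP DIRECTIONAL FORMATTING


def fake_bidi(text, add_rlm=True):
    """Wrap each whitespace-delimited word in RLO...PDF.

    Hypothetical helper: with add_rlm=True, an RLM is also placed
    before each RLO and after each PDF, as proposed in this issue.
    """
    prefix = (RLM if add_rlm else "") + RLO
    suffix = PDF + (RLM if add_rlm else "")
    # \S+ matches each run of non-whitespace; whitespace is left untouched.
    return re.sub(r"\S+", lambda m: prefix + m.group(0) + suffix, text)


# Current behavior (no RLMs) versus the proposed behavior.
current = fake_bidi("hello world", add_rlm=False)
proposed = fake_bidi("hello world", add_rlm=True)
```

Here `current` is "\u202Ehello\u202C \u202Eworld\u202C" and `proposed` is "\u200F\u202Ehello\u202C\u200F \u200F\u202Eworld\u202C\u200F".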
While most of the time the visual output would be identical, adding the RLMs
has two advantages:
1. The first-strong directionality estimation method, as specified in the
Unicode Bidirectional Algorithm's rules P2 and P3
(http://www.unicode.org/reports/tr9/#P2), would then decide that fake bidi text
is RTL; currently it decides that it is LTR. As a result, fake bidi text
currently does not behave in the same way as real RTL text (e.g. Hebrew or
Arabic) in contexts like Android TextViews and HTML's dir="auto" attribute,
which use the first-strong algorithm. Adding the RLM would fix this discrepancy.
2. When a message contains a placeholder followed by a localizable text
fragment that begins with a strong character (not a neutral character like a
space or punctuation), and the placeholder ends in a number, the visual
ordering that currently results for fake bidi localization is not equivalent to
that resulting for a real RTL translation: in an RTL context, with fake bidi,
the number appears to the left of the text fragment; with real RTL text, the
number appears to the right. For example, let's say that the placeholder value
is "12" and the localizable text fragment is "hello". Then, when fake bidi
changes the "hello" into "\u202Ehello\u202C", the overall output is
"12\u202Ehello\u202C". You can see the visual ordering specified for that by
the Unicode Bidi Algorithm in an RTL paragraph here:
http://unicode.org/cldr/utility/bidi.jsp?a=12%E2%80%AEhello%E2%80%AC&p=RTL; the
number is on the left. However, if the text fragment were the Hebrew character
alef, "\u05D0", and thus the whole string were "12\u05D0", the number would
come out on the right:
http://unicode.org/cldr/utility/bidi.jsp?a=12%D7%90&p=RTL. This is fixed by
adding the RLMs to fake bidi: "12\u200F\u202Ehello\u202C\u200F" is displayed
with the number on the right, as with real RTL text
(http://unicode.org/cldr/utility/bidi.jsp?a=12%E2%80%8F%E2%80%AEhello%E2%80%AC%E2%80%8F&p=RTL).
The same issue occurs when a placeholder follows a localizable
text fragment that ends in a strong character; this is why I am suggesting not
only to put an RLM before the RLO, but also to put an RLM after the PDF. It
may seem strange for a placeholder to come immediately before or after strong
text rather than a neutral such as a space or punctuation; text like "hello:
12" or "12: hello" is a lot more common than "hello12" or "12hello". However,
the same issue occurs (and is fixed by the RLMs) when the placeholder and the
localizable text fragment are separated by a non-localizable fragment of
markup that adds visual spacing between the two, e.g. "<span style='padding:
5px'>", and this is unfortunately a fairly common practice in HTML.
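Advantage 1 can be checked with a rough Python sketch of first-strong detection. This is a simplification of rules P2 and P3, not a full implementation: it assumes the text contains no directional isolates (which P2 requires skipping over), and the function name `first_strong_direction` is hypothetical. It relies only on the standard library's `unicodedata.bidirectional`, which reports the RLO and PDF characters as non-strong classes and the RLM as class "R".

```python
import unicodedata


def first_strong_direction(text, default="ltr"):
    """Return "ltr" or "rtl" based on the first strong character.

    Simplified sketch of UBA rules P2/P3: isolate handling is omitted,
    which is an assumption that holds for fake bidi output.
    """
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class == "L":
            return "ltr"
        if bidi_class in ("R", "AL"):
            return "rtl"
        # Everything else (RLO, PDF, digits, neutrals, ...) is skipped.
    return default


without_rlm = first_strong_direction("\u202Ehello\u202C")           # "ltr"
with_rlm = first_strong_direction("\u200F\u202Ehello\u202C\u200F")  # "rtl"
real_rtl = first_strong_direction("\u05D0")                         # "rtl"
```

Without the RLM, the RLO is skipped and the first strong character found is the "h", so the text is classified as LTR; with the RLM, the result matches that of real Hebrew text.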
Original issue reported on code.google.com by aha...@google.com on 7 Aug 2014 at 8:53