Consider the following bit of HTML:
<p dir="ltr">
The title is <span dir="rtl">אבג <span dir="ltr">C++</span> דהו</span> in Hebrew.
</p>
This is a Latin paragraph containing a (faux) Hebrew book title that itself contains the Latin name "C++". The title as a whole should render right-to-left, with C++ rendering left-to-right. That is, read from left to right on screen, the title's words should appear in the visual order דהו, then C++, then אבג.
Without the spans, i.e.
<p dir="ltr">
The title is אבג C++ דהו in Hebrew.
</p>
this would render the title as 3 independent runs, resulting in the incorrect visual order (left to right on screen) אבג, then C++, then דהו.
The spans map directly to Right-to-Left Isolate (RLI, U+2067), Left-to-Right Isolate (LRI, U+2066), and Pop Directional Isolate (PDI, U+2069). As a Go string, this is
"The title is \u2067אבג \u2066C++\u2069 דהו\u2069 in Hebrew."
which I call the "annotated" version of the plain string
"The title is אבג C++ דהו in Hebrew."
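To make the span-to-isolate mapping concrete, here is a minimal sketch that builds the annotated string from its parts. The helpers `isolateRTL`, `isolateLTR`, and `annotatedTitle` are hypothetical names introduced for illustration only; they are not part of the bidi package. The Hebrew letters are written as escapes so that an editor's own reordering cannot disguise the logical order.

```go
package main

import "fmt"

// Directional isolate controls, as named in the Unicode standard.
const (
	rli = "\u2067" // Right-to-Left Isolate
	lri = "\u2066" // Left-to-Right Isolate
	pdi = "\u2069" // Pop Directional Isolate
)

// isolateRTL wraps s the way <span dir="rtl"> does (hypothetical helper).
func isolateRTL(s string) string { return rli + s + pdi }

// isolateLTR wraps s the way <span dir="ltr"> does (hypothetical helper).
func isolateLTR(s string) string { return lri + s + pdi }

// annotatedTitle rebuilds the annotated sentence in logical order.
func annotatedTitle() string {
	// U+05D0..U+05D2 and U+05D3..U+05D5 are the two faux Hebrew words.
	title := "\u05D0\u05D1\u05D2 " + isolateLTR("C++") + " \u05D3\u05D4\u05D5"
	return "The title is " + isolateRTL(title) + " in Hebrew."
}

func main() {
	// %+q escapes all non-ASCII characters, so the isolates are visible
	// and the string is shown in logical order.
	fmt.Printf("%+q\n", annotatedTitle())
}
```

Printing with `%+q` is also a convenient way to compare the plain and annotated strings without relying on how a terminal or browser reorders them.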
However, when I run the following code that uses the bidi package, both the plain and the annotated string result in the same, incorrect visual order:
package main

import (
	"fmt"
	"log"

	"golang.org/x/text/unicode/bidi"
)

func main() {
	plain := "The title is אבג C++ דהו in Hebrew."
	// This uses RLI, LRI, and PDI to achieve the equivalent to
	// The title is <span dir="rtl">אבג <span dir="ltr">C++</span> דהו</span> in Hebrew.
	annotated := "The title is \u2067אבג \u2066C++\u2069 דהו\u2069 in Hebrew."
	for _, s := range []string{plain, annotated} {
		var p bidi.Paragraph
		p.SetString(s, bidi.DefaultDirection(bidi.LeftToRight))
		ord, err := p.Order()
		if err != nil {
			log.Fatal(err)
		}
		for i := range ord.NumRuns() {
			run := ord.Run(i)
			fmt.Printf("%d %d %q\n", i, run.Direction(), run.String())
		}
		fmt.Println()
	}
}
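This prints the following (run index, direction with 0 = LeftToRight and 1 = RightToLeft, run text), first for the plain string and then for the annotated one:
0 0 "The title is "
1 1 "אבג"
2 0 " C++ "
3 1 "דהו"
4 0 " in Hebrew."

0 0 "The title is \u2067"
1 1 "אבג \u2066"
2 0 "C++"
3 1 "\u2069 דהו"
4 0 "\u2069 in Hebrew."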
bidi.go has the following comment:
// This API tries to avoid dealing with embedding levels for now. Under the hood
// these will be computed, but the question is to which extent the user should
// know they exist. We should at some point allow the user to specify an
// embedding hierarchy, though.
but I'd still expect the computed visual order to be correct with respect to the embedding levels, even if the levels themselves aren't exposed to the user.
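To spell out what "correct with respect to the embedding levels" would mean here, the following sketch assigns embedding levels to the annotated string by hand and applies only the reversal step of UAX #9 rule L2. The levels are my own reading of the algorithm (outer isolate contents at level 1, inner "C++" at level 2), not values obtained from the bidi package, and `seg`, `titleSegs`, and `visualOrder` are hypothetical names used for illustration.

```go
package main

import "fmt"

// seg is a piece of the logical string with a hand-assigned embedding level.
type seg struct {
	text  string
	level int
}

// titleSegs returns the annotated sentence split by assumed embedding level.
func titleSegs() []seg {
	return []seg{
		{"The title is ", 0},
		{"\u2067", 0}, // RLI takes the level of the surrounding text
		{"\u05D0\u05D1\u05D2 ", 1},
		{"\u2066", 1}, // LRI, inside the right-to-left isolate
		{"C++", 2},
		{"\u2069", 1}, // PDI closing the inner isolate
		{" \u05D3\u05D4\u05D5", 1},
		{"\u2069", 0}, // PDI closing the outer isolate
		{" in Hebrew.", 0},
	}
}

// visualOrder applies the reversal of rule L2: from the highest level down
// to the lowest odd level, reverse every maximal sequence of runes that is
// at that level or higher.
func visualOrder(segs []seg) string {
	var runes []rune
	var levels []int
	for _, s := range segs {
		for _, r := range s.text {
			runes = append(runes, r)
			levels = append(levels, s.level)
		}
	}
	highest, lowestOdd := 0, -1
	for _, l := range levels {
		if l > highest {
			highest = l
		}
		if l%2 == 1 && (lowestOdd < 0 || l < lowestOdd) {
			lowestOdd = l
		}
	}
	if lowestOdd < 0 {
		return string(runes) // no right-to-left content, nothing to reverse
	}
	for level := highest; level >= lowestOdd; level-- {
		for i := 0; i < len(runes); {
			if levels[i] < level {
				i++
				continue
			}
			j := i
			for j < len(runes) && levels[j] >= level {
				j++
			}
			// Reverse runes[i:j], keeping each level attached to its rune.
			for a, b := i, j-1; a < b; a, b = a+1, b-1 {
				runes[a], runes[b] = runes[b], runes[a]
				levels[a], levels[b] = levels[b], levels[a]
			}
			i = j
		}
	}
	return string(runes)
}

func main() {
	// %+q keeps the result readable regardless of terminal reordering.
	fmt.Printf("%+q\n", visualOrder(titleSegs()))
}
```

Under these assumed levels, the word U+05D3..U+05D5 ends up to the left of "C++" and the word U+05D0..U+05D2 to its right, each with its letters reversed, which is the left-to-right screen order the browsers produce. The sketch covers only the reordering step; resolving the levels themselves is the part of the algorithm the package would have to get right.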
I've confirmed with Firefox and Chrome that my use of RLI/LRI/PDI produces the expected rendering that is identical to the one using spans.
(Take special care when reading this issue in a browser that handles right-to-left text: the strings in the code samples and output will be displayed in visual order, not logical order. I've attached all code as an archive to avoid confusion. For Emacs users, (setq bidi-display-reordering nil) is a handy way of disabling reordering to be able to inspect file contents in logical order.)
Attachment: bidi.tar.gz