Closed JohnnyJayJay closed 4 years ago
What you encountered is a method of handling bidirectionality documented here: https://github.com/projectfluent/fluent/wiki/BiDi-in-Fluent
The reason you encountered is something we argue against - tests should not test against any particular full output of a formatter (so, output.contains("John")
is ok, but assert_eq!(output, "Hello John")
is not) - https://firefox-source-docs.mozilla.org/l10n/fluent/tutorial.html#testing
If you really need to turn it off for testing, there's - https://github.com/projectfluent/fluent-rs/blob/master/fluent-bundle/src/bundle.rs#L364
Okay, that makes perfect sense. I wasn't aware of this behaviour. Implemented a binding for it straight away and it works. Thanks a bunch!
Can this be reopened? It seems to be the default behaviour for any use of fluent
, including through upstream tools like i18n-embed
. Over the past few days I've been investing this, and my findings are summarized here: https://github.com/kellpossible/cargo-i18n/issues/44
In essence, the presence of the FSI/PDI characters is detected by the terminal and appears like this:
This occurs regardless of which shell or terminal program I test it in.
The link given in @zbraniecki 's last post points to a code line on the master
branch which has almost certainly changed since April. What was it supposed to point to?
Thanks for any help you can offer, fluent is very exciting to me and I plan to use it widely.
Can this be reopened?
What's the purpose of reopening it?
It seems to be the default behaviour for any use of fluent
Correct. And it it the intended default behavior with accordance to internationalization best practices https://github.com/projectfluent/fluent/wiki/BiDi-in-Fluent
The issue you are reporting is of:
i18n-embed
doesn't allow for turning off bidi isolation -> feature-request against themWhat would you like to add to fluent-rs?
Thanks for getting back to me so quickly. My motivation for reopening was that to the layman who isn't familiar with the low-level intricacies of localization, the output appeared to be a bug. I'll accept the answer that it isn't, but perhaps the visibility of this issue/fix could be advertised somewhere, like a README?
For now, where was the link you posted to @JohnnyJayJay about the bundle tweak supposed to point to? I'll try to test what he tested, and then make a suggestion upstream to i18n-embed
.
Ah ha, it must have been pub fn set_use_isolating(&mut self, value: bool)
. Testing now.
EDIT: This indeed fixes the issue:
bundle.set_use_isolating(false);
I understand the advertised concern and use case, namely:
This is important for cases such as when a right-to-left user name is presented in the left-to-right message.
In the case of my application, I'm in complete control of all strings and localization targets, so there's no need for the bidirectionality. It should be safe to turn it off completely then, yeah?
My motivation for reopening was that to the layman who isn't familiar with the low-level intricacies of localization, the output appeared to be a bug.
It is. In the terminal that doesn't support FSI/PDI :) Or in the fact that you're trying to analyze output of an Intl API which is a no-go (output of internationalization is to be treated as opaque).
Ah ha, it must have been pub fn set_use_isolating(&mut self, value: bool).
https://docs.rs/fluent/0.13.1/fluent/bundle/struct.FluentBundleBase.html#method.set_use_isolating
In the case of my application, I'm in complete control of all strings and localization targets, so there's no need for the bidirectionality. It should be safe to turn it off completely then, yeah?
Maybe. Here are some of the scenarios where it is not:
I hope you can start drawing your own ideas on where it may come to play.
As for docs - I'm happy to see an issue opened on adding better docs, improving API docs, and adding README paragraph on BiDi. I'm aware that a lot of users do not realize that challenge when they start including localization systems and also that Fluent is the first to correctly use FSI/PDI to allow bidi text flow.
Thank you for this, it's a great help. I'm sure others will wander in here from the future and be put at ease by the explanation.
Out of curiosity, what terminal do you use? I was noticing the issue in both xterm
and urxvt
, both popular terminals, so it seems some fixes are necessary there. Otherwise, I'm all for standards, and it's not surprising that we'd experience friction when a new standard arises. I hope fluent does well! I am adopting it for my main FOSS project.
Out of curiosity, what terminal do you use?
I use nushell
and zsh
, but also tested in bash
. All three handle Unicode properly.
Thanks. I think it must be the terminal and not the shell, since I was unable to get it to work in fish
nor bash
via urxvt
, xterm
, or alacritty
. And for all I know, sysadmins are using my tool to manage systems remotely via TERM=linux
, so I can't make any assumptions there, or tell them what terminal to use. For now I've upstreamed the isolation mark disabling to cargo-i18n
.
Thank you for your help, this was a productive development.
I encountered this issue while writing tests for my Java bindings. I noticed that the return value of
format_pattern
adds additional characters before and after each argument. At first I thought this may be caused by the conversions between Java and Rust strings, but upon further investigation, this also happens in the simplest and purest possible Rust scenario.This is the entire code:
Expected output:
Actual output:
The values look the same, but there are some zero-width characters before and after "test" and 3.141 respectively. You can see for yourself by pasting them into some string comparison tool.
The characters change the output on devices that are able to display them, for example here in a bash terminal:![Terminal output](https://i.imgur.com/UyQPUp6.png)
For this pure test, I used
fluent
version0.11.0
, but it also happens directly withfluent-bundle
. I'm running on Windows 10 64 Bit.