projectfluent / fluent-rs

Rust implementation of Project Fluent
https://projectfluent.org
Apache License 2.0

Output looks the same, but is different #330

Closed sverro2 closed 11 months ago

sverro2 commented 1 year ago

I made a small test project that reproduces an issue I ran into while working on another project. It can be found at: https://github.com/sverro2/fluent-bug-test

For people who don't want to clone, I'll also add the code as a comment to this issue :)

The issue: I created two messages that should produce identical output. They don't!

sverro2 commented 1 year ago

Leaving the code here to copy/look at:

use fluent::{FluentResource, FluentBundle, FluentArgs, FluentValue};
use unic_langid::LanguageIdentifier;

fn main() {
    let ftl_string = String::from(r#"
just-url-dynamic = content before link<a href="{$url}">this is a url</a>content after link
link-dynamic = content before link{$url}content after link"#);
    let res = FluentResource::try_new(ftl_string)
        .expect("Failed to parse an FTL string.");

    let langid_en: LanguageIdentifier = "en-US".parse().expect("Parsing failed");
    let mut bundle = FluentBundle::new(vec![langid_en]);

    bundle
        .add_resource(res)
        .expect("Failed to add FTL resources to the bundle.");

    // Test two messages that should produce identical FTL output strings.

    // Start with a sentence where only the URL is passed in.
    let mut just_url_dynamic_arg = FluentArgs::new();
    just_url_dynamic_arg.set("url", FluentValue::from("https://projectfluent.org/"));

    let msg = bundle.get_message("just-url-dynamic")
        .expect("Message doesn't exist.");
    let mut errors = vec![];
    let pattern = msg.value().expect("Message has no value.");
    let just_url_dynamic_value = bundle.format_pattern(&pattern, Some(&just_url_dynamic_arg), &mut errors);

    // Then instead of passing in the url, pass in the entire html link.
    let mut link_dynamic_arg = FluentArgs::new();
    link_dynamic_arg.set("url", FluentValue::from(r#"<a href="https://projectfluent.org/">this is a url</a>"#));

    let msg = bundle.get_message("link-dynamic")
        .expect("Message doesn't exist.");
    let mut errors = vec![];
    let pattern = msg.value().expect("Message has no value.");
    let link_dynamic_value = bundle.format_pattern(&pattern, Some(&link_dynamic_arg), &mut errors);

    // Both do look the same when printed.
    println!("Compare these two, they are the same, right?");
    println!("{link_dynamic_value}\n{just_url_dynamic_value}");
    println!("But are they? See for yourself: {}", link_dynamic_value == just_url_dynamic_value);
    println!("Just try to open both urls in your browser.");

    // They aren't the same. What does it actually look like?
    println!("\n\nByte arrays:\n{:?}\n{:?}\n\n", link_dynamic_value.as_bytes(), just_url_dynamic_value.as_bytes());

    // YUP they indeed are different!
}

sverro2 commented 1 year ago

This turned out to be caused by Unicode Isolation. I feel a bit stupid, but maybe the documentation on this could be clearer or easier to find, to save other people from wasting time. For instance, https://github.com/projectfluent/fluent.js/wiki/Unicode-Isolation (which took me a while to find, because I didn't know Unicode Isolation was the issue) describes how to enable and disable the feature. It doesn't say that, with isolation enabled, the URL in the Arabic translation will look great but won't actually work when used in an HTML link!

How did I find out what the problem was? The FAQ section at the very bottom of the fluent-templates page finally gave me the clue that confirmed the real problem: https://crates.io/crates/fluent-templates
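
For anyone who lands here with the same symptom, below is a minimal std-only sketch of what isolation does to the two messages above. It does not call fluent-rs; it simulates the formatter by wrapping each interpolated value in U+2068 (FIRST STRONG ISOLATE) and U+2069 (POP DIRECTIONAL ISOLATE), which are the invisible marks behind the differing byte arrays. The helpers `isolate` and `strip_isolation` are hypothetical names made up for this illustration, not part of any Fluent API.

```rust
/// U+2068 FIRST STRONG ISOLATE, placed before an interpolated value.
const FSI: char = '\u{2068}';
/// U+2069 POP DIRECTIONAL ISOLATE, placed after an interpolated value.
const PDI: char = '\u{2069}';

/// Hypothetical helper: wrap a value the way an isolating formatter would.
fn isolate(value: &str) -> String {
    let mut out = String::with_capacity(value.len() + 6);
    out.push(FSI);
    out.push_str(value);
    out.push(PDI);
    out
}

/// Hypothetical helper: remove isolation marks from formatted output.
fn strip_isolation(s: &str) -> String {
    s.chars().filter(|c| *c != FSI && *c != PDI).collect()
}

fn main() {
    let url = "https://projectfluent.org/";
    let link = format!(r#"<a href="{url}">this is a url</a>"#);

    // just-url-dynamic: only the URL is interpolated, so the marks end up
    // inside the href attribute and become part of the address.
    let just_url = format!(
        r#"content before link<a href="{}">this is a url</a>content after link"#,
        isolate(url)
    );

    // link-dynamic: the whole <a> element is interpolated, so the marks sit
    // around the element and the href itself stays clean.
    let whole_link = format!(
        "content before link{}content after link",
        isolate(&link)
    );

    // The strings print identically but are not equal...
    assert_ne!(just_url, whole_link);
    // ...and become equal once the invisible marks are removed.
    assert_eq!(strip_isolation(&just_url), strip_isolation(&whole_link));

    println!("{:?}\n{:?}", just_url, whole_link);
}
```

Printing with `{:?}` escapes the marks as `\u{2068}` and `\u{2069}`, which makes the difference visible without reading byte arrays. In fluent-rs itself, calling `bundle.set_use_isolating(false)` on the `FluentBundle` turns the marks off for the whole bundle; whether that is appropriate depends on whether the output mixes left-to-right and right-to-left text.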

zbraniecki commented 1 year ago

Hi!

As someone who recently ran into this issue, do you have a suggestion for where this should be documented, and how? I'd accept a PR with a suggestion.