projectfluent / fluent-rs

Rust implementation of Project Fluent
https://projectfluent.org
Apache License 2.0
1.07k stars 96 forks source link

Consider switching `FluentMessage::attributes` to be a vector #145

Closed zbraniecki closed 4 years ago

zbraniecki commented 4 years ago

Currently, we define FluentMessage as

#[derive(Debug, PartialEq)]
pub struct FluentMessage<'m> {
    pub value: Option<&'m ast::Pattern<'m>>,
    pub attributes: HashMap<&'m str, &'m ast::Pattern<'m>>,
}

which means that the order of attributes is not preserved from the FTL source because HashMap doesn't preserve key order.

We could solve it by switching it to:

#[derive(Debug, PartialEq)]
pub struct Attribute<'m> {
    pub name: &'m str,
    pub value: &'m ast::Pattern<'m>,
}

#[derive(Debug, PartialEq)]
pub struct FluentMessage<'m> {
    pub value: Option<&'m ast::Pattern<'m>>,
    pub attributes: Vec<Attribute<'m>>
}

@stasm - what do you think?

stasm commented 4 years ago

which means that the order of attributes is not preserved from the FTL source because HashMap doesn't preserve key order.

That's OK. The order of attributes in the source is significant only during parsing. When the AST is converted into a collection of FluentMessages (e.g. in add_resource or in get_message), the runtime controls what happens to attributes with name conflicts. This isn't specified right now; the de facto behavior in fluent.js is last wins. fluent-rs currently behaves the same way. I wouldn't change it until we discuss this in the context of the formatting specification.

zbraniecki commented 4 years ago

I encountered it here: https://searchfox.org/mozilla-central/source/intl/l10n/test/test_pseudo.js#58-59

For some reason (luck?) the JS implementation preserves the order and attributes[0] is always the first attribute. With Rust implementation that got lost and I had to switch to a helper function to get the right attribute out of the vector of attributes.

It feels like a papercut, but if intentional, I'm ok keeping it like that. Just wanted to run it by you

stasm commented 4 years ago

Oh, interesting! This isn't a FluentMessage :) It's an intermediate structure holding the resolved value and resolved attributes; see https://searchfox.org/mozilla-central/source/intl/l10n/Localization.jsm#565.

zbraniecki commented 4 years ago

That's also FluentMessage - https://github.com/projectfluent/fluent-rs/blob/master/fluent-bundle/src/bundle.rs#L27-L30 - the HashMap doesn't guarantee keys order. My question is if it would make sense to turn it into a Vec.

stasm commented 4 years ago

I'm not sure if I follow. The structure in the test you linked to isn't the FluentMessage returned by get_message. It's a different struct whose attributees are designed to match the shapes of DOM attributes: hence an array of {name, value} objects where value is a fully resolved string.

zbraniecki commented 4 years ago

Right, so let me walk you through my thinking.

Currently, to get to the line in test that I linked, we do https://searchfox.org/mozilla-central/source/intl/l10n/Localization.jsm#563 which takes the keys() a message stored in https://searchfox.org/mozilla-central/source/intl/l10n/Fluent.jsm#651 and created here https://searchfox.org/mozilla-central/source/intl/l10n/Fluent.jsm#877

Accidentally, this has been preserving order of attributes from the FTL source.

Now, when I switched to fluent-rs, I have FluentMessage which stores attributes as a HashMap and that does not guarantee the order of keys. In result this test started failing.

I noticed the discrepancy and reported it so that we can figure out if we want to make any guarantees here (and Rust should be fixed), or not (in which case it may be worth documenting the lack of guarantees on ordering).

stasm commented 4 years ago

I see, thanks for explaining. I think FluentMessage.attributes should be a HashMap. I'd fix the test by removing one attribute, so that there's only one.

zbraniecki commented 4 years ago

Thanks!