@fluent/react – does/should <Localized> work with a text node as a direct child?

projectfluent / fluent.js

JavaScript implementation of Project Fluent

https://projectfluent.org/

Apache License 2.0

936 stars 77 forks source link

@fluent/react – does/should <Localized> work with a text node as a direct child? #498

Open elisehein opened 4 years ago

elisehein commented 4 years ago

Hello!

I didn't initially notice that all the examples in the wiki and the example project for @fluent/react use non-text nodes as direct children of <Localized>, so I was implementing most of my localized content using text nodes instead:

<Localized id="foo">Placeholder text for foo</Localized>

This stopped working when I made use of the elems option:

// jsx
<Localized
  id="foo"
  elems={{ score: <span className="score"></span> }}
  vars={{ score: 1 }}>
  Placeholder text for score
</Localized>

// ftl
foo = Here's a styled score: <score>{$score}</score>

The above outputs the string Here's a styled score: <score>1</score>

Wrapping the children of <Localized> in an element fixes it:

// jsx
<Localized
  id="foo"
  elems={{ score: <span className="score"></span> }}
  vars={{ score: 1 }}>
  <span>Placeholder text for score</span>
</Localized>

The above outputs the HTML Here's a styled score: 1, as expected.

Should I always ensure to include a HTML element inside <Localized>, or is this only an issue with elems? I also remember seeing an example somewhere with no placeholder text at all, just an empty tag: <Localized></Localized>, making me think the tag is indeed always required.

Is there a convention/stance on usage here?

Thanks!

stasm commented 4 years ago

Ah, good questions :) This looks like it's a mix of two related issues:

The docs for Localized are not clear about the two forms you can call it with. It's possible to wrap both an element, with or without content (<Localized></Localized>), or just use a string child (<Localized>Text</Localized> or even <Localized id="foo"/> if the fallback text is not required).
When the second (text-only) form is used, we short-circuit the markup parsing logic. That's why you're seeing <score>1</score> as text rather than a parsed .

For the first issue, I've been meaning to file an issue about separating Localized into LocalizedElement and LocalizedText (names TBD). We'd keep Localized for compatibility, too; it would dispatch to one of the specialized versions based on the type of the child. What do you think about it? I like the fact that this approach would document the expect usage through the names of the API themselves.

For the second issue, we could fix it in the current code because we return a React fragment anyways; we're free to add elements into it even if the original copy was a simple string. This would also be beneficial in the future when/if we allow translations to add markup even if elems doesn't have it. This can be useful for text-level semantic elements like  and  which are related to the writing style of the target language.

elisehein commented 4 years ago

Thanks, that clears it up for me!

I like the fact that this approach would document the expect usage through the names of the API themselves.

Yes, that sounds nice to me too!

For the second issue, we could fix it in the current code because we return a React fragment anyways; we're free to add elements into it even if the original copy was a simple string. This would also be beneficial in the future when/if we allow translations to add markup even if elems doesn't have it. This can be useful for text-level semantic elements like  and  which are related to the writing style of the target language.

That sounds good in theory, but I can imagine some confusion in combination with <LocalizedElement> and <LocalizedText>. For example, if I choose to use <LocalizedText>, is it reasonable to expect  and  tags added if they're specified in a translation? To me <LocalizedText> would sound like only plain text is allowed, so I wouldn't expect to suddenly see words in bold.

stasm commented 4 years ago

Hmm, good point about the expectation of LocalizedText. Is this something we can fix via naming? Would LocalizedFragment be a better name? Or LocalizeToFragment?

I understand that the other option to fix the confusion is to forbid HTML output from LocalizedText, but I'm hesitant to do it. One of the principles of FLuent is that localizers are in control. Even when the English copy doesn't use a plural form, the translation still can. I think the same rule should apply to text-level markup. (I specifically mean text-level because for functional markup, like <button> or <img>, we shouldn't allow it if it's not in elems.)

elisehein commented 4 years ago

Perhaps if the allowed markup includes only text-level tags, <LocalizedText> isn't as misleading?

Would that list of tags strictly include only plain tags without attributes, or would localizers also be able to make use of classes etc? I imagine if I wanted a fragment to include something like {$username}, that's functional enough that it should be specified in elems instead – but by allowing text-level markup, I guess you'd be making it possible for translations to unintentionally add all sorts of functionality.

Even if translators wouldn't be doing this deliberately (can't think of why they would 😬 😄 ), I think it'd still make me uneasy knowing developers have the option to add quick hacks and stuff on the Fluent level where things should really be addressed on the source code level.

But I do agree with the sentiment that text-level semantic elements should be possible regardless of the choice of <Localized> variant.

stasm commented 4 years ago

The way this works in @fluent/dom is: there's a list of allowed text-level elements and a list of allowed attributes. Anything else is sanitized and removed, unless it has a data-l10n-name attribute and a corresponding element with the same data-l10n-name is present in the HTML source. In such case, the whole "overlaying" mechanism kicks in, copying attributes from source onto the translation.

In React, I made a decision a long time ago to not use data-l10n-name and instead use the element name to the same effect. Hence <score> in your example. I'm not sure this was a right decision: there are some issues with elements which look like HTML to the DOM parser but aren't; other issues with void elements; and finally, the problem of portability of translations: a translation using with @fluent/dom can't be easily moved to a project using @fluent/react. This also has implications on localization tools where it would be easier to check the quality of translations if there was one canonical way for using markup. Fixing this is out of scope of this issue, but I wanted to provide a little bit of background :)

Also, @fluent/react doesn't sanitize markup the way @fluent/dom does. The implementation is pretty simple: look up an element of the same name in elems and insert the translation segment into it. Feature parity with @fluent/dom is also out of scope here, but I'd like to do it at some point in the future, which is why I'm careful when designing the API :)

stasm commented 4 years ago

How about this?

Let's introduce LocalizedElement which requires the child to be a React element (so, an HTML element of a React component). It returns a cloned child with the translation inserted into it.
Introduce LocalizedText which requries the child to be a simple string. It returns a React fragment with the translation.
Change Localized to dispatch to one of the above specialized components based on the type of the child.
LocalizedText ignores elems so that no functional markup can be used with it.
In the future, we might add support for translation-specific text-level elements to LocalizedText. Since it returns a React fragment, this won't be a breaking change in the API.

elisehein commented 4 years ago

That all sounds good to me!

Thank you for the background info, very interesting getting a look into how this API is evolving :)

stasm commented 4 years ago

A quick update: I have a draft PR open which fails a few tests currently. I'll try to fix them tomorrow and write some new tests. @elisehein would you be intersted in giving me feedback on the PR and reviewing it when it's ready?

elisehein commented 4 years ago

@stasm Sure!

joepie91 commented 3 years ago

Hi, just popping in to make a note that I ran into this same issue, in case more real-world examples are useful. It was rather surprising to encounter my markup as a literal string without any processing, for no clear reason!

A simplified version of my case is the following:

login-text-continue = To continue, please login below. If you don't have an account yet, you should <register>register</register> one!

<Localized id="login-text-continue" elems={{ register: <a href="/register" /> }}>
    (Placeholder text for instructing the user to login or register)
</Localized>

In this case, I'm trying to avoid duplicating localized strings into the source code, instead using the placeholder text as an instruction to the developer. Because the placeholder text does not contain markup, @fluent/react assumes that neither will the localized string, but that is false.

This should probably at least be explicitly documented ("make sure that you include some kind of HTML/React element in the placeholder string!"). Though a fix is better, of course :)

Edit: Actually, it turns out that replacing the placeholder text with text that contains inline markup, breaks too; with the <Localized/> expected to receive a single React node child error. I suppose a wrapper element is needed. <>a fragment like this</> seems to work.