jdm / asknot

Ask not what Mozilla can do for you but what you can do for Mozilla.
http://whatcanidoformozilla.org
Other
252 stars 210 forks source link

Problem with using some HTML tags in locale file #125

Open aleksandrs-ledovskis opened 10 years ago

aleksandrs-ledovskis commented 10 years ago

Probably is follow up to gh-86.

Steps to reproduce:

  1. Visit any C language sub-page (e.g. this one)
  2. Switch to Italian locale - "it". Take notice, that "POO" string has additional text visible onhover. It uses <abbr> HTML tag for this purpose.
  3. Switch back to English locale. The string "POO" is still be present, placed somewhere at the end of the title.

This artifact most likely appears due to webL10n library's ignorance of non-TEXT_NODE DOM nodes (see source code)

jdm commented 10 years ago

Is this something we should file for webL10n instead?

aleksandrs-ledovskis commented 10 years ago

Probably, but this restriction was probably made deliberately. Better to ask @fabi1cazenave

fabi1cazenave commented 10 years ago

That’s because this string:

<span style="display: block;" class="question" data-l10n-id="c-intro">
  So you think OOP is for hipsters?
  That’s cool, we all get nostalgic sometimes.
  You could work on
</span>

is translated to this:

<span style="display: block;" class="question" data-l10n-id="c-intro">
  Quindi trovi che la <abbr title="Programmazione orientata agli oggetti">POO</abbr>
  sia per gli <span lang="en">hipster</span>?
  Bene, capita a tutti un po' di nostalgia. 
  Ti propongo
</span>

because the Italian translation has been done like this:

c-intro.innerHTML = Quindi trovi che la <abbr title="Programmazione orientata agli oggetti">POO</abbr> sia per gli <span lang="en">hipster</span>? Bene, capita a tutti un po' di nostalgia. Ti propongo

The problem is that the Italian translation uses .innerHTML for almost every string, whereas the default property that is localized by webL10n is .textContent: when the text is displayed in Italian, it creates new nodes (e.g. <abbr>) that are not internationalized — which means they’re not changed when switching back to English.

That’s the reason why locales should use .innerHTML only for the strings that use .innerHTML in English.

Note that you could allow l10n contributors to use .innerHTML whenever they want by doing an `s/textContent/innerHTML/ in this line, but that wouldn’t be as safe as .textContent obviously.

Note also that webL10n doesn’t reset a node content by default in order to preserve some edge cases like this one, where we want to keep the <span> element unchanged:

  <li data-l10n-id="wifi-level">
    Signal strength <span class="wifi3"></span> 
  </li>

This is handy in a lot of cases but the major drawback is that it doesn’t allow localizers to use .innerHTML when only the default .textContent property was intended. I’ve hesitated a lot before implementing this behavior, if you think it doesn’t make sense please open an issue on the webL10n repo to discuss it.

Last but not least: in the real world you probably don’t care much unless you leave the language selector in the production page (which I doubt). ;-)