mozilla / fireplace

:fire: Frontend for the Firefox Marketplace
https://marketplace.firefox.com/

Figure out L10n #24

Closed cvan closed 11 years ago

cvan commented 11 years ago

Do it.

cvan commented 11 years ago

@mattbasta what do you think of Jed?

mattbasta commented 11 years ago

Seems like a much wordier version of webL10n with fewer benefits.

On Feb 18, 2013 12:08 PM, "Chris Van" wrote:

@mattbasta what do you think of Jed (http://slexaxton.github.com/Jed/)?

almet commented 11 years ago

check out http://fabi1cazenave.github.com/webL10n/ (and that's by one guy at mozilla, from the gaia team).

mattbasta commented 11 years ago

That's what we're using now and there are a lot of shortcomings :-/

On Feb 19, 2013 9:43 AM, "Alexis Metaireau" wrote:

check out http://fabi1cazenave.github.com/webL10n/ (and that's by one guy at mozilla, from the gaia team).

fabi1cazenave commented 11 years ago

I’d be happy to address your suggestions regarding webL10n, feel free to open issues.

What are the main limitations for your use cases?

mattbasta commented 11 years ago

Hey Fabien, thanks for the offer. Here are some of the things that we're feeling now:

On the flip side, some things that we love about webL10n that make us really hesitant to use other solutions:

Let me know your thoughts. Right now, our tentative plan is to write a wrapper/precompiler that generates our locale files for us, but that's suboptimal as a long-term solution.

fabi1cazenave commented 11 years ago

Thanks for this detailed feedback. I’m afraid there are some points that I didn’t understand but here’s a first reply — hope it helps.

The data attribute system doesn't make concessions for localizing anything other than the inner text of the node. E.g.: there's no way to localize the placeholder="" attribute of an <input> tag. Or at least as far as I've seen, there isn't.

Easy one:

<input type="text" data-l10n-id="searchField">
searchField.placeholder = Search…

Note: this works with all properties — including .innerHTML
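
As a rough illustration of the dotted-key convention above (not webL10n's actual implementation), the sketch below assumes the locale data has already been parsed into a flat object and copies each `id.property` entry onto the matching element. The `applyEntries` helper and the plain objects standing in for DOM elements are hypothetical.

```javascript
// Hypothetical helper: applies parsed locale entries to elements.
// `entries` is a flat object such as { 'searchField.placeholder': 'Search…' }.
// `elementsById` maps a data-l10n-id value to an element-like object.
function applyEntries(entries, elementsById) {
  Object.keys(entries).forEach(function (key) {
    var dot = key.lastIndexOf('.');
    // Assumption: a key without a dot targets the element's text content.
    var id = dot === -1 ? key : key.slice(0, dot);
    var prop = dot === -1 ? 'textContent' : key.slice(dot + 1);
    var element = elementsById[id];
    if (element) {
      element[prop] = entries[key];
    }
  });
}

// Usage with a plain object standing in for the <input> element.
var searchField = {};
applyEntries({ 'searchField.placeholder': 'Search…' }, { searchField: searchField });
```

In a browser the second argument could be built from `document.querySelectorAll('[data-l10n-id]')`.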

It's not practical for us to manually declare IDs in the l10n.ini file for each localizable string and reference them by ID in the template. Ideally, we'd have a compilation step which plucks out the strings from our templates and builds the locale file for us, but that's a good deal of work that we'd like to avoid duplicating if at all possible.

I’m not sure I follow you here, so excuse me if I missed your point.

Continuing off the last point, the .get() function doesn't provide a great way to expose pluralized strings (i.e.: ngettext) without using macros, which necessitates manually building the locale file.

Again, I’m not sure I follow you here. Consider this example:

messageCount = You have {{count}} new message(s).
var str = _('messageCount', { count: 12 });

you get: You have 12 new message(s) — which will be sub-optimal when count is zero or one, but you can trust your l10n contributors to propose better strings, e.g.:

messageCount = {[ plural(count) ]}
messageCount[zero]  = No new messages.
messageCount[one]   = You have a new message.
messageCount[other] = You have {{count}} new message(s).

You can also provide a default value directly in the HTML file:

<p data-l10n-id="messageCount" data-l10n-args='{ "count": 12 }'></p>
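
To make the plural lookup concrete, here is a minimal sketch of how a `[zero]`/`[one]`/`[other]` entry could be selected and filled in. It is not webL10n's code: `pluralCategory` and `getPlural` are hypothetical names, the rule is English-only, and the entries are assumed to be parsed into a flat object already.

```javascript
// Hypothetical English-only plural rule; real rules differ per language.
function pluralCategory(n) {
  if (n === 0) return 'zero';
  if (n === 1) return 'one';
  return 'other';
}

// Picks the entry for the right plural category and fills in {{name}} arguments.
function getPlural(entries, id, args) {
  var category = pluralCategory(args.count);
  // Fall back to the [other] form when a category has no entry.
  var template = entries[id + '[' + category + ']'] || entries[id + '[other]'];
  return template.replace(/\{\{\s*(\w+)\s*\}\}/g, function (match, name) {
    return name in args ? String(args[name]) : match;
  });
}

var messageEntries = {
  'messageCount[zero]': 'No new messages.',
  'messageCount[one]': 'You have a new message.',
  'messageCount[other]': 'You have {{count}} new message(s).'
};

var twelve = getPlural(messageEntries, 'messageCount', { count: 12 });
```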

This might be off-topic, as I didn’t understand the “…which necessitates manually building the locale file” part. :-/

Right now, our tentative plan is to write a wrapper/precompiler that generates our locale files for us, but that's suboptimal as a long-term solution.

Sorry if that’s a silly question, but what would your long-term solution look like?

fabi1cazenave commented 11 years ago

Maybe the misunderstanding comes from the fact that webL10n has only been tested on the client side, and you’re looking for a server-side solution?

mattbasta commented 11 years ago

I appreciate the response. Are you by any chance in MV? We could discuss this in person.

I'll respond inline:

Note: this works with all properties — including .innerHTML…

Excellent! Didn't know this existed. Are there docs for this? I must have missed it while reading through.

The downside here again, though, is that the actual text is outside of the template. We'd like to try, if we can, to make sure that we're not externalizing our localizable strings if possible. We've got a lot of text in the Marketplace and keeping it all in one central place (i.e.: the templates) is going to be pretty key.

extracting l10n data from an existing HTML file should be rather easy — @mathjazz told me in FOSDEM he’d propose a patch, which gave me a perfect excuse not to work on that.

Ideally we could do this without using IDs. It's not often possible to know where a string is going to land in the markup when it's generated.

templates: I’d love to make webL10n more template-friendly. So far I considered “porting” webL10n to existing template engines (e.g. handlebars.js), or even including a minimal template engine in webL10n (e.g. doT) — your opinion would be valuable on this.

We're using a branch of nunjucks for templates on the client side since we use jinja2 for templates on the server side. We use a lot of the features that they provide, and it's a pretty clean markup style. Keeping things as modular as possible would be best, IMHO, so I don't know if we'd be up for switching engines, but if webL10n could be made to be more nunjucks-friendly, we could definitely look into using that.

What do you mean, specifically, when you say "porting" webL10n?

pre-compilation: for Gaia we pre-localize our HTML documents using webL10n. We have some XPCshell magic for this (see our webapp-optimize.js script), it should be rather easy to make a Node.js version of it.

For templates that use the data attributes, that's a pretty easy process, it seems. What we're looking to do is extract the localizable strings from the calls to _('...'). When _() would get called, it would slugify the string and use that as the ID for a navigator.webL10n.get() call. More verbosely:

  1. _('App Ratings:') would call navigator.webL10n.get('app_ratings_')
  2. The "precompiler" would find the _('App Ratings:') call in the templates and JS and automatically add the appropriate line in l10n.ini:
app_ratings_ = App Ratings:
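
The two steps above could look roughly like the sketch below. Everything here is an assumption for illustration: `slugify`, `makeGettext` and `extractEntries` are hypothetical names, the lookup function is passed in (in the browser it would be `navigator.webL10n.get`), a missing translation is assumed to come back as a falsy value, and the extraction regex ignores escaped quotes.

```javascript
// Lowercase the string and collapse every run of other characters into "_".
function slugify(text) {
  return text.toLowerCase().replace(/[^a-z0-9]+/g, '_');
}

// Runtime side: builds a gettext-style _() around any lookup function.
function makeGettext(get) {
  return function _(text, args) {
    var translated = get(slugify(text), args);
    // Assumption: fall back to the source string when nothing is found.
    return translated || text;
  };
}

// Precompiler side: finds _('...') calls in template or JS source
// and emits one "slug = text" locale line per distinct string.
function extractEntries(source) {
  var pattern = /_\('([^']*)'/g; // simplified: no escaped quotes
  var lines = [];
  var match;
  while ((match = pattern.exec(source)) !== null) {
    var line = slugify(match[1]) + ' = ' + match[1];
    if (lines.indexOf(line) === -1) lines.push(line);
  }
  return lines;
}

var extracted = extractEntries("<h3>{{ _('App Ratings:') }}</h3>");
```

One caveat with this scheme: different strings can slugify to the same ID (for example `Rating:` and `Rating?`), so a real precompiler would need to detect collisions.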

you get: You have 12 new message(s) — which will be sub-optimal when count is zero or one, but you can trust your l10n contributors to propose better strings, e.g.: ....

I love the ability for the localizers to provide pluralizations. But without the ability to provide the "base string" to the .get() function, we can't have functionality like this:

_ngettext('1 rating', '{n} ratings', rating_count)

Conceivably, this would just be a feature in our hypothetical precompiler. The above JS call would be expanded out to:

n_rating = {[ plural(n) ]}
n_rating[zero] = 0 ratings
n_rating[one] = 1 rating
n_rating[other] = {{n}} ratings

...and that would (hypothetically) happen automatically for us.
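
A minimal sketch of that expansion step, assuming the precompiler is handed the key explicitly (how the key would be derived from the strings is left open in the discussion). `expandNgettext` is a hypothetical name.

```javascript
// Expands one _ngettext(singular, plural, n) call into locale lines
// that use the plural macro. The key is supplied by the caller.
function expandNgettext(key, singular, plural) {
  return [
    key + ' = {[ plural(n) ]}',
    // The zero form reuses the plural string with a literal 0.
    key + '[zero] = ' + plural.replace('{n}', '0'),
    key + '[one] = ' + singular,
    // Rewrite the {n} placeholder into the {{n}} argument syntax.
    key + '[other] = ' + plural.replace('{n}', '{{n}}')
  ];
}

var ratingLines = expandNgettext('n_rating', '1 rating', '{n} ratings');
```

Languages with more plural categories than English would need additional forms supplied by localizers, so this only covers generating the source-language file.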

This might be off-topic, as I didn’t understand the “…which necessitates manually building the locale file” part. :-/

Basically, to take advantage of the pluralization macro as it exists today (without any modification to the webL10n library), we'd need to touch the l10n.ini file manually. Our ideal future would not require us to modify the locales file.

Sorry if that’s a silly question, but what would your long-term solution look like?

We're not sure yet. As it stands, we've got the equivalent of a "precompiler" running on zamboni right now (tower?). While this might be our inevitable future, we'd like to have a system that runs in parallel to development and is simply part of our build script to deploy.

Maybe the misunderstanding comes from the fact that webL10n has only been tested on the client side, and you’re looking for a server-side solution?

The templates are both server AND client side. We're planning a node.js rendering system that can build static pages when they're requested, but on the desktop side, the site will be rendered entirely by the same template files on the client side.

fabi1cazenave commented 11 years ago

Unfortunately I’m in France… feel free to invite me to MV. :) The doc is on the project page but obviously it’s unclear: https://github.com/fabi1cazenave/webL10n#quick-start

I’ll reply to the plural part in another message. I still need a bit of time to understand that part.

The templates are both server AND client side. We're planning a node.js rendering system that can build static pages when they're requested, but on the desktop side, the site will be rendered entirely by the same template files on the client side.

OK, that’s the part I missed. Looking into jinja2 and nunjucks; I confess I don’t know them at all.

The downside here again, though, is that the actual text is outside of the template.

There are two possible approaches:

I don’t have any strong opinion on this; I prefer the former for the readability, but for FirefoxOS/Gaia our l10n teams and most developers who worked on this preferred the latter. Anyway, that’s your choice and both approaches should be supported.

If you want to have all the English text directly in the template, I guess you could define a list of the localizable attributes (e.g. title, placeholder…) and extract the text in those attributes exactly like you’d extract the text from an element.

What do you mean, specifically, when you say "porting" webL10n?

I’m not a template expert (bold understatement) but I think there’s no clear boundary between a template engine and a localization system. So instead of running webL10n and a template engine (which is likely to cause two reflows instead of one on the client side in some situations), a better pattern would probably be to include webL10n directly in the template engine (possibly as an external module if the template engine is extensible): ideally, developers could leave the localization to the template engine.

Again, I don’t know if that’d make sense — your opinion would be interesting. I had a quick look at handlebars.js because it’s very popular with MVC approaches and the syntax is very close to the one we use in webL10n, but I haven’t started to work on it.

fabi1cazenave commented 11 years ago

extracting l10n data from an existing HTML file should be rather easy — @mathjazz told me in FOSDEM he’d propose a patch, which gave me a perfect excuse not to work on that.

Ideally we could do this without using IDs. It's not often possible to know where a string is going to land in the markup when it's generated.

I agree that having to define data-l10n-id attributes everywhere is painful / repetitive, but I don’t see why it should be a problem for dynamically generated markup? Several HTML elements can have the same data-l10n-id.

mattbasta commented 11 years ago

I agree that having to define data-l10n-id attributes everywhere is painful / repetitive, but I don’t see why it should be a problem for dynamically generated markup? Several HTML elements can have the same data-l10n-id.

Consider the use case where we have localized text between two elements:

{{ _('Rating:') }} <span class="stars star-{{ star_count }}"></span>

The "Rating:" text doesn't actually have its own element.

Conceivably, this doesn't happen terribly often and in the cases where it does happen, we could work around it by adding elements. It would be ideal to not need this, though, and to continue using the .get() function (or a wrapped version of it) as we're currently doing instead.

fabi1cazenave commented 11 years ago

We have a few similar cases in Gaia, and webL10n has a magical, undocumented feature to handle them:

<p data-l10n-id="ratings">
  Ratings: <span class="stars star-3"></span>
</p>
[en-US]
ratings = Ratings:
[fr]
ratings = Évaluations :

When a target node has element children, webL10n translates the first non-empty text child (i.e. the “Ratings:” string in this case) and leaves the rest of the element alone.
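
The behaviour described above can be sketched as follows. This is an illustration of the idea rather than webL10n's source: `translateFirstTextChild` is a hypothetical name, and plain objects with `nodeType`/`nodeValue` stand in for DOM nodes so the snippet runs outside a browser.

```javascript
var TEXT_NODE = 3; // same value as Node.TEXT_NODE in the DOM

// Replaces the first non-empty text child and leaves element children alone.
// Returns true when a text child was translated.
function translateFirstTextChild(element, translation) {
  var children = element.childNodes;
  for (var i = 0; i < children.length; i++) {
    var child = children[i];
    if (child.nodeType === TEXT_NODE && child.nodeValue.trim() !== '') {
      // Simplification: surrounding whitespace is replaced as well.
      child.nodeValue = translation;
      return true;
    }
  }
  return false;
}

// Stand-in for the <p data-l10n-id="ratings"> node in the example.
var paragraph = {
  childNodes: [
    { nodeType: TEXT_NODE, nodeValue: '\n  Ratings: ' },
    { nodeType: 1, className: 'stars star-3' },
    { nodeType: TEXT_NODE, nodeValue: '\n' }
  ]
};
var translated = translateFirstTextChild(paragraph, 'Évaluations :');
```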

mattbasta commented 11 years ago

Ah, interesting. I'll play around with that a bit.

On Wed, Feb 20, 2013 at 2:20 PM, Fabien Cazenave wrote:

We have a few similar cases in Gaia, and webL10n has a magical, undocumented feature to handle them:

<p data-l10n-id="ratings">
  Ratings: <span class="stars star-3"></span>
</p>
[en-US]
ratings = Ratings:
[fr]
ratings = Évaluations :

When a target node has element children, webL10n translates the first non-empty text child (i.e. the “Ratings:” string in this case) and leaves the rest of the element alone.

fabi1cazenave commented 11 years ago

BTW, your example illustrates perfectly why I think it’d be interesting to let the template engine handle the localization:

{{ _('Rating:') }} <span class="stars star-{{ star_count }}"></span>

This snippet makes perfect sense in a string-based template (be it server- or client-side), where all {{…}} occurrences are substituted before being parsed to DOM by the browser.

If your template engine transforms this snippet into the HTML fragment I suggested above:

<p data-l10n-id="ratings">
  Ratings: <span class="stars star-3"></span>
</p>

then everything will work. But if webL10n was “ported” to Nunjucks, we could rely on the template engine to return a localized fragment directly with the proper star_count value, which would remove the need for a data-l10n-id attribute here.

You could also think of it like:

[*]
ratingHTML = {{ rating }} <span class="stars star-{{ star_count }}"></span>

[en-US]
rating = Rating:

template.rating = _('ratingHTML', { star_count: 3 });

In this example, ratingHTML would be common to all languages, {{ rating }} will be replaced by the localized string, and {{ star_count }} will be passed by the JS code. Some devs are using webL10n like this, which blurs the separation with a template engine.
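
For illustration only, the substitution in this example could work along these lines. The `resolve` helper is hypothetical, the argument is assumed to be named `star_count` so that it matches the placeholder in the markup, and the order of lookup (arguments first, then other entries) is an assumption rather than documented webL10n behaviour.

```javascript
// Fills {{ name }} placeholders from the call arguments first,
// then from other locale entries; unknown names are left untouched.
function resolve(entries, id, args) {
  return entries[id].replace(/\{\{\s*(\w+)\s*\}\}/g, function (match, name) {
    if (args && name in args) return String(args[name]);
    if (name in entries) return entries[name];
    return match;
  });
}

var ratingEntries = {
  // Shared by all languages ([*] section).
  ratingHTML: '{{ rating }} <span class="stars star-{{ star_count }}"></span>',
  // Language-specific ([en-US] section).
  rating: 'Rating:'
};

var ratingMarkup = resolve(ratingEntries, 'ratingHTML', { star_count: 3 });
```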

I’m not sure this clarifies anything but it might give you ideas. :-) If you can think of any way to help webL10n play nice with existing template engines, I’d be very interested.

cvan commented 11 years ago

Thanks, everyone! Moving discussion to https://bugzilla.mozilla.org/show_bug.cgi?id=847661