tornadoweb / tornado

Tornado is a Python web framework and asynchronous networking library, originally developed at FriendFeed.
http://www.tornadoweb.org/
Apache License 2.0

xgettext cannot extract strings from templates #622

Open igungor opened 11 years ago

igungor commented 11 years ago

Consider this example:

<input type="text" value={{ _("search term") }} />

I can extract the _()-wrapped string with xgettext properly, but when I translate it and use tornado.locale.load_gettext_translations(), I get the translation of "search", not of "search term".

PO file is like below:

#: templates/index.html:1
msgid "search term"
msgstr "terim ara"

But I only see "terim" in the rendered page, not "terim ara". The problem is that the original string is the value of an attribute and isn't wrapped in double quotes, so the string returned after load_gettext_translations() is not quoted and contains whitespace.

The translated text should be quoted by default; otherwise all we get is the first word, up to the first whitespace.
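The truncation described above can be reproduced without Tornado. The following is a minimal sketch using the standard library's html.parser; the AttrCollector class and parse_attributes helper are hypothetical names introduced only for this illustration. It shows how an HTML parser reads an unquoted attribute value that contains a space.

```python
from html.parser import HTMLParser


class AttrCollector(HTMLParser):
    """Hypothetical helper: remembers the attributes of the last start tag."""

    def __init__(self):
        super().__init__()
        self.attrs = {}

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value is None for bare names.
        self.attrs = dict(attrs)


def parse_attributes(markup):
    """Return the attributes of the last start tag found in markup."""
    parser = AttrCollector()
    parser.feed(markup)
    parser.close()
    return parser.attrs


# What the template renders when the quotes are left out of the HTML.
unquoted = parse_attributes('<input type="text" value=terim ara />')
# What it renders when the quotes are written in the HTML.
quoted = parse_attributes('<input type="text" value="terim ara" />')

print(unquoted["value"])
print(quoted["value"])
```

In the unquoted case the value ends at the first whitespace and the second word is read as a separate, valueless attribute, which matches the behavior reported here.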

bdarnell commented 11 years ago

You mean you have <tag attr={{_('string')}}> instead of <tag attr="{{_('string')}}">? You're supposed to put the quotes in the HTML; the template system never provides them for you. (Why does it work for "search term" but not "terim ara", since both have spaces?)

igungor commented 11 years ago

If you put the quotes in the HTML for the "input" tag, then the value of its "value" attribute cannot be extracted. Try this example:

"""
$ cat index.html

$ xgettext -L Python --keyword=_ index.html -d example -o example.pot

$ ls example.pot
ls: a.pot: No such file or directory
"""

If you put quotes around your i18n wrapper, you can't extract it via xgettext. If you don't, you can extract the strings and the "example.pot" template file is created, but Tornado can't get the translated strings properly from the example.mo file, as I wrote in the first comment. Tornado doesn't substitute the original string with the translated string as intended; only the first word of the translation is substituted.

bdarnell commented 11 years ago

What's in index.html? That didn't come through in your example.

igungor commented 11 years ago

Sorry, markdown beat my HTML. Here:

bdarnell commented 11 years ago

OK, so the problem is in how you're invoking xgettext. You're telling xgettext the file is Python, and when it is parsed according to the rules of Python the _() call appears inside a string literal, so xgettext is correctly ignoring it. We need to either teach xgettext about Tornado template syntax, or maybe just run the template compiler as a preprocessor. I'm not sure what sort of extensibility xgettext has, so I'm not sure how to add this.
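The effect of lexing the template line as Python can be illustrated with the standard library's tokenize module. This is only a sketch of the idea, not how xgettext itself is implemented; find_gettext_calls is a hypothetical helper that reports a msgid wherever the token sequence `_`, `(`, string literal appears.

```python
import ast
import io
import tokenize


def find_gettext_calls(line):
    """Return string arguments of _() calls visible to a Python tokenizer."""
    tokens = list(tokenize.generate_tokens(io.StringIO(line).readline))
    found = []
    for first, second, third in zip(tokens, tokens[1:], tokens[2:]):
        is_call = (
            first.type == tokenize.NAME and first.string == "_"
            and second.type == tokenize.OP and second.string == "("
            and third.type == tokenize.STRING
        )
        if is_call:
            # Turn the quoted source text into the actual string value.
            found.append(ast.literal_eval(third.string))
    return found


# Unquoted attribute: the call is plain code as far as the tokenizer can tell.
unquoted_hits = find_gettext_calls('<input type="text" value={{ _("search term") }} />')
# Quoted attribute: value="{{ _(" is lexed as one string literal, hiding the call.
quoted_hits = find_gettext_calls('<input type="text" value="{{ _("search term") }}" />')

print(unquoted_hits)
print(quoted_hits)
```

With the attribute quotes in place, the opening quote of the attribute pairs with the opening quote of the msgid, so `_(` ends up inside a string literal and no call is reported.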

kinsen commented 10 years ago

I have the same problem, too! Did you solve it?

igungor commented 10 years ago

@kinsen sadly no.

st4lk commented 9 years ago

+1, I also have the same problem.

st4lk commented 9 years ago

Here is how I've solved it. First of all, Tornado templates execute Python code, so we can simply do:

{{ u'"{0}"'.format(_('search term')) }}

This wraps the output of _('search term') in double quotes.

That is OK as a one-off solution, but if there are many such strings, we can define a special translate function that wraps the string in quotes:

import tornado.web

class BaseHandler(tornado.web.RequestHandler):
    def get_template_namespace(self):
        ns = super(BaseHandler, self).get_template_namespace()
        # _Q translates like _ and wraps the result in double quotes.
        ns['_Q'] = lambda *x: u'"{0}"'.format(ns['_'](*x))
        return ns

Now in the template just use this _Q:

{{ _Q('search term') }}

And don't forget to add the keywords to the xgettext invocation, so it will find our new function:

xgettext [..old options] --keyword=_Q --keyword=_Q:1,2

fordguo commented 8 years ago

+1, what's the best way?

bdarnell commented 8 years ago

There are a couple of options. The simplest thing to implement would be a preprocessor that generates the python code from a template and writes it to disk, so you can run xgettext on the template compiler's output. However, this might be awkward to use.

Alternately, xgettext could be extended to recognize the Tornado template syntax. It doesn't look like xgettext has a plugin system, though, so this could require changes to xgettext itself. If you're willing to use Babel instead of xgettext, it's easier: Babel is written in Python and has a plugin architecture, so we should be able to provide a plugin to extract strings from templates (by generating the code and passing it to Babel's Python extraction plugin).
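As a rough illustration of the preprocessor idea, the sketch below pulls the `{{ ... }}` expressions out of a template and writes them as plain Python lines that could then be handed to `xgettext -L Python`. It is an assumption-laden stand-in: it uses a regular expression rather than the Tornado template compiler, ignores `{% ... %}` statements, comments, and escapes, and template_to_python is a hypothetical name.

```python
import re

# Matches {{ expression }} blocks on a single line. Assumption: expressions
# do not span lines and do not themselves contain "}}".
EXPRESSION = re.compile(r"\{\{(.*?)\}\}")


def template_to_python(template_text):
    """Return Python-like source holding only the template's expressions."""
    out_lines = []
    for line in template_text.splitlines():
        expressions = [match.strip() for match in EXPRESSION.findall(line)]
        # Emit one output line per input line so reported line numbers
        # still point at the right place in the template.
        out_lines.append("; ".join(expressions))
    return "\n".join(out_lines) + "\n"


template = '<p>hi</p>\n<input type="text" value="{{ _("search term") }}" />'
generated = template_to_python(template)
print(generated)
```

Because the surrounding HTML is dropped, the attribute quotes can stay in the template without hiding the _() call from the extractor.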