Closed yookoala closed 5 years ago
I agree with this approach. Symfony, which is used as the core for Drupal, uses this proposed functionality as well, but adds a second call to manage pluralisation.
public function trans($id, array $parameters = array(), $domain = null, $locale = null)
public function transChoice($id, $number, array $parameters = array(), $domain = null, $locale = null)
Both these methods translate a string ($id) and replace preset associated parameters ($parameters). With transChoice, the format of the translation also allows for pluralisation based on the number ($number) supplied. Full details can be found at https://symfony.com/doc/current/translation.html. Like all Symfony components, these are MIT-licensed and can be used independently of Symfony. Why bog yourself down in framework code when you have such elegant solutions available?
Spent some time studying text extraction with xgettext. It turns out to be very flexible.
Here is the gist I made to demonstrate the flexibility. https://gist.github.com/yookoala/64c6960f2c08dbf527481aae32391312
So it has very good support for extracting any function's parameters as the singular string, plural string and context (domain).
Thanks for putting this together Koala! It's very well thought out and researched.
Gibbon currently has 6324 translatable terms, and lots of __() calls. Our translation function needs to support the following:
__(string $text)
__(string $text, string $domain)
__(string $guid, string $text)
__(string $guid, string $text, string $domain)
Certainly, $guid is on the way out! But it's a big codebase, so until then we can’t make breaking changes. Any changes to __() should:
I do think we have room to change the $domain as the second parameter. I think when it was added it was likely based off of the WordPress equivalent: https://developer.wordpress.org/reference/functions/__/
Right now, it looks like $domain as the second parameter is only used in Free Learning and Data Admin. However, it has been published in the Module Development docs since v13 and should be supported for at least a couple of versions, so it can be transitioned to a different method signature.
Gibbon’s __() function handles gettext translation as well as the string replacement functionality. The Gibbon\Locale class currently handles string replacements by loading them from the database, caching them in the session, and applying them with str_replace. Any changes need to account for this functionality.
I agree, using named parameters will be more useful to translators, and this is certainly the way modern frameworks now handle translation.
And right now, the API does not handle them at all.
Well, there are parameters in many of the translation strings to handle this; however, they’re in the printf format of %s and %1$s. I think we can introduce named parameters, but we will also have to keep supporting these existing strings going forward.
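As a rough sketch of how printf-style strings and named parameters could coexist, the helper below picks the formatting mode from the shape of the argument array. The function name formatTranslated() and the {name} wrapping are assumptions made for illustration only; neither comes from Gibbon.

```php
<?php
// Hypothetical helper for illustration; not part of Gibbon.
function formatTranslated(string $translated, array $args = []): string
{
    if ($args === []) {
        return $translated;
    }
    // A plain list (keys 0..n-1) is treated as printf-style arguments,
    // so existing strings using %s or %1$s keep working.
    if (array_keys($args) === range(0, count($args) - 1)) {
        return vsprintf($translated, $args);
    }
    // Otherwise the keys are names; the {name} wrapping is an assumption.
    $pairs = [];
    foreach ($args as $name => $value) {
        $pairs['{' . $name . '}'] = (string) $value;
    }
    return strtr($translated, $pairs);
}

echo formatTranslated('Posted %1$s minutes ago', [10]), PHP_EOL;
echo formatTranslated('Posted {minutes} minutes ago', ['minutes' => 10]), PHP_EOL;
```

Both calls produce the same sentence, so old and new strings could live side by side during a transition.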
https://github.com/oscarotero/Gettext
I like that the gettext\gettext library is lightweight and straightforward, without additional dependencies or framework-specific interfaces. Given that translation functions are called many times per page, I think keeping the complexity of these methods down is important.
However, I’m not fond of the number of different functions it uses, and their particular implementation of $domain as the first parameter for domain-related methods. Other frameworks (Symfony, Laravel, WordPress, Drupal, Cake, etc.) always keep the translatable string first, which I think is important for readability. If we did look into using this library, we would likely want to keep our own function signature, and just use the class methods of the library.
https://github.com/symfony/translation
It’s a pretty big library that does many different things to solve many different problems. I’ve done some benchmarks to compare the performance and I’m seeing a 200ms increase on page loads using this library. For most pages, this is a 2-3x increase. Digging deeper into how it works, this difference appears to be in how the Symfony Loader classes work, and for .mo files it is reading & parsing the whole file for each page load. It looks like it works this way because it doesn’t have direct gettext support. I see where it’s a very versatile library for many other uses, but I think in this particular case, because Gibbon uses gettext for translation, the built-in gettext functions have better performance.
What I do like is the way these modern frameworks are moving towards a single interface method/function, __(), t() and trans(), vs having many optional methods (gettext\gettext in comparison has eight variations on __() for domain, context and pluralization).
Their function signature uses an $options array as the third parameter, which makes a lot of sense. This handles the added context, locale and domain options without needing either different functions for each (gettext) or additional parameters (Symfony).
__() function to support named parameters, while keeping backwards compatibility. This could be done with is_array checks for the second parameter, to allow us to transition from the original function signature to a new one without breaking everything.
Some thoughts on a possible updated function signature:
__(string $original, array $args = [], array $options = [])
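A minimal sketch of the is_array check described above, assuming a hypothetical normaliseTranslateArgs() helper that is not part of Gibbon. It only separates the new array-based call from the legacy string $domain call; the $guid variants are deliberately left out.

```php
<?php
// Hypothetical normaliser for illustration; not Gibbon's implementation.
function normaliseTranslateArgs(string $text, $second = [], array $options = []): array
{
    if (is_array($second)) {
        // New style: __($original, $args, $options)
        return ['text' => $text, 'args' => $second, 'options' => $options];
    }
    // Legacy style: __($text, $domain). The $guid variants are not handled here.
    return [
        'text'    => $text,
        'args'    => [],
        'options' => ['domain' => (string) $second] + $options,
    ];
}

print_r(normaliseTranslateArgs('Hello {name}', ['name' => 'Alice']));
print_r(normaliseTranslateArgs('Hello', 'Free Learning'));
```

Folding the legacy $domain into the options array means the rest of the translation code would only need to understand one shape of input.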
and a possible new pluralization function:
__n(string $singular, string $plural, $value, array $args = [], array $options = [])
(gettext’s n__ as a function name looks a bit wonky to me, vs __n)
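To make the pluralization idea concrete, here is a hedged sketch of what such a function might do. pluralSketch() is a hypothetical stand-in for the proposed __n(), the int type on $value and the {name} placeholder wrapping are assumptions, and the $options parameter is omitted for brevity. It relies on PHP's built-in ngettext() when the gettext extension is loaded.

```php
<?php
// Hypothetical sketch; pluralSketch() stands in for the proposed __n().
function pluralSketch(string $singular, string $plural, int $value, array $args = []): string
{
    // ngettext() is PHP's built-in gettext plural lookup. When the gettext
    // extension is missing, fall back to a simple English-style rule.
    $text = function_exists('ngettext')
        ? ngettext($singular, $plural, $value)
        : ($value === 1 ? $singular : $plural);

    // Replace named placeholders; the {name} wrapping is an assumption.
    $pairs = [];
    foreach ($args as $name => $val) {
        $pairs['{' . $name . '}'] = (string) $val;
    }
    return strtr($text, $pairs);
}

echo pluralSketch('{count} minute ago', '{count} minutes ago', 10, ['count' => 10]), PHP_EOL;
```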
What do you think?
It's the end of the v17 cycle. I agree that we shouldn't make any drastic changes to core right now, unless they are 100% backward compatible. And the new __() and __n() approach seems reasonable.
Question: Do we need a specific format for the named replacement keys?
For example,
PDO placeholder style (basically replace as-is)
__('Hello :name (:link)', [
':name' => htmlspecialchars('Alice'),
':link' => '<a href="https://twitter.com/alice">link</a>',
])
or
Quoted template variable style,
__('Hello ${name} (${link})', [
'name' => htmlspecialchars('Alice'),
'link' => '<a href="https://twitter.com/alice">link</a>',
])
or
Drupal style annotations (with extra handling to the variables),
__('Hello @name (!link)', [
'@name' => 'Alice', // will be processed with htmlspecialchars.
'!link' => '<a href="https://twitter.com/alice">link</a>', // will not be processed.
])
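For the Drupal-style option, a rough sketch of the "extra handling" might look like the following. formatDrupalStyle() is a hypothetical name used only for illustration and is not Drupal's actual implementation: keys starting with @ are escaped, everything else is inserted as-is.

```php
<?php
// Hypothetical helper for illustration; not Drupal's or Gibbon's code.
function formatDrupalStyle(string $text, array $args): string
{
    $pairs = [];
    foreach ($args as $key => $value) {
        $key = (string) $key;
        // '@' placeholders are escaped; '!' placeholders are inserted raw.
        $pairs[$key] = substr($key, 0, 1) === '@'
            ? htmlspecialchars((string) $value, ENT_QUOTES, 'UTF-8')
            : (string) $value;
    }
    return strtr($text, $pairs);
}

echo formatDrupalStyle('Hello @name (!link)', [
    '@name' => '<b>Alice</b>',
    '!link' => '<a href="https://twitter.com/alice">link</a>',
]), PHP_EOL;
```

The trade-off is that the placeholder prefix carries meaning, so translators and developers both need to know which prefix is safe for untrusted input.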
Good question. I'm in favor of using something that wraps the placeholder, like { }s. It helps with strtr in cases where the string might show up as part of a series of characters, and I think it's easier to read. That's just my 2 cents. What do you think? Also @rossdotparker, any preference?
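A small illustration of the strtr point, using made-up strings: a bare prefix placeholder can match inside a longer run of characters, while a wrapped placeholder only matches when the closing brace is present.

```php
<?php
// Bare prefix style: ':name' also matches the start of ':namespace'.
$bare = strtr('Use :namespace for :name', [':name' => 'Alice']);

// Wrapped style: '{name}' cannot match inside '{namespace}'.
$wrapped = strtr('Use {namespace} for {name}', ['{name}' => 'Alice']);

echo $bare, PHP_EOL;    // Use Alicespace for Alice
echo $wrapped, PHP_EOL; // Use {namespace} for Alice
```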
I am happy to leave it up to you two in this case. Thanks for all the work you are doing here.
To be honest, I'm more used to the Drupal style. I love it for providing the extra string sanitization that we'd usually apply ourselves (i.e. htmlspecialchars).
I'd like to know what @crayner thinks on this subject :-)
I don't use any particular style for placeholders, but use a variety of formats. As it is a simple string replacement, formats like %xxxxx%, {xxxxx} or combinations such as %{xxxxxx} are all valid.
I do use ONE defined placeholder: 'transChoice' (a camel-case word whose value is always an integer), which acts as both a placeholder and a pluralisation tool. The code I use is found in the Symfony translation library, and allows multiple definitions to be handled on a single line. If you use the Symfony code, it expects a placeholder %count% to hold this number, e.g. "{0} There is no product|{1} There is one product|]1,19] There are %count% products|[20,Inf[ There are many products". So, just some ideas on taking this to a fuller conclusion. Need to go, but will find some code for you later...
Let's work with "Hello {placeholder}" style now :-)
I've merged in #709 🎉 Thanks!!
This is a comment on the general design of the translation API, __() and Gibbon\Locale::translate(). The changes needed will have to be long-term ones, so I'm just raising this issue here for discussion.
Current Situation
Translations are done through the __() translation function in functions.php. It is undergoing a transition to remove $guid. Before, it expected input like this:
After the transition, it'll expect something like this:
The underlying translation is done by the global $gibbon->locale object, which is an instance of Gibbon\Locale. The actual translation method looks like this:
Problems of the Approach
In the real world, strings are not always translated as is. There will be parameters within a string for translation. And right now, the API does not handle them at all. Programmers would deal with it like this:
or
There are 2 problems:
Regarding problem (2) in particular, we can take another look at the 2nd string above. "Posted 10 minutes ago", when translated to Chinese, would most reasonably be "在 10 分鐘之前發出". Please note that "在 ... 之前" means "ago" and "發出" means "posted". Also, there is no space between any of the words. So if you read carefully, the word order is totally different from English. There is no way the above code is going to translate to reasonable Chinese.
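To illustrate why a whole sentence with a placeholder fixes the word-order problem, here is a minimal sketch. The in-memory $catalog array, the translateSketch() function and the {minutes} placeholder style are all assumptions for illustration; a real setup would look the string up through gettext.

```php
<?php
// Hypothetical in-memory catalog standing in for a gettext lookup.
$catalog = [
    'Posted {minutes} minutes ago' => '在 {minutes} 分鐘之前發出',
];

// Hypothetical helper: translate the whole sentence first, then fill in values.
function translateSketch(array $catalog, string $text, array $args = []): string
{
    $translated = $catalog[$text] ?? $text;
    $pairs = [];
    foreach ($args as $name => $value) {
        $pairs['{' . $name . '}'] = (string) $value;
    }
    return strtr($translated, $pairs);
}

echo translateSketch($catalog, 'Posted {minutes} minutes ago', ['minutes' => 10]), PHP_EOL;
```

Because the translator sees the full sentence, the placeholder can be moved to wherever the target language needs it.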
Existing Solution to the Problems
Drupal has a nice translation function t() that has a function signature like this:
Drupal's t() function
It supports both named parameters and context (i.e. domain):
gettext/gettext on Packagist
Even better, "gettext/gettext" on Packagist also supports named parameters:
Proposal
Either to:
__() and Gibbon\Locale::translate() to properly support the function.