Open janishorsts opened 8 months ago
I see two ways forward.
Both assume that we will have an executor (e.g. a Formatter) that can format a message with the given variables (mf2 string -> formatted string). This is not implemented yet.
Without it, I see no point in using mf2.
Continue with the existing approach. All variables are extracted from the source text and put into local declarations.
Javascript example:
Hello {{ name }}! // source text
.local $name = { |{{ name }}| }\n {{Hello { $name }!}} // mf2 text
I prefer this way because:
Example of 1:
// msg = .local $name = { |{{ name }}| }\n {{Hello { $name }!}}
// Back to source formatting (use local declaration values)
msg.Format() // Hello {{ name }}!
// Formatting with given variables (override local declaration with new values)
msg.Format("name", "John") // Hello John!
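The behaviour in example 1 can be sketched in a few lines of JavaScript. This is only an illustration under stated assumptions: `formatWithLocals` is a hypothetical helper, not part of any MF2 library, and it only understands `.local $x = { |literal| }` declarations followed by a single quoted pattern.

```javascript
// Hypothetical helper, NOT a real MF2 formatter: just enough parsing
// to show local declarations acting as default values.
function formatWithLocals(mf2, vars = {}) {
  const defaults = {};
  // Collect `.local $name = { |literal| }` declarations.
  const localRe = /\.local\s+\$(\w+)\s*=\s*\{\s*\|([^|]*)\|\s*\}/g;
  let match;
  let end = 0;
  while ((match = localRe.exec(mf2)) !== null) {
    defaults[match[1]] = match[2];
    end = localRe.lastIndex;
  }
  // Whatever follows the declarations is the quoted pattern {{...}}.
  const pattern = mf2.slice(end).trim().replace(/^\{\{/, '').replace(/\}\}$/, '');
  // Each `{ $name }` becomes the supplied value, else the stored original text.
  return pattern.replace(/\{\s*\$(\w+)\s*\}/g, (whole, name) => {
    if (name in vars) return String(vars[name]);
    return name in defaults ? defaults[name] : whole;
  });
}

const msg = '.local $name = { |{{ name }}| }\n{{Hello { $name }!}}';
console.log(formatWithLocals(msg));                   // Hello {{ name }}!
console.log(formatWithLocals(msg, { name: 'John' })); // Hello John!
```

A real implementation would of course go through a proper MF2 parser; the point is only that missing inputs fall back to the declared original.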
Example of 2:
// input po
/*
#, python-format
msgid "Hello %(name)s!"
msgstr ""
*/
// The flag is important because it tells us that this message contains placeholders.
// But our model.Message has nowhere to store it, so crucial information is lost.
// If we store it as a local declaration, we do not lose it.
// mf2
/*
.local $format = { python-format }
.local $name = { |%(name)s| }
{{Hello { $name }!}}
*/
// And now we can safely return it back to po format
// output po
/*
#, python-format
msgid "Hello %(name)s!"
msgstr ""
*/
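The PO round trip in example 2 could look roughly like the sketch below. `poToMf2` and `mf2ToPo` are hypothetical names, the regular expression covers only a small subset of Python's named `%(name)s` conversions, and storing the flag in `.local $format` is taken from the example above rather than from any MF2 convention.

```javascript
// Hypothetical converters, limited to named placeholders like %(name)s.
function poToMf2(msgid, flag) {
  const locals = [`.local $format = { ${flag} }`];
  const seen = new Set();
  const pattern = msgid.replace(/%\((\w+)\)[sdif]/g, (original, name) => {
    if (!seen.has(name)) {
      seen.add(name);
      // Keep the exact source placeholder as a literal.
      locals.push(`.local $${name} = { |${original}| }`);
    }
    return `{ $${name} }`;
  });
  return [...locals, `{{${pattern}}}`].join('\n');
}

function mf2ToPo(mf2) {
  const originals = {};
  let flag = '';
  let pattern = '';
  for (const line of mf2.split('\n')) {
    const literal = line.match(/^\.local \$(\w+) = \{ \|(.*)\| \}$/);
    const bare = line.match(/^\.local \$format = \{ (\S+) \}$/);
    if (literal) originals[literal[1]] = literal[2];
    else if (bare) flag = bare[1];
    else pattern = line.replace(/^\{\{/, '').replace(/\}\}$/, '');
  }
  // Put the stored originals back in place of the MF2 placeholders.
  const msgid = pattern.replace(/\{ \$(\w+) \}/g, (whole, name) => originals[name] ?? whole);
  return { flag, msgid };
}

const mf2 = poToMf2('Hello %(name)s!', 'python-format');
console.log(mf2);
console.log(mf2ToPo(mf2)); // { flag: 'python-format', msgid: 'Hello %(name)s!' }
```

Note that a source placeholder that happens to be called `format` would clash with the `$format` declaration in this sketch.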
Still, I think this abuses the MF2 syntax and makes it do things it was not designed for...
Revert to storing variables in function options.
Javascript example:
Hello {{ name }}! // source text
Hello { $name :fmt format=|{{ }}| }! // mf2 text
// or
Hello { :fmt format=|{{ name }}| }! // mf2 text
// or any other similar option
Why I do not like this way:
More details about 1. Formatting means converting an mf2 string to a plain translated string, as demonstrated in example 1 of the previous approach. But in our case we need to format it back to the same string it was when extracted, i.e. with the same placeholders, if any. That is not an MF2 task!
That means:
// msg = Hello { $name :fmt format=|{{ }}| }!
// Theoretically this should error, because the variable name was not provided.
msg.Format()
// Here "John" would be formatted with the function fmt and the option format=|{{ }}|
// before being added to the resulting string, which is not what we want.
msg.Format("name", "John")
At this point I am not sure whether we are using MF2 correctly, but I am sure that we have dug ourselves into a hole and need to think about it before continuing.
WIP.
Named formatting
Using a custom function fmt to store details of the value from the original source code.
const Intl = require('messageformat');
const { string } = require('messageformat/functions');
const locale = 'en-US';
const msg = `.input { $name :fmt lang=python variant=|%s| }
.match { $count :integer }
1 {{Hello, { $name }! You got 1 apple }}
* {{Hello, { $name }! You got { $count } apples }}`;
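const mf = new Intl.MessageFormat(msg, locale, {
  functions: {
    // Custom function: logs the options that carry the original placeholder
    // details, then formats the input as a plain string.
    fmt: (context, options, input) => {
      console.log(options);
      return string(context, options, input);
    },
  },
});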
// [Object: null prototype] { lang: 'python', variant: '%s' }
// Hello, Dave! You got 1 apple
console.log(mf.format({name:'Dave', count: 1}))
// [Object: null prototype] { lang: 'python', variant: '%s' }
// Hello, Lisa! You got 2 apples
console.log(mf.format({name:'Lisa', count: 2}))
We use count for plurals in Message Format 2. However, there is a risk that count is also used as a named format variable in the original source code.
Add a namespace and rename it to n, e.g. ota:n. We should namespace all variables that belong to the translate agent.
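A minimal sketch of the namespacing idea, assuming the `ota` prefix from the example above and a hypothetical `buildInput` helper. Whether `ota:n` is acceptable as an MF2 variable name is not checked here; the sketch only shows that a source variable called `count` no longer clashes with the agent's plural variable.

```javascript
// Assumption: "ota" is the namespace reserved for translate agent variables.
const AGENT_NAMESPACE = 'ota';

// Hypothetical helper: merge variables taken from the source text with
// agent-owned variables, prefixing the latter so the two cannot collide.
function buildInput(sourceVars, agentVars) {
  const input = { ...sourceVars };
  for (const [name, value] of Object.entries(agentVars)) {
    input[`${AGENT_NAMESPACE}:${name}`] = value;
  }
  return input;
}

// The source text has its own `count` placeholder; the agent's plural
// selector lives under `ota:n`.
const input = buildInput({ count: '%(count)d' }, { n: 2 });
console.log(input); // { count: '%(count)d', 'ota:n': 2 }
```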
Example from https://docs.oasis-open.org/xliff/v1.2/xliff-profile-html/xliff-profile-html-1.2-cd02.html#General_EntityReferences
<h1>Online Help for &ProductName;</h1>
<source>Online Help for <ph id='1'>&ProductName;</ph>.</source>
Online Help for { $ProductName :fmt style=|& ;| }
-- OR --
Online Help for { $ProductName :fmt prefix=|&| suffix=|;| }
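To compare the two option shapes, here is a small sketch of how the original placeholder could be rebuilt from them. `restorePlaceholder` is a hypothetical helper, and reading `style` as "prefix, one space, suffix" is my assumption about what that option means.

```javascript
// Hypothetical helper: rebuild the source placeholder from `fmt` options.
function restorePlaceholder(name, options) {
  if (options.prefix !== undefined || options.suffix !== undefined) {
    return `${options.prefix ?? ''}${name}${options.suffix ?? ''}`;
  }
  // Assumed meaning of style=|<prefix> <suffix>|: the space marks the name.
  const [prefix = '', suffix = ''] = (options.style ?? ' ').split(' ');
  return `${prefix}${name}${suffix}`;
}

console.log(restorePlaceholder('ProductName', { prefix: '&', suffix: ';' })); // &ProductName;
console.log(restorePlaceholder('ProductName', { style: '& ;' }));             // &ProductName;
// Only the prefix/suffix form can keep inner whitespace:
console.log(restorePlaceholder('name', { prefix: '{{ ', suffix: ' }}' }));    // {{ name }}
```

The separate prefix and suffix options look more robust, because the single-string form cannot express placeholders that contain spaces themselves.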
Nested HTML code
<p title='Information about Mount Hood'>This is Mount Hood: <img src="mthood.jpg" alt="Mount Hood with its snow-covered top"></p>
<ph id="a_2">
<sub ctype="x-html-p-title">Information about Mount Hood</sub>
</ph>This is Mount Hood:<ph id="a_3" ctype="x-html-img" xhtml:src="mthood.jpg">
<sub ctype="x-html-img-alt">Mount Hood with its snow-covered top
</sub>
</ph>
I would say we do NOT support this initially. These days, translation libraries work differently. The above example is SO FRAGILE: any change to the HTML requires a corresponding fix in all translations.
For example, in an Angular app this would produce three separate texts to translate:
<!-- older angular -->
<p [title]="'Information about Mount Hood' | translate" translate>This is Mount Hood: <img src="mthood.jpg" [alt]="'Mount Hood with its snow-covered top' | translate"></p>
<!-- latest angular -->
<p title="Information about Mount Hood" i18n-title i18n>This is Mount Hood: <img src="mthood.jpg" alt="Mount Hood with its snow-covered top" i18n-alt></p>
The latest Angular uses ICU a lot now.
Depending on the presence or absence of a variable or literal operand and a function, private-use annotation, or reserved annotation, the resolved value of the expression is determined as follows: If the expression contains a reserved annotation, an Unsupported Expression error is emitted and a fallback value is used as the resolved value of the expression. Else, if the expression contains a private-use annotation, its resolved value is defined according to the implementation's specification.
That means we could leverage the private-use annotation to store the original value of the variable. Example of a Formatter implementation:
mf2 := NewMF2(`
.input { $placeholder ^original=%s }
.input { $name ^original=\{\{ name \}\} }
{{Hello { $placeholder } { $name }}}
`)
// Treat keys missing from the input map as a signal to use the original value
mf2.Format() // Result: Hello %s {{ name }}
// Use input map to resolve one expression
mf2.Format(map[string]string{"name": "John"}) // Result: Hello %s John
// Use input map to resolve both expressions
mf2.Format(map[string]string{"placeholder": "World", "name": "John"}) // Result: Hello World John
Example when converting from xliff:
// source: <source>Online Help for <ph id='1'>&ProductName;</ph>.</source>
// mf2
.input { $ph1 ^original=&ProductName; }
{{Online Help for { $ph1 }.}}
It should be equivalent to your proposed solution, but (IMHO) cleaner + follows the formatting guide of mf2 (hopefully).
<!-- latest angular -->
<span i18n>The author is {gender, select, male {male} female {female} other {other}}</span>
It produces the following XLIFF 1.2:
<trans-unit id="3560311772637911677" datatype="html">
<source>The author is <x id="ICU" equiv-text="{gender, select, male {male} female {female} other {other}}" xid="7670372064920373295"/></source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/app.component.html</context>
<context context-type="linenumber">339,341</context>
</context-group>
</trans-unit>
<trans-unit id="7670372064920373295" datatype="html">
<source>{VAR_SELECT, select, male {male} female {female} other {other}}</source>
<context-group purpose="location">
<context context-type="sourcefile">src/app/app.component.html</context>
<context context-type="linenumber">339,340</context>
</context-group>
</trans-unit>
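As a starting point for discussion, the second trans-unit above could be turned into an MF2 matcher roughly as follows. This is a sketch under several assumptions: `icuSelectToMf2` is a hypothetical name, it handles only a flat `select` without nesting, it maps ICU `other` to the MF2 catch-all `*`, and it uses the same `.match { $var :func }` form as the earlier example in this thread. Note that the variable is Angular's `VAR_SELECT` placeholder; the real name (`gender`) only appears in the `equiv-text` of the first trans-unit.

```javascript
// Hypothetical converter for a flat ICU select (no nesting, no plural).
function icuSelectToMf2(icu) {
  const head = icu.match(/^\{\s*(\w+)\s*,\s*select\s*,\s*(.*)\}$/s);
  if (!head) throw new Error('not a flat ICU select');
  const [, variable, body] = head;
  const lines = [`.match { $${variable} :string }`];
  // Each case looks like `key {text}`; nested braces are not supported.
  for (const [, key, text] of body.matchAll(/(\w+)\s*\{([^{}]*)\}/g)) {
    lines.push(`${key === 'other' ? '*' : key} {{${text}}}`);
  }
  return lines.join('\n');
}

console.log(icuSelectToMf2('{VAR_SELECT, select, male {male} female {female} other {other}}'));
// .match { $VAR_SELECT :string }
// male {{male}}
// female {{female}}
// * {{other}}
```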
The MF2 transformation challenges:
This ticket is for discussion on how to handle placeholders and convert them to and from Message Format 2.
The primary goal is: