Open bobbingwide opened 3 years ago
Proposal for extracting rich text.
- for each outer rich text tag found
  - if it has inner tags
    - extract text using the rich text route
  - else
    - extract translatable attributes and inner tags recursively (current solution)
rich text route - extract
- copy tag and inner tags to new DOMdocument
- save as HTML
- strip outer tag ( and attributes )
- add as the string to be translated
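The extract route above could be sketched roughly as follows. This is only an illustration, not the code in the repository: the function name `rich_text_inner_html()` and the sample markup are hypothetical. Instead of saving the whole element and then stripping the outer tag, the sketch saves each child node in turn, which leaves the outer tag and its attributes out of the result.

```php
<?php
/**
 * Hypothetical helper: return the inner HTML of a rich text element
 * as one string to be offered for translation.
 */
function rich_text_inner_html( DOMElement $element ): string {
	// Copy the tag and its inner tags to a new DOMDocument.
	$doc  = new DOMDocument();
	$copy = $doc->importNode( $element, true );
	$doc->appendChild( $copy );

	// Save as HTML, one child at a time, so that the outer tag
	// and its attributes never appear in the extracted string.
	$html = '';
	foreach ( $copy->childNodes as $child ) {
		$html .= $doc->saveHTML( $child );
	}
	return $html;
}

// Sample markup, assumed for illustration only.
$source = new DOMDocument();
$source->loadHTML( '<p class="intro">Hello <strong>big</strong> world</p>' );
$paragraph = $source->getElementsByTagName( 'p' )->item( 0 );

echo rich_text_inner_html( $paragraph ), PHP_EOL;
```

The translator would then see the whole sentence, including the inner `strong` tag, as a single string.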
Proposal for localization
- for each outer rich text tag
  - if it has inner tags
    - apply translations using the rich text route
  - else
    - apply as per current solution
rich text route - localize
- convert translation to DOMdocument
- replace existing inner nodes with translated content
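A minimal sketch of the localize route, under the same caveats: `replace_inner_html()` is a hypothetical name, and wrapping the translation in a `div` to give the parser a single known root is an assumption of this example rather than part of the proposal.

```php
<?php
/**
 * Hypothetical helper: replace the inner nodes of a rich text element
 * with the nodes parsed from a translated HTML string.
 */
function replace_inner_html( DOMElement $element, string $translation ): void {
	// Convert the translation to a DOMDocument. The XML encoding hint
	// asks libxml to read the string as UTF-8.
	$fragment_doc = new DOMDocument();
	$fragment_doc->loadHTML(
		'<?xml encoding="UTF-8"><div>' . $translation . '</div>',
		LIBXML_NOERROR | LIBXML_NOWARNING
	);
	$wrapper = $fragment_doc->getElementsByTagName( 'div' )->item( 0 );

	// Remove the existing inner nodes.
	while ( $element->firstChild ) {
		$element->removeChild( $element->firstChild );
	}

	// Import copies of the translated nodes into the target document.
	foreach ( $wrapper->childNodes as $child ) {
		$element->appendChild( $element->ownerDocument->importNode( $child, true ) );
	}
}

// Sample markup and translation, assumed for illustration only.
$doc = new DOMDocument();
$doc->loadHTML( '<p class="intro">Hello <strong>big</strong> world</p>' );
$p = $doc->getElementsByTagName( 'p' )->item( 0 );

replace_inner_html( $p, 'Hallo <strong>grosse</strong> Welt' );
echo $doc->saveHTML( $p ), PHP_EOL;
```

The outer tag and its attributes stay untouched, so only the content the translator saw is replaced.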
Q. Should we use the rich text route for each translation?
A couple of things to fix:
- Add strong to the list of acceptable rich text tags.
- Remove carriage returns (\r) and line feeds (\n) from strings to be translated.
The last problem was resolved by changing the for loop. From
foreach ( $node->childNodes as $child_node ) {
	...
	$this->extract_strings( $child_node );
}
To
for ( $currentNode = 0; $currentNode < $node->childNodes->length; $currentNode++ ) {
	...
	$this->extract_strings( $node->childNodes[$currentNode] );
}
It seems that the replaceChild method performed in DOM_string_updater::replace_node() messed up the current position in the foreach loop.
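An alternative to the indexed loop, shown here only as a sketch and not as the fix that was applied, is to take a snapshot of the live child list with `iterator_to_array()` before replacing anything. The sample markup and the upper-casing "translation" are made up for the example.

```php
<?php
// Sample markup, assumed for illustration only.
$doc = new DOMDocument();
$doc->loadHTML( '<p>one<br>two<br>three</p>' );
$paragraph = $doc->getElementsByTagName( 'p' )->item( 0 );

// childNodes is a live list. Copying it to a plain array first means
// that replaceChild() cannot disturb the iteration.
$children = iterator_to_array( $paragraph->childNodes );

foreach ( $children as $child ) {
	if ( $child instanceof DOMText ) {
		// Stand-in for a real translation: upper-case the text.
		$replacement = $doc->createTextNode( strtoupper( $child->nodeValue ) );
		$paragraph->replaceChild( $replacement, $child );
	}
}

echo $doc->saveHTML( $paragraph ), PHP_EOL;
```

Every text node is visited exactly once, because the array no longer changes when the document does.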
In issue #7 I demonstrated that it's possible to extract text from the Full Site Editing template and template part .html files. But we noted that this solution suffered from the same problem as for PHP code in that translators didn't get full sentences to translate.
I wrote about the challenges of translating rich text content in Localization of Full Site Editing themes.
Now I want to see if I can implement some of the proposals in that post.
The automatic translations to UK English (en_GB) and bbboing (bb_BB) work because the translation process doesn’t attempt to make any sense of the content to be translated.
In my opinion, before we can finalize any solution for localizing the HTML we’ll have to agree some ground rules for internationalizing and extraction. The main problem areas that I have considered are:
Items 4., 5., and 6. can be supported by providing special tools in the block editor.
Multi Lingual Support is Phase 4 of Gutenberg, so we can't realistically expect Gutenberg to provide an environment that can be used by translators in the short term. The best we can do is to improve the extraction, translation and localization processes, giving translators the opportunity to alter markup when it makes sense to do so.
Note: Google’s automatic web page translator handles inner tags. It may not produce the best translation, but it certainly is easy to use. If we extract the translatable text in sensibly sized chunks we could easily make use of Google's translation service to give the human translators a head start.
Requirements
translate="no"
.Optionally,