Phaqui opened 1 year ago
Maybe svelte-i18n can be used to handle the localisation?
Yes, I think it would be suitable. Though I am using it in the form of svelte-intl-precompile. The API is more or less the same, but with promises of faster performance. Free performance with the same API and functionality? Always saying yes to that!
The main problem for me now is taking the translation files, written as XML, and converting them to JSON for use in the application. The trouble is coming up with a generic conversion script that can handle changes and/or additions to the XML files.
After having had a look at the `cgi-<translation-lang>.xml` files, it seems you could do something like this:
So, by extracting the messages by tool, translation-lang and attribute-lang, you could make e.g. `<attribute-lang>-<this_tool>-<translation-lang>.json` files where the content becomes:
{
"tag1-that-has-this_tool-attribute": "tag1-that-has-this_tool-attribute.text",
...
...
"tagx-that-has-this_tool-attribute": "tagx-that-has-this_tool-attribute.text"
}
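To make that concrete, here is a rough sketch of how such a split could be scripted, assuming the messages have already been extracted into a flat mapping and that the tool is known for each key. The helper name, the entry layout and the language codes below are hypothetical placeholders, not part of the repository:

```python
# Hypothetical sketch of writing one JSON file per tool; the entry layout,
# language codes and helper name are assumptions for illustration only.
import json
from collections import defaultdict
from pathlib import Path

def split_by_tool(messages, attribute_lang, translation_lang):
    """Write <attribute-lang>-<tool>-<translation-lang>.json files from flat entries."""
    per_tool = defaultdict(dict)
    for key, entry in messages.items():
        per_tool[entry["tool"]][key] = entry["text"]

    for tool, strings in per_tool.items():
        path = Path(f"{attribute_lang}-{tool}-{translation_lang}.json")
        path.write_text(json.dumps(strings, indent=2, ensure_ascii=False), encoding="utf-8")

# Placeholder call; real keys, tools and language codes would come from the XML files.
split_by_tool(
    {"tag1-that-has-this_tool-attribute": {"tool": "this_tool", "text": "..."}},
    attribute_lang="eng",
    translation_lang="sme",
)
```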
The extraction can be done every time the appropriate .xml file is committed.
The Python script `xmlparsing/xmltojson.py` takes an XML file as input, and outputs a flat dictionary with keys as dot-separated strings of the tag and all attributes of each element, and the value as the corresponding localization string. There is some custom logic to not "fold down" data which includes HTML tags that should be part of the HTML shown on the page. This includes `<code>`, `<a>`, etc.
That was probably not a great explanation. Running it yourself will hopefully make things a lot clearer (a rough sketch of the idea also follows the commands below):
cd xmlparsing
python xmltojson.py cgi-eng.xml
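For reference, the general idea behind the flattening might look something like the sketch below. This is only a minimal illustration using `xml.etree.ElementTree`; the actual `xmltojson.py` may build its keys and handle the HTML "fold down" logic differently, and the set of HTML-like tags here is an assumption:

```python
# Minimal sketch of the flattening idea; not the actual xmltojson.py.
import json
import sys
import xml.etree.ElementTree as ET

# Tags whose markup should stay inside the localization string
# instead of being flattened into keys of their own (assumed set).
HTML_TAGS = {"code", "a", "em", "strong"}

def flatten(element, prefix=""):
    """Yield (dot-separated key, localization string) pairs for a subtree."""
    parts = [element.tag, *element.attrib.values()]
    key = ".".join(p for p in (prefix, *parts) if p)

    children = list(element)
    if children and not all(child.tag in HTML_TAGS for child in children):
        # Structural children: keep recursing and extend the key.
        for child in children:
            yield from flatten(child, key)
    else:
        # Leaf, or content that only contains HTML-like tags: keep the markup as-is.
        text = (element.text or "") + "".join(
            ET.tostring(child, encoding="unicode") for child in children
        )
        yield key, text.strip()

if __name__ == "__main__":
    tree = ET.parse(sys.argv[1])  # e.g. cgi-eng.xml
    flat = dict(flatten(tree.getroot()))
    print(json.dumps(flat, indent=2, ensure_ascii=False))
```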
Now, the key strings do not special-case the tool attribute, for example, but as long as all keys remain unique this way, that really should not be an issue, unless I'm missing something.
Obviously this is a work in progress, and remaining work includes parsing the XSL files to find out which tools are available for which languages, among other things.
My current understanding of this topic is as follows:
Apache Forrest builds HTML pages from XML and XSLT. All localization data is stored in one set of XML files, and the XSLT "scripts" work like a program that weaves in data from those XML files and outputs HTML. The XSLT files therefore also contain HTML markup describing how the resulting HTML document will look. The understanding so far is that once we move away from Apache Forrest, we will not need these XSLT files anymore, but they will be useful for reference.
That leaves us with something that is a big part of this project: parsing the localization data out of the XML files, so that we don't have to do all of that manually. Some questions still remain: