godotengine / godot

Godot Engine – Multi-platform 2D and 3D game engine
https://godotengine.org
MIT License

Write Python script to generate translation catalog (.pot file) from the XML class reference #37109

Closed akien-mga closed 4 years ago

akien-mga commented 4 years ago

This is the description of a task that would be necessary to properly solve godotengine/godot-docs#3162. It should be relatively easy to do for someone familiar with Python and XML parsing.

To allow translating our class reference (XML files in doc/classes/*.xml but also modules/*/doc_classes/*.xml), we need to generate a gettext translation catalog (.pot file) that includes the description strings from all XML files.

I can then define a new project for it on https://hosted.weblate.org/projects/godot-engine/ ("Godot Class Reference") to collect translations from the community. In a second step, we can look at how to load them in the editor for offline docs, and in the online version of the class reference on localized docs websites.


To do this first step of extracting strings from the XML to generate a .pot file, I suggest using Python and taking inspiration/code from the existing scripts we have:

So basically a mix of the XML parsing and the .pot writing needs to be done.

The output file (probably doc/translations/classes.pot) should include the strings for all classes, with the classes ordered alphabetically and the strings of each class following the same order as in the XML itself.
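As a rough sketch of the file collection and ordering step, something like the following could work. The function name `collect_class_files` is hypothetical, and sorting by file name (case-sensitive) is only one reading of "ordered alphabetically".

```python
import glob
import os


def collect_class_files(godot_root):
    """Return class reference XML paths, relative to godot_root, sorted by file name."""
    patterns = [
        os.path.join(godot_root, "doc", "classes", "*.xml"),
        os.path.join(godot_root, "modules", "*", "doc_classes", "*.xml"),
    ]
    paths = []
    for pattern in patterns:
        paths.extend(glob.glob(pattern))
    # Relative paths are what the "#:" context comments need.
    relative = [os.path.relpath(p, godot_root) for p in paths]
    # Assumption: the file name matches the class name, so sorting on it
    # interleaves core classes and module classes alphabetically.
    return sorted(relative, key=lambda p: (os.path.basename(p), p))
```

Calling `collect_class_files(".")` from the repository root would then give one flat list covering both doc/classes and the module folders.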

For the extracted descriptions, the tab indentation from the XML should be stripped, but space indentation inside [codeblock] tags should be preserved. Double quotes should be escaped (\"), and newlines in the XML should be converted to \n in the msgid. gettext msgids ignore line breaks, so where line breaks are necessary, they need to be explicit (and translators will have to keep them in place too).
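The cleanup rules above could be sketched as follows. `clean_description` is a hypothetical helper name. It also escapes backslashes, which is not mentioned above but is needed because PO strings use C-style escapes.

```python
def clean_description(raw):
    """Turn the raw text of an XML description into an escaped msgid string."""
    # Drop the blank leading/trailing lines left by the XML layout.
    text = raw.strip()
    # Strip only tabs, so space indentation inside [codeblock] survives.
    lines = [line.lstrip("\t") for line in text.split("\n")]
    # Escape backslashes first (an addition to the rules above), then quotes.
    escaped = [line.replace("\\", "\\\\").replace('"', '\\"') for line in lines]
    # Real newlines become the two-character sequence \n.
    return "\\n".join(escaped)
```

The result is a single logical string; wrapping it over several quoted lines is a separate step.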

The msgids should be wrapped at 79 characters (use msgmerge for that, see extract.py).

The context comment should give the source file's location and the relevant line. Multi-line descriptions should be kept in a single msgid to make it easier to translate with the full context (and thus with explicit line breaks \n).
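Getting the relevant line is the less obvious part, since xml.etree.ElementTree does not expose source line numbers. One possible approach, sketched here with the standard library's expat bindings, is to record the parser's current line when the first visible text of an element arrives. The function name and the `TEXT_TAGS` set are assumptions for illustration; the real set of translatable tags depends on the class reference format.

```python
import xml.parsers.expat

# Assumption: tags whose text should be extracted; extend as needed.
TEXT_TAGS = {"brief_description", "description"}


def extract_with_lines(xml_bytes):
    """Return a list of (line, tag, raw_text) for each non-empty text element."""
    parser = xml.parsers.expat.ParserCreate()
    results = []
    state = {"current": None}  # info about the text element being read, if any

    def start(name, attrs):
        if name in TEXT_TAGS:
            state["current"] = {"tag": name, "line": None, "chunks": []}

    def chars(data):
        cur = state["current"]
        if cur is None:
            return
        if cur["line"] is None and data.strip():
            # Line of the first visible character: the chunk's starting line
            # plus any newlines in its leading whitespace.
            leading = data[: len(data) - len(data.lstrip())]
            cur["line"] = parser.CurrentLineNumber + leading.count("\n")
        cur["chunks"].append(data)

    def end(name):
        cur = state["current"]
        if cur is not None and name == cur["tag"]:
            if cur["line"] is not None:  # skip empty descriptions
                results.append((cur["line"], name, "".join(cur["chunks"])))
            state["current"] = None

    parser.StartElementHandler = start
    parser.CharacterDataHandler = chars
    parser.EndElementHandler = end
    parser.Parse(xml_bytes, True)
    return results
```

Each multi-line description comes back as one string, so it can be kept in a single msgid as described above.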

E.g. for Sprite.xml, the relevant section might start this way:

#: doc/classes/Sprite.xml:4
msgid "General-purpose sprite node."
msgstr ""

#: doc/classes/Sprite.xml:7
msgid ""
"A node that displays a 2D texture. The texture displayed can be a region "
"from a larger atlas texture, or a frame from a sprite sheet animation."
msgstr ""

#: doc/classes/Sprite.xml:16
msgid ""
"Returns a [Rect2] representing the Sprite's boundary in local coordinates. "
"Can be used to detect if the Sprite was clicked. Example:\n"
"[codeblock]\n"
"func _input(event):\n"
"    if event is InputEventMouseButton and event.pressed and event."
"button_index == BUTTON_LEFT:\n"
"        if get_rect().has_point(to_local(event.position)):\n"
"            print(\"A click!\")\n"
"[/codeblock]"
msgstr ""
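Writing entries in the form shown above could look roughly like this. Both function names and the minimal header are assumptions, and the sketch expects an already-escaped msgid. It emits each msgid on a single line and leaves the 79-character wrapping to msgmerge, as suggested earlier.

```python
# Assumption: a minimal header declaring the charset; a real catalog would
# likely carry more metadata (project name, license, and so on).
POT_HEADER = (
    'msgid ""\n'
    'msgstr ""\n'
    '"Content-Type: text/plain; charset=UTF-8\\n"\n'
)


def format_entry(path, line, msgid):
    """Format one catalog entry from a source path, line and escaped msgid."""
    return '#: {}:{}\nmsgid "{}"\nmsgstr ""\n'.format(path, line, msgid)


def build_catalog(entries):
    """Join the header and all entries, separated by blank lines."""
    blocks = [POT_HEADER] + [format_entry(*entry) for entry in entries]
    return "\n".join(blocks)
```

For example, `build_catalog([("doc/classes/Sprite.xml", 4, "General-purpose sprite node.")])` yields the header followed by the first entry of the Sprite.xml example.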

I'm tagging it "junior job" as it should be fairly easy to do, but it requires some familiarity with Python. Any volunteer is welcome to ask me for more details :)

akien-mga commented 4 years ago

To allow translating our class reference (XML files in doc/classes/*.xml but also modules/*/doc_classes/*.xml), we need to generate a gettext translation catalog (.pot file) that includes the description strings from all XML files.

BTW, while our current documentation workflow is designed so that modules can include their own docs in their module folder, for the purpose of this translation effort, I want all the classes of official modules collated in the same classes.pot.

Providing a way for custom modules to ship with localized documentation would be nice, but it's out of scope for the initial implementation of localized docs. Even if it could be done relatively easily from the start, it adds complexity in the Weblate workflow as we'd have to multiply the number of .pot files and thus of translation projects.

rsubtil commented 4 years ago

BTW would it be possible to also extract the chunks of code in the docs? Most of them have comments that can also be translated.

akien-mga commented 4 years ago

BTW would it be possible to also extract the chunks of code in the docs? Most of them have comments that can also be translated.

If you mean code blocks in tutorials like in https://docs.godotengine.org/ja/latest/getting_started/step_by_step/signals.html#timer-example, that's not related to this issue, which is about the XML class reference. I'd suggest opening an issue about this on https://github.com/godotengine/godot-docs.

kuruk-mm commented 4 years ago

I'm going to work on this if there is no problem.