Open akien-mga opened 4 years ago
The rules for inline markup recognition are in http://docutils.sourceforge.net/docs/ref/rst/restructuredtext.html#inline-markup-recognition-rules
They do mention:
For languages that don't use whitespace between words (e.g. Japanese or Chinese) it is recommended to set simple-inline-markup to True and eventually escape inline markup characters. The examples breaking rules 6 and 7 above show which constructs may need special attention.
But that's for docutils, I'm not sure we can enable it for Sphinx, nor if it would be wise as we have a lot of code examples that should then be escaped manually?
Also found "Gotchas" here: https://www.sphinx-doc.org/en/2.0/usage/restructuredtext/basics.html#gotchas
Separation of inline markup: As said above, inline markup spans must be separated from the surrounding text by non-word characters, you have to use a backslash-escaped space to get around that. See the reference for the details.
That's probably what Japanese and Chinese translators should then do to avoid having to put extra visible spaces around links.
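To make the backslash-escaped space concrete, here is a minimal Python sketch that inserts `\ ` wherever a reST link touches a word character. The function name `escape_link_boundaries` and the regular expression are illustrative assumptions, not part of Sphinx or docutils, and the pattern only approximates the real recognition rules.

```python
import re

# Hypothetical, simplified pattern: an optional word character, a reST link
# such as `text <url>`_ or `name`__, and an optional word character after it.
LINK = re.compile(r"(?P<before>\w?)(?P<link>`[^`]+`__?)(?P<after>\w?)")


def escape_link_boundaries(text):
    """Insert a backslash-escaped space where a link touches a word character."""

    def repl(match):
        before, link, after = match.group("before", "link", "after")
        left = before + "\\ " if before else ""
        right = "\\ " + after if after else ""
        return left + link + right

    return LINK.sub(repl, text)


# A link already surrounded by spaces is left untouched.
print(escape_link_boundaries("Read the `manual <https://example.com/manual>`_ first."))
# A link glued to Japanese text gets "\ " on both sides.
print(escape_link_boundaries("まず`マニュアル <https://example.com/manual>`_を読む"))
```

Note that inside a `.po` file the backslash itself has to be escaped, so the separator would be written as `\\ ` within the quoted `msgstr`.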
That would be useful in general.
I'm running into dead links in the English content too.
I did a check pass with a limited online tool (brokenlinkcheck), which found these broken links:
https://www.360toolkit.co/convert-cubemap-to-spherical-equirectangular.html
https://github.com/godotengine/godot/blob/master/drivers/gles2/shaders/copy.glsl
https://github.com/GodotNativeTools/GDNative-demos/tree/master/c/SimpleDemo
https://aur.archlinux.org/packages/mingw-w64-gcc/
https://github.com/godotengine/godot/blob/master/core/pool_vector.cpp
https://github.com/godotengine/godot/blob/master/scene/audio/audio_player.cpp
https://github.com/godotengine/godot/blob/master/core/message_queue.cpp
https://godot.eska.me/irc-logs/
https://godot.readthedocs.io/en/latest/tutorials/misc/running_code_in_the_editor.html
https://blog.escapecreative.com/customizing-mailto-links/
https://docs.godotengine.org/en/latest/tutorials/viewports/multiple_resolutions.html
https://docs.godotengine.org/en/latest/getting_started/workflow/assets/importing_images.html
https://docs.godotengine.org/en/latest/classes/class_@c
https://docs.godotengine.org/en/latest/getting_started/workflow/export/feature_tags.html
Some third-party links are obsolete, and some GitHub resources were moved slightly, which left broken links in the docs. The C# class reference link is also dead, for example (the docs generate a link with an unescaped # character).
This is mainly needed for https://github.com/godotengine/godot-docs-l10n, but that repo is mostly used for practical reasons, so the actual discussion should likely happen here.
I've noticed that in our docs translations, it's common to find broken links, e.g.:
![Screenshot_20190725_104050](https://user-images.githubusercontent.com/4701338/61859581-ad8e7f00-aec8-11e9-9ac7-425f6ff04d73.png)
In the above two examples, that's due either to formatting issues (Japanese doesn't separate words with spaces, but reST seems to require a space after the trailing `_` of an external link) or to a translator mistake due to not being familiar with the markup (the French translator simply removed the markup).
I guess Sphinx might be able to warn about some of these, but likely not all cases (e.g. the French example above would not be seen as invalid formatting).
I think the best would be to have a script (likely in Python) that I could use to go over all `.po` files in https://github.com/godotengine/godot-docs-l10n/tree/master/weblate to find such formatting issues or mismatches between source string `msgid` and translation `msgstr`. For mismatches, it could be that gettext already has a feature for that, though I'm not sure it would support reST-specific markup (it does have a feature to warn about mismatch in trailing spaces or newlines for example).
`.po` files are wrapped, so to parse the markup properly the parser should be able to unwrap the lines and consider the whole string. It could then also be used on the `.pot` template to check for badly formatted links in the English source.
Any Python and parser-loving volunteer? :P
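As a starting point, here is a rough, standard-library-only sketch of what such a script could look like. It unwraps `msgid`/`msgstr` strings, compares how many links and roles each side contains, and flags a link end-string glued to a word character. All names (`parse_po`, `check_entry`, `PATTERNS`, `GLUED_RE`) and the regular expressions are assumptions made for illustration. They only approximate reST's recognition rules, and the parser ignores `msgctxt` and plural forms.

```python
import ast
import re

# Simplified, hypothetical patterns; real reST recognition rules are stricter.
PATTERNS = {
    "external link": re.compile(r"`[^`]+`__?"),              # `text <url>`_ or `name`_
    "role": re.compile(r":[a-z][a-z0-9_:+-]*:`[^`]+`"),      # :ref:`label`
}
# A link end-string directly followed by a letter or digit (no separator).
GLUED_RE = re.compile(r"`_{1,2}(?=[^\W_])")


def parse_po(text):
    """Return (msgid, msgstr) pairs with wrapped continuation lines joined."""
    entries = []
    msgid, msgstr = "", ""
    field = None       # field that continuation lines currently belong to
    complete = False   # True once the current entry has seen its msgstr
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("msgid "):
            if complete:
                entries.append((msgid, msgstr))
            # PO strings use C-style escapes, which literal_eval can decode.
            msgid, msgstr, complete = ast.literal_eval(line[6:]), "", False
            field = "msgid"
        elif line.startswith("msgstr "):
            msgstr = ast.literal_eval(line[7:])
            field, complete = "msgstr", True
        elif line.startswith('"'):
            if field == "msgid":
                msgid += ast.literal_eval(line)
            elif field == "msgstr":
                msgstr += ast.literal_eval(line)
        elif line.startswith("msg"):
            field = None  # msgctxt, msgid_plural, msgstr[n]: not handled here
    if complete:
        entries.append((msgid, msgstr))
    return entries


def check_entry(msgid, msgstr):
    """Return a list of human-readable problems for one translated entry."""
    problems = []
    if not msgid or not msgstr:
        return problems  # skip the header and untranslated strings
    for name, pattern in PATTERNS.items():
        n_src = len(pattern.findall(msgid))
        n_trans = len(pattern.findall(msgstr))
        if n_src != n_trans:
            problems.append(f"{name}: {n_src} in msgid, {n_trans} in msgstr")
    if GLUED_RE.search(msgstr):
        problems.append("link end-string directly followed by a word character")
    return problems


def check_po_text(text):
    """Yield (msgid, problems) for every entry that has at least one problem."""
    for msgid, msgstr in parse_po(text):
        problems = check_entry(msgid, msgstr)
        if problems:
            yield msgid, problems
```

To run it over a file, read the `.po` with `encoding="utf-8"` and pass the contents to `check_po_text`. The count comparison would catch a case like the French one (markup removed), and the glued end-string check a case like the Japanese one. Running `check_entry(msgid, msgid)` on a `.pot` template would only exercise the glued-link check on the English source.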