mquinson / po4a

Maintain the translations of your documentation with ease (PO for anything)
http://po4a.org/
GNU General Public License v2.0
Asciidoc: Issues with inline macros #374

Closed oliverrahner closed 2 years ago

oliverrahner commented 2 years ago

I am working with Asciidoc files, and I want to fine-tune the way some macros have to be translated.

Example file:

== List

[cols=",,",options="header",]
|===========================================================================================================================
|Message Type |Information Type |Description
|xref:linktoanother.adoc[Link 1 Description] |Type | Some description
|===========================================================================================================================

This is another link: xref:linktoyetanother.adoc[Link 2 Description]

xref:linktoathird.adoc[Link 3 Description]

Command line: po4a-gettextize -f asciidoc -o tablecells -o macro=xref[1] -m test.adoc -d

Issue 1: The regex recognizing macros is hardcoded to only recognize block macros, even though macrotype is already implemented for both types: https://github.com/mquinson/po4a/blob/master/lib/Locale/Po4a/AsciiDoc.pm#L809 ==> (::) should rather be (::?)

Issue 2: After fixing the above, I do get a translation entry for Link 3, but not for the other two. I guess this has to do with how the lines are split for translation? There might be two approaches:

jnavila commented 2 years ago

For link 1, it will not work because table cell processing is very crude and does not process formatting inside cells. It is really only useful for simple tables.

For link 2, it is true that only block macros are processed. Your change will not do much because the regex checks for the beginning of the line (which is specific to block macros). Also, the line is fully available for translation as a single paragraph in the po file, and po4a does not process inline formatting. What would you expect the po file to look like?
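
To illustrate the point about anchoring, here is a minimal Python sketch. The two patterns are hypothetical, simplified stand-ins and not the actual regex from lib/Locale/Po4a/AsciiDoc.pm (which is Perl); they only assume that the macro must start at the beginning of the line.

```python
import re

# Hypothetical, simplified patterns for illustration only.
# They are NOT the real regex used in lib/Locale/Po4a/AsciiDoc.pm.
BLOCK_ONLY = re.compile(r"^(\w+)(::)(\S*)\[(.*)\]$")   # name::target[attrs]
RELAXED = re.compile(r"^(\w+)(::?)(\S*)\[(.*)\]$")     # name:target[...] or name::target[...]

lines = [
    "This is another link: xref:linktoyetanother.adoc[Link 2 Description]",
    "xref:linktoathird.adoc[Link 3 Description]",
]

for line in lines:
    # The leading ^ means a macro in the middle of a sentence never matches,
    # whether one or two colons are accepted.
    print(bool(BLOCK_ONLY.match(line)), bool(RELAXED.match(line)), "|", line)
```

Under these assumptions, the relaxed pattern picks up the line holding Link 3 because the macro is alone on its line, while the sentence containing Link 2 matches neither pattern.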

oliverrahner commented 2 years ago

I haven't completely understood the magic of po4a on the translate side, so maybe what I want isn't feasible at all. I want to hide as much technical noise as possible from the translators. So, for a paragraph like this:

Asciidoctor supports xref:antora:asciidoc:ui-macros.adoc[three UI element representations] out of the box, which are made from corresponding inline UI macros.

I want the following translation tokens:

Otherwise, the translator would have to know that she has to

Or is this just something that translators are used to dealing with, so that we can ignore it here?

jnavila commented 2 years ago

Your proposal does not make sense from a translator's perspective. Sentences cannot be broken into small parts that have no autonomous semantic meaning. A translator seeing "Asciidoctor supports" would ask "supports what?"; then, for "three UI element representations", is it the subject of the verb or a complement? Some languages require the whole sentence because the order or form of the grammatical functions changes.

More generally, segmentation (cutting the text into translation units) is very hard when the documentation system implements a macro system at the inline level. Macros change the way the text is displayed, and sentence lego that would make sense in the original text can prevent any form of translation (even more so with English, which has little conjugation and gender agreement). In such cases, only an ad hoc transformation (by people who know the macro) can sometimes save the day.
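
The fragmentation problem described above can be made concrete with a small Python sketch. It is not how po4a segments text; the pattern and the `split_on_macro` helper are hypothetical and exist only to show what a translator would receive if the paragraph were cut around the inline xref macro.

```python
import re

# Hypothetical inline-macro pattern, for illustration only (not po4a code).
INLINE_XREF = re.compile(r"xref:(\S+?)\[([^\]]*)\]")

paragraph = (
    "Asciidoctor supports "
    "xref:antora:asciidoc:ui-macros.adoc[three UI element representations] "
    "out of the box, which are made from corresponding inline UI macros."
)

def split_on_macro(text):
    """Cut the text around the first inline xref and return the pieces."""
    m = INLINE_XREF.search(text)
    if m is None:
        return [text]
    before = text[:m.start()].strip()
    label = m.group(2)              # the human-readable link text
    after = text[m.end():].strip()
    return [part for part in (before, label, after) if part]

fragments = split_on_macro(paragraph)
for fragment in fragments:
    print(repr(fragment))
```

Each of the three resulting strings would become a separate translation unit, and none of them is a complete sentence on its own, which is exactly why the whole paragraph is kept as one unit.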

oliverrahner commented 2 years ago

Everything you say makes perfect sense :-/ As you probably noticed, I am quite new to the translation game, but I really want to improve our company's process, where we handle 10 languages (more to come) and lots of different material across a multitude of public-facing systems.

I guess I still have some things to learn about best practices.

Thanks for now!

silopolis commented 2 years ago

Hello friends,

Sorry this is a little OT... But not completely 😛

On Fri, 22 Jul 2022 at 14:27, Oliver Rahner @.***> wrote:

Asciidoctor supports xref:antora:asciidoc:ui-macros.adoc[three UI element representations] out of the box, which are made from corresponding inline UI macros.

Are you in the Asciidoctor and/or Antora team?

I'm asking 'cause we just migrated our LinuxCNC.org documentation to AsciiDoc and po4a, with translations managed on Weblate, and from the beginning I have Antora in sight as a publishing system for the various versions and languages.

Nevertheless, as of my last search, translation support wasn't built into Antora... So I'd be happy to discuss this with someone who has a similar project.

Please forgive the noise guys have a very nice (and cool 🥵) we

TY J

oliverrahner commented 1 year ago

@silopolis No, I am not, but we are using the same toolchain, so I'd also be happy to connect. You can contact me at o.rahner@dke-data.com

mquinson commented 1 year ago

If you have any patch that we could integrate into po4a to make this easier, I'd be glad to have it. In particular, could we maybe extend the documentation somehow?