jason-fox / fox.jason.translate.xliff

DITA-OT plug-in to create, auto-translate and re-merge XLIFF files, generating translated documentation in a targeted foreign language.
https://jason-fox.github.io/dita-ot-plugins/translate.xliff
Apache License 2.0

'glossrefs' are missing from the translation #4

Closed jason-fox closed 3 years ago

jason-fox commented 3 years ago

After translating a longer document, I realize that: 'glossrefs' are missing from the translation.

I saw the file 'no-translate-elements.xsl' which seems to specify elements which are not translated. After the translation the <codeph>-tag is missing

<p> inside <ul> <li> are missing: everything is 1 long line

jason-fox commented 3 years ago

@Tiemichael - I have raised this as a separate issue. Please supply a minimal example topic where the problem occurs.

Tiemichael commented 3 years ago

my main.ditamap includes:

<topichead navtitle="Glossary" chunk="to-content" outputclass="page-break-before"
        product="SCPI_Manual_only">
        <topicgroup outputclass="glossarylist">
            <glossref href="topics/g_MNEM.dita" keys="SCPI.MNEM" print="yes" toc="yes"/>
        </topicgroup>
    </topichead>

with topics/g_MNEM.dita:

<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE glossentry PUBLIC "-//OASIS//DTD DITA Glossary//EN" "glossary.dtd"[
<!-- Begin Document Specific Declarations -->
<?Fm Validation Off?>
<!-- End Document Specific Declarations -->
]>
<glossentry id="SCPI.MNEM">
    <glossterm outputclass="mmpdf:paraDef:glossentry_title">Mnemonics</glossterm>
    <glossdef outputclass="mmpdf:paraDef:glossdef"><term>Mnemonics</term> are keywords by which each
        of the instrument's subsystems is referred to. </glossdef>
</glossentry>

After translation to de: main.ditamap:

   <topichead
              navtitle="Glossary"
              chunk="to-content"
              outputclass="page-break-before"
              product="SCPI_Manual_only"
              class="+ map/topicref mapgroup-d/topichead ">
      <topicgroup outputclass="glossarylist"
                  class="+ map/topicref mapgroup-d/topicgroup ">

         <glossref href="topics/g_MNEM.dita"
                   keys="SCPI.MNEM"
                   print="yes"
                   toc="yes"
                   linking="none"
                   search="no"
                   class="+ map/topicref glossref-d/glossref "/>

      </topicgroup>
   </topichead>

File topics/g_MNEM.dita does not exist.

jason-fox commented 3 years ago

<glossterm> and <glossdef> have been added to the list of translatable elements - see here. I've also added new test cases. Please reinstall from master:

dita uninstall fox.jason.translate.xliff
dita install https://github.com/jason-fox/fox.jason.translate.xliff/archive/master.zip

I saw the file 'no-translate-elements.xsl' which seems to specify elements which are not translated. After the translation the <codeph>-tag is missing

Yes, you can add or remove DITA elements from no-translate-elements.xsl, but by default <codeph> is not translated. I ran the three transforms (create, translate, dita) over this test case, which contains a <codeph>, and the <codeph> was present in the output.

<p> inside <ul> <li> are missing: everything is 1 long line

Test-case added here

   <unit  id="54947" fs:fs="li">
         <originalData>
            <data id="sd4e16">&lt;b&gt;</data>
            <data id="ed4e16">&lt;/b&gt;</data>
         </originalData>
         <segment state="initial">
            <source xml:space="preserve" xml:lang="en">Loves or pursues or desires to obtain <pc id="d4e16" dataRefStart="sd4e16" dataRefEnd="ed4e16" fs:fs="b">pain of itself</pc>, because it is pain, but occasionally circumstances occur in which toil and pain can procure him some great pleasure. </source>
            <target xml:lang="de"/>
         </segment>
      </unit>
      <unit  id="2885" fs:fs="li">
         <originalData>
            <data id="sd4e22">&lt;p&gt;</data>
            <data id="ed4e22">&lt;/p&gt;</data>
            <data id="sd4e24">&lt;p&gt;</data>
            <data id="ed4e24">&lt;/p&gt;</data>
         </originalData>
         <segment state="initial">
            <source xml:space="preserve" xml:lang="en"> <pc id="d4e22" dataRefStart="sd4e22" dataRefEnd="ed4e22" fs:fs="p">To take a trivial example, which of us ever undertakes laborious physical exercise, except to obtain some advantage from it? </pc> <pc id="d4e24" dataRefStart="sd4e24" dataRefEnd="ed4e24" fs:fs="p">But who has any right to find fault with a man who chooses to enjoy a pleasure that has no annoying consequences, or one who avoids a pain that produces no resultant pleasure? </pc> </source>
            <target xml:lang="de"/>
         </segment>
      </unit>

Each root <li> is a translation unit. There is no distinction between an <li> containing an inline <b> (example 1) and an <li> containing a series of block-level <p> elements (example 2).
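To illustrate why inline and block children are handled identically, here is a minimal Python sketch of how a unit's <pc> elements could be resolved back against <originalData>. It is not the plug-in's own code (the plug-in is not written in Python); the function names `rebuild` and `restore_markup` are hypothetical, and the `fs:fs` attributes are left out of the sample so that it parses without a namespace declaration.

```python
import xml.etree.ElementTree as ET

# Trimmed-down copy of example 1 (fs:fs attributes omitted for simplicity).
UNIT = """
<unit id="54947">
  <originalData>
    <data id="sd4e16">&lt;b&gt;</data>
    <data id="ed4e16">&lt;/b&gt;</data>
  </originalData>
  <segment state="initial">
    <source xml:lang="en">Loves or pursues <pc id="d4e16" dataRefStart="sd4e16" dataRefEnd="ed4e16">pain of itself</pc>, because it is pain.</source>
  </segment>
</unit>
"""


def rebuild(node, data):
    """Return the text of node with every <pc> wrapped in its original tags."""
    parts = [node.text or ""]
    for child in node:
        if child.tag == "pc":
            # dataRefStart/dataRefEnd point at the stored opening/closing tags.
            parts.append(data[child.get("dataRefStart")])
            parts.append(rebuild(child, data))
            parts.append(data[child.get("dataRefEnd")])
        else:
            # Other inline elements (e.g. <mrk>) contribute only their text here.
            parts.append(rebuild(child, data))
        parts.append(child.tail or "")
    return "".join(parts)


def restore_markup(unit_xml):
    """Rebuild the original inline markup of a unit's <source>."""
    unit = ET.fromstring(unit_xml)
    data = {d.get("id"): d.text or "" for d in unit.iter("data")}
    return rebuild(unit.find("./segment/source"), data)


print(restore_markup(UNIT))
```

The same lookup works whether the stored tags are `<b>` or `<p>`, which is why a unit cannot tell the two cases apart: both are just <pc> spans inside one <source>.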

Tiemichael commented 3 years ago

Thank you for looking into these issues! I can confirm that 'glossrefs' are working with the new version. But I still have issues with <codeph>, <p> and other elements missing inside <ul> <li>.

Attached are 2 complete DITA reference files showing the issues. before_c_scpi_prog.txt is the English original, after_c_scpi_prog.txt is the translated German version.

before_c_scpi_prog.txt after_c_scpi_prog.txt

(the file extension was changed from .dita to .txt)

Maybe you can find time to take a look. Thanks!

jason-fox commented 3 years ago

I can't tell from the before and after *.dita files where in the translation chain the <codeph> is lost.

Could you provide the translation unit from the translation.xlf file for:

<p>
    The program requires the installation of Python3.6 or higher on a host device (eg
    Windows PC, Linux Workstation) or directly on the eLABin1 (which is not explained
    here in further detail). After extracting the zip-package and installing additional
    python-libraries if needed the program can be started with eg. <codeph>python3
    SCPI_GEN.py</codeph>. 
</p>
  1. After xliff-create
  2. After xliff-translate using the dummy translation
  3. After xliff-translate using the bing translation

Tiemichael commented 3 years ago

As requested: snippet.txt

jason-fox commented 3 years ago

After translate: dummy - dataRefStart and dataRefEnd are all correctly camelCase

<unit id="25340" fs:fs="p">
   <originalData>
      <data id="sd4e64">&lt;codeph&gt;</data>
      <data id="ed4e64">&lt;/codeph&gt;</data>
   </originalData>
   <segment state="translated">
      <source xml:space="preserve" xml:lang="en">
         The program requires the installation of Python3.6 or higher on a
          host device (eg Windows PC, Linux Workstation) or directly on the 
          eLABin1 (which is not explained here in further detail). After 
          extracting the zip-package and installing additional 
          python-libraries if needed the program can be started with eg. <pc 
          id="d4e64" dataRefStart="sd4e64" dataRefEnd="ed4e64" 
          fs:fs="code"><mrk translate="no" type="term" id="md4e64">python3 
          SCPI_GEN.py</mrk></pc>. 
      </source>
      <target xml:lang="de">
         The program requires the installation of Python3.6 or higher on a 
         host device (eg Windows PC, Linux Workstation) or directly on the 
         eLABin1 (which is not explained here in further detail). After 
         extracting the zip-package and installing additional python-libraries
          if needed the program can be started with eg. <pc id="d4e64" 
          dataRefStart="sd4e64" dataRefEnd="ed4e64" fs:fs="code"><mrk 
          translate="no" type="term" id="md4e64">python3 
          SCPI_GEN.py</mrk></pc>.
      </target>
   </segment>
</unit>

After translate: bing - datarefstart and datarefend are all lower-case

<unit id="25340" fs:fs="p">
   <originalData>
      <data id="sd4e64">&lt;codeph&gt;</data>
      <data id="ed4e64">&lt;/codeph&gt;</data>
   </originalData>
   <segment state="translated">
      <source xml:space="preserve" xml:lang="en">
         The program requires the installation of Python3.6 or higher on a 
         host device (eg Windows PC, Linux Workstation) or directly on the 
         eLABin1 (which is not explained here in further detail). After 
         extracting the zip-package and installing additional python-libraries
          if needed the program can be started with eg. <pc id="d4e64" 
          dataRefStart="sd4e64" dataRefEnd="ed4e64" fs:fs="code"><mrk 
          translate="no" type="term" id="md4e64">python3 
          SCPI_GEN.py</mrk></pc>.
      </source>
      <target xml:lang="de">
         Das Programm erfordert die Installation von Python3.6 oder höher auf einem
          Host-Gerät (zB Windows PC, Linux Workstation) oder direkt auf dem
          eLABin1 (was hier nicht näher erläutert wird). Nach dem
          Extrahieren des Zip-Pakets und Installieren zusätzlicher Python-Bibliotheken
           Bei Bedarf kann das Programm mit z. <pc id="d4e64"
           datarefstart="sd4e64" datarefend="ed4e64" fs:fs="code"> <mrk
           translate="no" type ="term" id="md4e64"> python3
           SCPI_GEN.py </mrk> </pc>.
      </target>
   </segment>
</unit>

Tiemichael commented 3 years ago

Error caused by Bing?

jason-fox commented 3 years ago

Error caused by Bing?

Yes, but I will amend the code to compensate for this. I take it that dummy reconstitutes the <codeph> in the *.dita for you, since that is one of the test cases.
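As an illustration of the kind of compensation meant here, the following Python sketch restores the casing of the two attribute names seen in this thread. It is an assumption about the approach, not the plug-in's actual fix; the function name `restore_attribute_case` is hypothetical.

```python
import re

# Attribute names from this thread that came back lower-cased from the service.
CANONICAL = ("dataRefStart", "dataRefEnd")


def restore_attribute_case(text):
    """Rewrite case-mangled attribute names back to their canonical spelling."""
    for name in CANONICAL:
        # Match case-insensitively, but only where the name is used as an
        # attribute, i.e. directly followed by optional whitespace and '='.
        pattern = re.compile(r"\b" + name + r"(?=\s*=)", re.IGNORECASE)
        text = pattern.sub(name, text)
    return text


broken = '<pc id="d4e64" datarefstart="sd4e64" datarefend="ed4e64" fs:fs="code">'
print(restore_attribute_case(broken))
```

The sketch only repairs casing. The extra whitespace that also appears in the Bing output (for example `type ="term"`) is left untouched, since it does not affect how the XML parses.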

Tiemichael commented 3 years ago

Yes, using dummy reconstitutes the <codeph> in the *.dita. Great, thanks!

jason-fox commented 3 years ago

I've added the code change, please pull:

dita uninstall fox.jason.translate.xliff
dita install https://github.com/jason-fox/fox.jason.translate.xliff/archive/master.zip

However I can't test the change directly as:

curl -L -X POST 'https://api.cognitive.microsofttranslator.com/translate?api-version=3.0&to=de&from=en' \
-H 'Ocp-Apim-Subscription-Key: <api-key>' \
-H 'Content-Type: application/json' \
--data-raw '[{"Text":" The program requires the installation of Python3.6 or higher on a host device (eg Windows PC, Linux Workstation) or directly on the eLABin1 (which is not explained here in further detail). After extracting the zip-package and installing additional python-libraries if needed the program can be started with eg. <pc id=\"d4e64\" dataRefStart=\"sd4e64\" dataRefEnd=\"ed4e64\" fs:fs=\"code\"><mrk translate=\"no\" type=\"term\" id=\"md4e64\">python3 SCPI_GEN.py</mrk></pc>."}]'

is already returning the correct casing for me:

[
    {
        "translations": [
            {
                "text": " Das Programm erfordert die Installation von Python3.6 oder höher auf einem Hostgerät (z.B. Windows PC, Linux Workstation) oder direkt auf dem eLABin1 (was hier nicht näher erläutert wird). Nach dem Extrahieren des Zip-Pakets und der Installation zusätzlicher Python-Bibliotheken bei Bedarf kann das Programm z.B. mit dem Programm gestartet werden. <pc id=\"d4e64\" dataRefStart=\"sd4e64\" dataRefEnd=\"ed4e64\" fs:fs=\"code\"><mrk translate=\"no\" type=\"term\" id=\"md4e64\">python3 SCPI_GEN.py</mrk></pc>.",
                "to": "de"
            }
        ]
    }
]

Maybe this has to do with how curl is set up on your machine?
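One way to check a saved response for this problem is to compare the attribute names in the request text with those in the returned translation. The Python sketch below assumes the response shape shown above (a list whose first item holds a `translations` list with a `text` field); the helper names `attribute_names` and `lost_attribute_names` are hypothetical.

```python
import json
import re

# An attribute name followed by '=' and an opening double quote.
ATTR = re.compile(r'([A-Za-z_][\w:.-]*)\s*=\s*"')


def attribute_names(text):
    """Collect the attribute names used in inline markup within text."""
    return set(ATTR.findall(text))


def lost_attribute_names(source_text, response_body):
    """List attribute names in the source that are missing from the translation."""
    translated = json.loads(response_body)[0]["translations"][0]["text"]
    return sorted(attribute_names(source_text) - attribute_names(translated))


source = '<pc id="a" dataRefStart="s" dataRefEnd="e">x</pc>'
mangled = source.replace("dataRefStart", "datarefstart").replace("dataRefEnd", "datarefend")
response = json.dumps([{"translations": [{"text": mangled, "to": "de"}]}])
print(lost_attribute_names(source, response))
```

An empty list means every attribute name survived with its original casing. Any name listed was dropped or re-cased somewhere between the request and the response.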

Tiemichael commented 3 years ago

Thank you! The translation is working fine now. The curl-command above gives me the same answer as you have posted:

 -H 'Ocp-Apim-Subscription-Key: 1234567' \
 -H 'Ocp-Apim-Subscription-Region: southeastasia' \
 -H 'Content-Type: application/json' \
 --data-raw '[{"Text":" The program requires the installation of Python3.6 or higher on a host device (eg Windows PC, Linux Workstation) or directly on the eLABin1 (which is not explained here in further detail). After extracting the zip-package and installing additional python-libraries if needed the program can be started with eg. <pc id=\"d4e64\" dataRefStart=\"sd4e64\" dataRefEnd=\"ed4e64\" fs:fs=\"code\"><mrk translate=\"no\" type=\"term\" id=\"md4e64\">python3 SCPI_GEN.py</mrk></pc>."}]'

[{"translations":[{"text":" Das Programm erfordert die Installation von Python3.6 oder höher auf einem Hostgerät (z.B. Windows PC, Linux Workstation) oder direkt auf dem eLABin1 (was hier nicht näher erläutert wird). Nach dem Extrahieren des Zip-Pakets und der Installation zusätzlicher Python-Bibliotheken bei Bedarf kann das Programm z.B. mit dem Programm gestartet werden. <pc id=\"d4e64\" dataRefStart=\"sd4e64\" dataRefEnd=\"ed4e64\" fs:fs=\"code\"><mrk translate=\"no\" type=\"term\" id=\"md4e64\">python3 SCPI_GEN.py</mrk></pc>.","to":"de"}]}]

Strange ....