oxygenxml / dita_1_3_to_2_x_converter

Convert DITA 1.3 maps and topics to the DITA 2.x standard
Mozilla Public License 2.0

Conversion inserts empty lines #1

Closed infotexture closed 2 years ago

infotexture commented 2 years ago

When running the converter on the DITA-OT documentation source files, I noticed that the conversion leaves behind empty lines in places where deprecated markup was removed, and inserts extra empty lines along with new DITA 2.0 markup.

Expected result

 <?xml version="1.0" encoding="UTF-8"?>
-<!DOCTYPE concept PUBLIC "-//OASIS//DTD DITA Concept//EN" "concept.dtd">
+<!DOCTYPE concept PUBLIC "-//OASIS//DTD DITA 2.x Concept//EN" "concept.dtd">
 <!--  This file is part of the DITA Open Toolkit project. See the accompanying LICENSE file for applicable license.  -->
 <concept id="ID">
   <title>Sample JSON project files</title>
-  <titlealts>
-    <navtitle>JSON project files</navtitle>
-  </titlealts>
   <shortdesc>DITA-OT includes sample project files in
     <xref keyref="json"/> format that can be used to define a publication project. Like the XML project samples, the
     sample JSON files illustrate how deliverables can be described for use in publication projects. The JSON samples are
     functionally equivalent to their XML and YAML counterparts, with minor adaptations to JSON file syntax.</shortdesc>
   <prolog>
+    <navtitle>JSON project files</navtitle>
     <metadata>
       <keywords>
         <indexterm>JSON project files</indexterm>

Observed result

 <?xml version="1.0" encoding="UTF-8"?>
-<!DOCTYPE concept PUBLIC "-//OASIS//DTD DITA Concept//EN" "concept.dtd">
+<!DOCTYPE concept PUBLIC "-//OASIS//DTD DITA 2.x Concept//EN" "concept.dtd">
 <!--  This file is part of the DITA Open Toolkit project. See the accompanying LICENSE file for applicable license.  -->
 <concept id="ID">
   <title>Sample JSON project files</title>
-  <titlealts>
-    <navtitle>JSON project files</navtitle>
-  </titlealts>
+  
   <shortdesc>DITA-OT includes sample project files in
     <xref keyref="json"/> format that can be used to define a publication project. Like the XML project samples, the
     sample JSON files illustrate how deliverables can be described for use in publication projects. The JSON samples are
     functionally equivalent to their XML and YAML counterparts, with minor adaptations to JSON file syntax.</shortdesc>
   <prolog>
+    <navtitle>JSON project files</navtitle>
+  
     <metadata>
       <keywords>
         <indexterm>JSON project files</indexterm>

raducoravu commented 2 years ago

The refactoring stylesheet does not set indent="yes" on xsl:output. I'm reluctant to let the stylesheet indent the output, because it has no schema information, and in cases like this:

 <p><b>prefix</b><b>suffix</b></p>

it will probably consider "p" to be element-only and add new lines and indentation between the inline elements.
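
The risk can be illustrated outside XSLT. The following is a minimal Python sketch that uses the standard library's `xml.dom.minidom` pretty-printer as a stand-in for a schema-unaware indenter; it is not the converter's stylesheet or an XSLT serializer, and the helper names `text_of` and `indent_without_schema` are made up for this illustration.

```python
from xml.dom import minidom

# The inline-only paragraph from the comment above.
SOURCE = "<p><b>prefix</b><b>suffix</b></p>"


def text_of(node):
    """Concatenate all text nodes under `node`, in document order."""
    if node.nodeType == node.TEXT_NODE:
        return node.data
    return "".join(text_of(child) for child in node.childNodes)


def indent_without_schema(xml):
    """Pretty-print with no knowledge of which elements hold mixed content.

    Assumption: minidom's toprettyxml() stands in for any indenter
    that lacks schema information.
    """
    doc = minidom.parseString(xml)
    return doc.documentElement.toprettyxml(indent="  ")


indented = indent_without_schema(SOURCE)
original_text = text_of(minidom.parseString(SOURCE).documentElement)
indented_text = text_of(minidom.parseString(indented).documentElement)

print(indented)
print(repr(original_text))
print(repr(indented_text))
```

In this sketch the two `b` elements end up on separate lines, so the text content of `p` gains whitespace between "prefix" and "suffix" that was not in the source.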

So, because the stylesheet does not indent, when it removes the titlealts element it leaves in place the newline and indentation that preceded it. Maybe Oxygen's batch format and indent from the Project view could be applied to the converted DITA topics just to remove this extra whitespace.
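
To make the mechanism concrete, here is a small Python sketch, again using `xml.dom.minidom` rather than the actual refactoring stylesheet. The topic is an abbreviated version of the sample from the issue, and `remove_element` with its `drop_leading_whitespace` flag is a hypothetical helper, not something the converter provides. It only removes the element; it does not move `navtitle` into the prolog.

```python
from xml.dom import minidom

# Abbreviated sample topic (shortdesc text shortened for the example).
TOPIC = (
    '<concept id="ID">\n'
    '  <title>Sample JSON project files</title>\n'
    '  <titlealts>\n'
    '    <navtitle>JSON project files</navtitle>\n'
    '  </titlealts>\n'
    '  <shortdesc>Short description.</shortdesc>\n'
    '</concept>'
)


def remove_element(xml, tag, drop_leading_whitespace):
    """Remove every `tag` element; optionally remove the whitespace-only
    text node immediately before it as well (hypothetical cleanup step)."""
    doc = minidom.parseString(xml)
    for node in list(doc.getElementsByTagName(tag)):
        previous = node.previousSibling
        if (drop_leading_whitespace
                and previous is not None
                and previous.nodeType == previous.TEXT_NODE
                and not previous.data.strip()):
            previous.parentNode.removeChild(previous)
        node.parentNode.removeChild(node)
    return doc.documentElement.toxml()


# Removing only the element leaves the newline and indentation behind.
naive = remove_element(TOPIC, "titlealts", drop_leading_whitespace=False)
# Removing the preceding whitespace too avoids the leftover line.
tidy = remove_element(TOPIC, "titlealts", drop_leading_whitespace=True)

print(naive)
print(tidy)
```

The first result contains a line holding only the leftover indentation, as in the observed diff; the second does not. Whether a comparable whitespace-stripping rule would be safe in the stylesheet itself is a separate question, since a whitespace-only text node inside mixed content can be significant.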

infotexture commented 2 years ago

I see. Thanks for explaining.

Then I would say this is working as designed and will close the issue.

It's certainly easy enough to clean up the results with another Format & Indent run, or tools like https://github.com/prettier/plugin-xml.