htacg / tidy-html5

The granddaddy of HTML tools, with support for modern standards
http://www.html-tidy.org

Spaces eaten after inline elements (XML) #818

Open ilmari-lauhakangas opened 5 years ago

ilmari-lauhakangas commented 5 years ago

Reduced test snippet from LibreOffice help:

<?xml version="1.0" encoding="UTF-8"?>
<helpdocument version="1.0">
<meta>
  <topic id="textswriter0102110000xml" indexer="include">
    <title id="tit" xml-lang="en-US">Navigator</title>
    <filename>/text/swriter/01/02110000.xhp</filename>
  </topic>
</meta>
<body>
<section id="navigator">
<paragraph id="hd_id3151177" role="heading" level="1" xml-lang="en-US"><link href="text/swriter/01/02110000.xhp" name="Navigator">Navigator</link></paragraph>
<paragraph id="par_id3149802" role="paragraph" xml-lang="en-US"><ahelp hid=".">Shows or hides the Navigator window, where you can quickly jump to different parts of your document. Navigator is also available as a deck of the Sidebar. You can also use the Navigator to insert elements from the current document or other open documents, and to organize master documents.</ahelp> To edit an item in the Navigator, right-click the item, and then choose a command from the context menu. If you want, you can <link href="text/shared/00/00000005.xhp#andocken" name="dock">dock</link> the Navigator at the edge of your workspace.</paragraph>
</section>
</body>
</helpdocument>

Command: tidy -q -xml -i -w 0 02110000.xhp

Result:

<?xml version="1.0" encoding="utf-8"?>
<helpdocument version="1.0">
  <meta>
    <topic id="textswriter0102110000xml" indexer="include">
      <title id="tit" xml-lang="en-US">Navigator</title>
      <filename>/text/swriter/01/02110000.xhp</filename>
    </topic>
  </meta>
  <body>
    <section id="navigator">
      <paragraph id="hd_id3151177" role="heading" level="1" xml-lang="en-US">
        <link href="text/swriter/01/02110000.xhp" name="Navigator">Navigator</link>
      </paragraph>
      <paragraph id="par_id3149802" role="paragraph" xml-lang="en-US">
      <ahelp hid=".">Shows or hides the Navigator window, where you can quickly jump to different parts of your document. Navigator is also available as a deck of the Sidebar. You can also use the Navigator to insert elements from the current document or other open documents, and to organize master documents.</ahelp>To edit an item in the Navigator, right-click the item, and then choose a command from the context menu. If you want, you can 
      <link href="text/shared/00/00000005.xhp#andocken" name="dock">dock</link>the Navigator at the edge of your workspace.</paragraph>
    </section>
  </body>
</helpdocument>

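Notice that the spaces after the closing ahelp and link tags are gone: "</ahelp> To edit" becomes "</ahelp>To edit", and "</link> the Navigator" becomes "</link>the Navigator". Tested with tidy version 5.7.16 on Arch Linux.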
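For anyone who wants to confirm the loss without eyeballing the output, here is a minimal sketch in Python using only the standard library. It is not part of tidy; the helper name normalized_text and the two shortened one-paragraph strings are illustrative assumptions cut down from the snippet and the result above. It flattens each document to its text content, collapses whitespace runs, and compares the two.

```python
import xml.etree.ElementTree as ET


def normalized_text(xml_source: str) -> str:
    """Return the document's text content with whitespace runs collapsed.

    Hypothetical helper for illustration; not part of tidy.
    """
    root = ET.fromstring(xml_source)
    # itertext() yields element text and tail text in document order.
    return " ".join("".join(root.itertext()).split())


# Shortened from the original input paragraph (space after </link>).
before = (
    '<paragraph>If you want, you can '
    '<link href="text/shared/00/00000005.xhp#andocken" name="dock">dock</link>'
    ' the Navigator.</paragraph>'
)

# Shortened from the tidy result (line break before <link>, no space after </link>).
after = (
    '<paragraph>If you want, you can \n'
    '      <link href="text/shared/00/00000005.xhp#andocken" name="dock">dock</link>'
    'the Navigator.</paragraph>'
)

print(normalized_text(before))  # If you want, you can dock the Navigator.
print(normalized_text(after))   # If you want, you can dockthe Navigator.
print(normalized_text(before) == normalized_text(after))  # False
```

The line break tidy adds before the link element disappears under whitespace normalization, so the only remaining difference is the space that was removed after the closing tag.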

geoffmcl commented 5 years ago

@ilmari-lauhakangas thanks for raising this XML issue...

I can see the missing spaces in the tidy output, and certainly in a browser rendering... for example, in my smaller sample, built from yours, "dock the" becomes "dockthe"... it does not look right...

Maybe it will be a similar fix to that for the html in #212... not sure...

Look forward to further feedback, especially patches, or PR, ... to address this issue... thanks...