dita-ot / dita-ot

DITA Open Toolkit — the open-source publishing engine for content authored in the Darwin Information Typing Architecture.
https://www.dita-ot.org
Apache License 2.0

Inconsistent treatment of trademarks in HTML and PDF outputs #3798

Open raducoravu opened 3 years ago

raducoravu commented 3 years ago

TestTMElement.zip

Feedback from one of our end users:

Recently we ran into an issue in DITA-OT 3.6.1 where the tm superscript is not printed when a value is repeated in the trademark attribute and element data. I have attached sample content that demonstrates the issue. It only happens in HTML-based output. The processing somehow “remembers” values used before in the trademark attribute and element data, and the symbol is not printed for any repeat of such an existing combination.

For example in cases like this:

<topic id="test" xml:lang="en-us">
  <title>Test case for TM element</title>
  <body>
    <p>
      <tm tmtype="tm" trademark="ABCY">ABC3</tm>
    </p>
    <p>
      <tm tmtype="tm" trademark="ABCY">ABC4</tm>
    </p>
  </body>
</topic>

in the HTML-based outputs the second trademark symbol is skipped, although it applies to different text. For PDF, both symbols are output.
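The difference described above can be modelled outside the toolkit. The following Python sketch is not DITA-OT code. It assumes that the HTML-based outputs suppress the symbol whenever an earlier tm in the topic body has the same trademark attribute value, and that PDF always emits it. The function names `html_like` and `pdf_like` are hypothetical, and elements are matched by name rather than by @class for brevity.

```python
import xml.etree.ElementTree as ET

SAMPLE = """<topic id="test" xml:lang="en-us">
  <title>Test case for TM element</title>
  <body>
    <p><tm tmtype="tm" trademark="ABCY">ABC3</tm></p>
    <p><tm tmtype="tm" trademark="ABCY">ABC4</tm></p>
  </body>
</topic>"""


def html_like(topic_xml):
    """Assumed HTML behaviour: skip the symbol when an earlier tm in the
    body has the same @trademark value, whatever its text content.
    Assumes every tm carries @trademark and that tm elements are not nested."""
    body = ET.fromstring(topic_xml).find("body")
    seen = set()
    result = []
    for tm in body.iter("tm"):  # iter() walks in document order
        key = tm.get("trademark")
        result.append((tm.text, key not in seen))
        seen.add(key)
    return result


def pdf_like(topic_xml):
    """Assumed PDF behaviour: every tm gets its symbol."""
    body = ET.fromstring(topic_xml).find("body")
    return [(tm.text, True) for tm in body.iter("tm")]


print(html_like(SAMPLE))  # [('ABC3', True), ('ABC4', False)]
print(pdf_like(SAMPLE))   # [('ABC3', True), ('ABC4', True)]
```

Under these assumptions, ABC4 loses its symbol in the HTML-like model only because it shares trademark="ABCY" with ABC3.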

Looking at how the XSLT stylesheets for HTML5 publishing process the element (dita-ot/plugins/org.dita.html5/xsl/topic.xsl), in the template:

 <xsl:template match="*[contains(@class, ' topic/tm ')]" name="topic.tm">

it seems to implement a behavior in which subsequent trademark symbols are not output. For PDF output, all trademark symbols are output. Our client expects the HTML output to behave the same way.

The DITA 1.3 specification says nothing about how trademark symbols should be interpreted in the published output.

infotexture commented 3 years ago

@raducoravu See https://github.com/dita-ot/dita-ot/issues/2065 for related discussion.

raducoravu commented 3 years ago

@infotexture So even with what was done in #2065, the inconsistency between HTML and PDF still exists: in HTML the second tm is usually ignored, whereas before the fix for #2065 the tm was always ignored in HTML output for certain languages.

raducoravu commented 1 year ago

👍 https://www.oxygenxml.com/forum/viewtopic.php?p=70576#p70576

  Is there a setting which will allow us to display the trademark symbol only for the first occurrence of the product name in each topic in the documentation?

In the "org.dita.pdf2/xsl/fo/topic.xsl" this template always returns true:

<xsl:template match="node() | @*" mode="tm-scope" as="xs:boolean" priority="-10">
  <xsl:sequence select="true()"/>
</xsl:template>  
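The forum request amounts to a different scoping rule from the always-true default quoted above. As a language-neutral illustration (Python, not an XSLT override, and not the plugin's actual code), the sketch below treats the scope decision as a replaceable predicate. `always` mirrors the default that returns true for every element, while `first_per_topic` is a hypothetical alternative that returns true only for the first tm with a given text in each topic. The product name "Acme" and the sample document are made up for the example.

```python
import xml.etree.ElementTree as ET

DOC = """<dita>
  <topic id="a"><title>A</title><body>
    <p><tm tmtype="reg">Acme</tm> and <tm tmtype="reg">Acme</tm></p>
  </body></topic>
  <topic id="b"><title>B</title><body>
    <p><tm tmtype="reg">Acme</tm></p>
  </body></topic>
</dita>"""


def always(tm, seen):
    """Mirrors the quoted default: the symbol is always in scope."""
    return True


def first_per_topic(tm, seen):
    """Hypothetical rule: only the first tm with this text in a topic."""
    return tm.text not in seen


def render(doc_xml, in_scope):
    """Return (topic id, tm text, symbol emitted?) for every tm.
    Assumes sibling topics only; nested topics are not handled."""
    rows = []
    for topic in ET.fromstring(doc_xml).iter("topic"):
        seen = set()  # reset the memory for each topic
        for tm in topic.iter("tm"):
            rows.append((topic.get("id"), tm.text, in_scope(tm, seen)))
            seen.add(tm.text)
    return rows


print(render(DOC, always))
# [('a', 'Acme', True), ('a', 'Acme', True), ('b', 'Acme', True)]
print(render(DOC, first_per_topic))
# [('a', 'Acme', True), ('a', 'Acme', False), ('b', 'Acme', True)]
```

Keying on the element text rather than on @trademark is a design choice of this sketch, made because the request speaks of the product name.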

raducoravu commented 1 year ago

The HTML algorithm that attempts to avoid outputting repeated tm symbols in a topic seems to have problems as well, for example here in "org.dita.html5/xsl/topic.xsl":

 <xsl:when test="preceding::*[contains(@class, ' topic/tm ')][@trademark = $tmvalue][ancestor::*[contains(@class, ' topic/body ') or contains(@class, ' topic/shortdesc ')]]">skip</xsl:when>

If my topic contains something like this, the preceding::* test will not find any other tm elements, as they are in different paragraphs:

        <!DOCTYPE topic PUBLIC "-//OASIS//DTD DITA Topic//EN" "topic.dtd">
        <topic id="introduction">
            <title>Introduction</title>
            <body>
                <p><tm tmtype="tm">Merative</tm></p>
                <p><tm tmtype="tm">Merative</tm></p>
                <p><tm tmtype="tm">Merative</tm></p>
            </body> 
        </topic>
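One detail of this sample is worth isolating: none of the three tm elements carries a trademark attribute. In XPath, an `=` comparison with an empty node-set on either side is false, so the `[@trademark = $tmvalue]` predicate in the quoted test cannot match any of these elements, whatever `$tmvalue` holds. The Python sketch below models only that comparison rule. `xpath_eq` and `skipped` are hypothetical helpers, not DITA-OT code, and the sketch assumes that `$tmvalue` is the string value of the current element's @trademark and that the quoted `xsl:when` is the only skip condition.

```python
import xml.etree.ElementTree as ET

SAMPLE = """<topic id="introduction">
  <title>Introduction</title>
  <body>
    <p><tm tmtype="tm">Merative</tm></p>
    <p><tm tmtype="tm">Merative</tm></p>
    <p><tm tmtype="tm">Merative</tm></p>
  </body>
</topic>"""


def xpath_eq(attr_value, other):
    """Model of XPath '=' with an attribute on the left: a missing
    attribute (None) is an empty node-set, so the result is False."""
    return attr_value is not None and attr_value == other


def skipped(topic_xml):
    """For each tm in the body, report whether an earlier tm satisfies
    the modelled [@trademark = $tmvalue] predicate."""
    tms = list(ET.fromstring(topic_xml).find("body").iter("tm"))
    flags = []
    for i, tm in enumerate(tms):
        tmvalue = tm.get("trademark", "")  # assumed: string value of @trademark
        flags.append(any(xpath_eq(p.get("trademark"), tmvalue) for p in tms[:i]))
    return flags


print(skipped(SAMPLE))  # [False, False, False]
```

In this model, adding trademark="Merative" to each element makes the predicate match for the second and third occurrences, giving `[False, True, True]`.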