SASDigitalHumanitiesTraining / TextEncoding

Text Encoding for Ancient and Modern Literature, Languages and History

XSLT to add @xml:id to elements that don't have them? #32

Open bitparity opened 2 years ago

bitparity commented 2 years ago

Here is a sample XML file:

<body>
   <text>
      <p xml:id="l-01">line 1</p>
      <p>line 2</p>
      <p>line 3</p>
      <p>line 4</p>
   </text>
</body>

What I'm after is an XSLT stylesheet that looks at every <p> element, checks whether it has an @xml:id, and generates and adds one if it doesn't.

The transformed XML file would look like the one below, with the understanding that the generated @xml:id values could take some other form (to keep things simple).

<body>
   <text>
      <p xml:id="l-01">line 1</p>
      <p xml:id="l-02">line 2</p>
      <p xml:id="l-03">line 3</p>
      <p xml:id="l-04">line 4</p>
   </text>
</body>

gabrielbodard commented 2 years ago

Suggested answer that may get you part of the way there:

<?xml version='1.0'?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:tei="http://www.tei-c.org/ns/1.0" version="2.0">

    <!-- @version here is the XML version of the output document, not the XSLT version -->
    <xsl:output method="xml" version="1.0" encoding="UTF-8"/>

    <!-- Copy every TEI element with its attributes, then process its children -->
    <xsl:template match="tei:*">
        <xsl:copy>
            <xsl:copy-of select="@*"/>
            <xsl:apply-templates/>
        </xsl:copy>
    </xsl:template>

    <!-- A <p> with no @xml:id gets a generated one; its other attributes are kept -->
    <xsl:template match="tei:p[not(@xml:id)]">
        <xsl:copy>
            <xsl:attribute name="xml:id" select="generate-id()"/>
            <xsl:copy-of select="@*"/>
            <xsl:apply-templates/>
        </xsl:copy>
    </xsl:template>

</xsl:stylesheet>
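
If sequential IDs in the style of the sample (l-02, l-03, and so on) are wanted rather than the opaque values that generate-id() returns, one possible variation is to number the paragraphs with xsl:number. This is only a sketch: the "l-" prefix and two-digit padding are taken from the sample, and it assumes that any existing IDs already follow the same document-order numbering, otherwise duplicate IDs could result.

```xml
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:tei="http://www.tei-c.org/ns/1.0" version="2.0">

    <xsl:output method="xml" version="1.0" encoding="UTF-8"/>

    <!-- Copy every TEI element with its attributes, then process its children -->
    <xsl:template match="tei:*">
        <xsl:copy>
            <xsl:copy-of select="@*"/>
            <xsl:apply-templates/>
        </xsl:copy>
    </xsl:template>

    <!-- Build the ID from a fixed prefix plus the paragraph's position
         among all tei:p elements in the document (assumed scheme) -->
    <xsl:template match="tei:p[not(@xml:id)]">
        <xsl:copy>
            <xsl:attribute name="xml:id">
                <xsl:text>l-</xsl:text>
                <xsl:number count="tei:p" level="any" format="01"/>
            </xsl:attribute>
            <xsl:copy-of select="@*"/>
            <xsl:apply-templates/>
        </xsl:copy>
    </xsl:template>

</xsl:stylesheet>
```

With level="any" the count runs over every tei:p in the document, including those that already have an @xml:id, so the second paragraph of a TEI-namespaced version of the sample would come out as l-02.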

cmohge1 commented 2 years ago

For more info on generate-id(): https://www.w3schools.com/xml/func_generateid.asp

bitparity commented 2 years ago

One more question: how would I go about copying the processing instructions? Most of the documents I'll be working with will be in TEI and have namespace declarations.

gabrielbodard commented 2 years ago

My "copy-all" stylesheets usually include the additional template:

<xsl:template match="processing-instruction() | comment()">
    <!-- xsl:copy on a comment or processing instruction copies the whole node, content included -->
    <xsl:copy/>
</xsl:template>

(Which copies comments as well as processing instructions, as is probably obvious.)
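
Putting the pieces of this thread together, a complete stylesheet might look something like the sketch below. Note that it is not what was posted above: it swaps the separate element and comment/processing-instruction templates for the common identity template on `@* | node()`. It also assumes the input is in the TEI namespace; for a file with no namespace, such as the sample at the top, the match would need to be `p[not(@xml:id)]` instead.

```xml
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:tei="http://www.tei-c.org/ns/1.0" version="2.0">

    <xsl:output method="xml" version="1.0" encoding="UTF-8"/>

    <!-- Identity template: copies elements, attributes, text,
         comments and processing instructions unchanged -->
    <xsl:template match="@* | node()">
        <xsl:copy>
            <xsl:apply-templates select="@* | node()"/>
        </xsl:copy>
    </xsl:template>

    <!-- Override for TEI <p> elements that lack @xml:id: add a generated ID,
         then copy the remaining attributes and children as usual -->
    <xsl:template match="tei:p[not(@xml:id)]">
        <xsl:copy>
            <xsl:attribute name="xml:id" select="generate-id()"/>
            <xsl:apply-templates select="@* | node()"/>
        </xsl:copy>
    </xsl:template>

</xsl:stylesheet>
```

The more specific tei:p template takes precedence over the identity template because of its higher default priority, and xsl:copy carries each element's namespace declarations over to the output, so the TEI namespace should survive the transformation without extra handling.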