TEIC / Stylesheets

TEI XSL Stylesheets

Default conversion docx to tei broken by redundant xml:id #610

Closed: lb42 closed this issue 8 months ago

lb42 commented 1 year ago

By default, running docxtotei on a bunch of docx files now inserts the following lines into the header of each one:

<appInfo>
  <application xml:id="docxtotei" ident="TEI_fromDOCX" version="2.15.0">
    <label>DOCX to TEI</label>
  </application>
</appInfo>

I am not sure what the version number is for, though it doesn't seem to be that of the Stylesheets package. More seriously, the presence of an @xml:id here means that I cannot process a bunch of converted files all together, since the @xml:id values won't be unique. What is the point of this @xml:id anyway? It simply repeats information already given by the @ident attribute.
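The clash is easy to reproduce outside the Stylesheets. The following is a minimal sketch in Python, not part of the package: the function name `find_duplicate_ids` and the `APP_INFO` sample string are made up for illustration. It collects every `xml:id` across several converted documents and reports the values that occur more than once, which is the situation that arises when the files are processed together.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Qualified name under which ElementTree exposes the xml:id attribute.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def find_duplicate_ids(documents):
    """Return the sorted xml:id values that occur more than once across all documents."""
    counts = Counter()
    for text in documents:
        root = ET.fromstring(text)
        for element in root.iter():
            value = element.get(XML_ID)
            if value is not None:
                counts[value] += 1
    return sorted(value for value, n in counts.items() if n > 1)


# Hypothetical stand-in for the appInfo fragment that each converted file receives.
APP_INFO = (
    "<appInfo>"
    '<application xml:id="docxtotei" ident="TEI_fromDOCX" version="2.15.0">'
    "<label>DOCX to TEI</label>"
    "</application>"
    "</appInfo>"
)

if __name__ == "__main__":
    # Two converted files carry the same fixed identifier.
    print(find_duplicate_ids([APP_INFO, APP_INFO]))
```

Because the identifier is a fixed string rather than something derived from the input file, any two converted files will collide in this way.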

peterstadler commented 1 year ago

I think the xml:id="docxtotei" is an anchor for the following error code: https://github.com/TEIC/Stylesheets/blob/a394b17a5951f0c24f1fbe547df9ab61d60f1951/docx/from/functions.xsl#L212-L216

NB, I believe the @resp is misspelled here as "teitodocx", but the rationale seems to be to output conversion errors in the TEI file and point at the responsible agent.
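A mismatch like this can be caught mechanically. Below is a rough sketch in Python, assuming the error marker carries a local `@resp` pointer of the form `#name`. The function name `dangling_resp_pointers` and the `<note>` element in the sample are illustrative assumptions, not what functions.xsl actually emits.

```python
import xml.etree.ElementTree as ET

# Qualified name under which ElementTree exposes the xml:id attribute.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def dangling_resp_pointers(text):
    """Return local @resp pointers (e.g. '#foo') with no matching xml:id in the document."""
    root = ET.fromstring(text)
    known_ids = {el.get(XML_ID) for el in root.iter() if el.get(XML_ID) is not None}
    dangling = []
    for el in root.iter():
        # @resp may hold several whitespace-separated pointers.
        for pointer in el.get("resp", "").split():
            if pointer.startswith("#") and pointer[1:] not in known_ids:
                dangling.append(pointer)
    return dangling


# Hypothetical sample: the application is declared as "docxtotei",
# but the error marker points at "teitodocx".
SAMPLE = (
    "<TEI>"
    '<application xml:id="docxtotei" ident="TEI_fromDOCX"/>'
    '<note resp="#teitodocx">conversion error</note>'
    "</TEI>"
)

if __name__ == "__main__":
    print(dangling_resp_pointers(SAMPLE))
```

With the pointer spelled "#docxtotei" instead, the same check returns an empty list.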

peterstadler commented 1 year ago

I was also looking at the version issue: I think we could propagate the version info from the shell script https://github.com/TEIC/Stylesheets/blob/a394b17a5951f0c24f1fbe547df9ab61d60f1951/bin/transformtei#L108 to the ANT file and on to docxtotei.xsl. However, I'm unsure how to set a proper fallback for users running ant (or the stylesheet) directly.
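The fallback logic itself is simple, wherever it ends up living. Here is a language-neutral sketch in Python, not a proposal for the actual shell/ANT/XSLT wiring. It assumes the VERSION file holds the version string on its first non-empty line; the function name `read_version` and the "unknown" default are hypothetical.

```python
from pathlib import Path


def read_version(version_file, fallback="unknown"):
    """Return the first non-empty line of version_file, or fallback if it is unavailable."""
    try:
        lines = Path(version_file).read_text(encoding="utf-8").splitlines()
    except (OSError, UnicodeDecodeError):
        # File missing or unreadable, e.g. the stylesheet is run on its own.
        return fallback
    for line in lines:
        if line.strip():
            return line.strip()
    # File exists but is empty or whitespace only.
    return fallback


if __name__ == "__main__":
    # "VERSION" is resolved against the current directory here; adjust as needed.
    print(read_version("VERSION"))
```

The point of the design is that a caller who bypasses the shell script still gets a well-defined, clearly marked value rather than a stale hard-coded number.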

sydb commented 9 months ago

Stylesheets/docx/from/docxtotei.xsl line ~620 needs to have "2.15.0" changed to something that reads the Stylesheets/VERSION file. Stylesheets/docx/from/functions.xsl line ~213 needs to have "#docxtotei" (or whatever the right thing is) instead of "#teitodocx".