scotartt / commentarius

De Commentariis is a web app for creating social annotations and commentaries on ancient texts in TEI-compliant formats.
GNU General Public License v2.0

xslt transformation #1

Closed (scotartt closed 10 years ago)

scotartt commented 10 years ago

Even in properly formatted texts (e.g. where the metadata is clearly defined and already parsed from the header), there are issues with how the text itself is marked up.

For example, consider poetry, such as Horace's Carmina:

<div1 type="Book" n="1">
<div2 type="Poem" n="1" met="First Asclepiadean">
<l>Maecenas atavis edite regibus,</l>
<l>o et praesidium et dulce decus meum:</l>
<l>sunt quos curriculo pulverem Olympicum</l>

Each individual line is held in an "l" tag. Even within the one file there are other divisions: this poem uses the "lg" tag to group lines into stanzas:

<div2 type="Poem" n="2" met="Sapphic Strophe">
<lg type="stanza"><l>Iam satis terris nivis atque dirae</l>
<l>grandinis misit pater et rubente</l>
<l>dextera sacras iaculatus arcis</l>
<l>terruit urbem,</l></lg>
<lg><l>terruit gentis, grave ne rediret</l>
<l>saeculum Pyrrhae nova monstra questae,</l>
<l>omne cum Proteus pecus egit altos</l>
<l>visere montis,</l></lg>

Elements frequently omit the "type" attribute (the second "lg" above has none), and texts often number only every fifth or tenth line. The example below is urn:cts:latinLit:phi0893.phi004.perseus-lat1:1.1:

metadata

<refsDecl doctype="TEI.2">
    <step refunit="book" from="DESCENDANT (1 DIV1 N %1)"/>
    <step delim="." refunit="poem" from="DESCENDANT (1 DIV2 N %2)" n="chunk"/>
    <step refunit="line" from="DESCENDANT (1 L N %3)"/>
</refsDecl>

content

<div1 type="book" n="1">
<div2 type="poem" n="1">
<head>Ad Maecenatem:  Omnibus, maxime vero avaris, sortem suam gravem esse.</head>
<l>Qui fit, Maecenas, ut nemo, quam sibi sortem </l>
<l>seu Ratio dederit seu Fors obiecerit, illa </l>
<l>contentus vivat, laudet diversa sequentis? </l>
<l>'o fortunati mercatores' gravis annis </l>
<l n="5">miles ait, multo iam fractus membra labore.</l>
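
Because only some lines carry an explicit n attribute, a display layer has to infer the numbers in between. The following is a minimal sketch of one way to do that, written in Python with the standard library rather than XSLT. The function name `number_lines` and the rule "an explicit numeric n resets the counter, otherwise add one" are assumptions for illustration, not something the source texts guarantee.

```python
import xml.etree.ElementTree as ET

# Fragment copied from the example above: only the fifth line carries n="5".
SAMPLE = """<div2 type="poem" n="1">
<head>Ad Maecenatem:  Omnibus, maxime vero avaris, sortem suam gravem esse.</head>
<l>Qui fit, Maecenas, ut nemo, quam sibi sortem </l>
<l>seu Ratio dederit seu Fors obiecerit, illa </l>
<l>contentus vivat, laudet diversa sequentis? </l>
<l>'o fortunati mercatores' gravis annis </l>
<l n="5">miles ait, multo iam fractus membra labore.</l>
</div2>"""


def number_lines(poem):
    """Return (line_number, text) pairs for every <l> in document order.

    Assumed rule: an explicit numeric n attribute resets the counter;
    any other line is numbered one higher than the line before it.
    """
    numbered = []
    current = 0
    for line in poem.iter("l"):
        n = line.get("n")
        if n is not None and n.isdigit():
            current = int(n)
        else:
            current += 1
        numbered.append((current, "".join(line.itertext()).strip()))
    return numbered


poem = ET.fromstring(SAMPLE)
lines = number_lines(poem)
for number, text in lines:
    print(number, text)
```

Non-numeric n values (if a text uses them) fall through to the simple increment here; a real implementation would need to decide how to treat them.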
scotartt commented 10 years ago

Need an XSLT transform for this so that "book 1" and "poem 1" are available as body text, and the div1 and div2 tags become regular divs with appropriate CSS styles (suggestion: "book" and "poem" are good candidates for the names).

The "l" and "lg" tags need to be appropriately transformed (use a span with a style for "l").

Also need to check out "stanza", "pb" (page break) and "milestone", e.g.:
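
The mapping described above can be sketched as follows. This is not the XSLT stylesheet itself: it is a stand-in written in Python with the standard library, to show the intended shape of the output. The `TAG_MAP` table, the `to_html` function, and the class names "division", "linegroup", "line" and "heading" are all hypothetical. The fallback classes matter because "type" is often missing.

```python
import xml.etree.ElementTree as ET

# Fragment taken from the Carmina example earlier in this issue.
SAMPLE = """<div1 type="Book" n="1">
<div2 type="Poem" n="2" met="Sapphic Strophe">
<lg type="stanza"><l>Iam satis terris nivis atque dirae</l>
<l>grandinis misit pater et rubente</l>
<l>dextera sacras iaculatus arcis</l>
<l>terruit urbem,</l></lg>
</div2>
</div1>"""

# Hypothetical mapping: TEI tag -> (HTML tag, fallback CSS class).
TAG_MAP = {
    "div1": ("div", "division"),
    "div2": ("div", "division"),
    "lg": ("div", "linegroup"),
    "l": ("span", "line"),
}


def to_html(src):
    """Recursively copy a TEI element into an HTML div/span with a class."""
    html_tag, fallback = TAG_MAP.get(src.tag, ("span", src.tag))
    # Prefer the lower-cased type attribute ("Book" -> "book") as the class.
    css_class = (src.get("type") or fallback).lower()
    out = ET.Element(html_tag, {"class": css_class})
    out.text = src.text
    out.tail = src.tail
    # Surface "Book 1" / "Poem 2" as visible body text.
    if src.tag in ("div1", "div2") and src.get("type") and src.get("n"):
        heading = ET.SubElement(out, "span", {"class": "heading"})
        heading.text = f"{src.get('type')} {src.get('n')}"
        heading.tail = src.text
        out.text = None
    for child in src:
        out.append(to_html(child))
    return out


html = ET.tostring(to_html(ET.fromstring(SAMPLE)), encoding="unicode", method="html")
print(html)
```

Since each line becomes a span, the stylesheet would need something like `display: block` on the line class to keep the verse layout.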

<milestone ed="p" n="1" unit="card"/>

(This is in a text whose metadata is declared as follows.)

<encodingDesc>
    <refsDecl doctype="TEI.2">
        <state unit="line"/>
    </refsDecl>
    <refsDecl doctype="TEI.2">
        <state n="chunk" unit="card"/>
    </refsDecl>
</encodingDesc>
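
One possible treatment of "milestone" and "pb" is to replace each empty element with an empty span that CSS can style or hide. The sketch below is in Python with the standard library, not XSLT. The sample fragment, the `marker_for` and `replace_markers` functions, the "marker" class and the `data-n` attribute are assumptions for illustration. Only the milestone element itself is taken from the text above.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: the milestone is from this issue, the rest is filler.
SAMPLE = '<p><milestone ed="p" n="1" unit="card"/>placeholder text<pb n="2"/></p>'


def marker_for(elem):
    """Build an empty HTML span standing in for a <milestone> or <pb>."""
    # Use the unit ("card") when present, otherwise the tag name ("pb").
    kind = elem.get("unit") or elem.tag
    attrs = {"class": f"marker {kind}"}
    if elem.get("n"):
        attrs["data-n"] = elem.get("n")
    span = ET.Element("span", attrs)
    span.tail = elem.tail  # keep the text that followed the marker
    return span


def replace_markers(root):
    """Swap every milestone/pb for its span, in place."""
    # Snapshot the parents first so the tree is not mutated mid-iteration.
    for parent in list(root.iter()):
        for index, child in enumerate(list(parent)):
            if child.tag in ("milestone", "pb"):
                parent.remove(child)
                parent.insert(index, marker_for(child))
    return root


root = replace_markers(ET.fromstring(SAMPLE))
print(ET.tostring(root, encoding="unicode", method="html"))
```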
scotartt commented 10 years ago

Line numbers and other in-text annotations can now be displayed. Added CSS for many of the most egregious offenders in the XML markup.