Open annettegessner opened 8 years ago
Drama should be cited by line. That does not mean that choruses and other divs should not be in the XML: we simply do not use them in the final XPath /TEI/TEXT/BODY/DIV//L[@n="$1"]
:)
Cool, so this means we can leave the files as they are, as long as the lines are there, and we just have to change the XPath, right? B-)
So the refsDecl for all drama should look like this:
<cRefPattern n="line" matchPattern="(.+)" replacementPattern="#xpath(/tei:TEI/tei:text/tei:body/tei:div//tei:l[@n='$1'])"/>
FYI @planatheisa (EDITED according to Thibault's next comment)
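To make the effect of that cRefPattern concrete, here is a minimal sketch of how a line reference could be turned into a lookup. It is not how any CTS resolver is actually implemented; it only uses Python's standard library, and the sample document, `MATCH_PATTERN`, `REPLACEMENT` and `resolve_line` are hypothetical names made up for illustration.

```python
import re
import xml.etree.ElementTree as ET

TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Hypothetical miniature document, only for illustration.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text><body>
    <div type="edition">
      <div type="textpart" subtype="chorus">
        <sp><speaker>NAME</speaker>
          <l n="1">first line</l>
          <l n="2">second line</l>
        </sp>
      </div>
    </div>
  </body></text>
</TEI>"""

# matchPattern and replacementPattern from the refsDecl,
# with the #xpath(...) wrapper removed.
MATCH_PATTERN = r"(.+)"
REPLACEMENT = "/tei:TEI/tei:text/tei:body/tei:div//tei:l[@n='$1']"


def resolve_line(root, ref):
    """Return the <l> elements that the reference `ref` points to."""
    match = re.fullmatch(MATCH_PATTERN, ref)
    if match is None:
        return []
    xpath = REPLACEMENT.replace("$1", match.group(1))
    # ElementTree paths are relative to the root element (tei:TEI),
    # so the leading /tei:TEI step is replaced by "./".
    relative = "./" + xpath[len("/tei:TEI/"):]
    return root.findall(relative, TEI_NS)


root = ET.fromstring(SAMPLE)
hits = resolve_line(root, "2")
print([l.text for l in hits])
```

Because of the `//` step, the intermediate divs and `<sp>` elements can stay in the file; they are simply skipped when the line is looked up.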
And maybe you can drop @type="edition"? :)
By the way @annettegessner, do you know that opening a code fence with three backticks plus `xml` actually formats the snippet as XML? For example, `<a b="1">` is then highlighted as XML.
Ooops! I am trying this for those two files right now and if it works, I'll do that for ALL our files, opening a new issue.
Niiice, thanks for the tip with ```xml!
This may be relevant I believe: there are a handful of instances in drama where two characters will speak on the same line. There are two in the Seven Against Thebes, for instance. Because we're nesting every line within a
About duplicate line numbering: if two lines wrongly share the same line number, merge them into a single <l> and mark the break with <lb/>, e.g. replace
<l n="236">δην, στόματός τε καλλιπρῴ-</l>
<l n="236">ρου φυλακᾷ κατασχεῖν</l>
with
<l n="236">δην, στόματός τε καλλιπρῴ-
<lb/>ρου φυλακᾷ κατασχεῖν</l>
This way of fixing issues like this has been approved by Thibault.
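To find such duplicates before the tests complain, a small check along these lines might help. This is only a sketch using Python's standard library, not part of any existing tool; `duplicate_line_numbers` and the two sample strings are hypothetical and only wrap the example above in a minimal TEI skeleton.

```python
from collections import Counter
import xml.etree.ElementTree as ET

TEI = "{http://www.tei-c.org/ns/1.0}"

# The example above, wrapped in a minimal (hypothetical) TEI skeleton.
BEFORE = """<TEI xmlns="http://www.tei-c.org/ns/1.0"><text><body><div>
<l n="236">δην, στόματός τε καλλιπρῴ-</l>
<l n="236">ρου φυλακᾷ κατασχεῖν</l>
</div></body></text></TEI>"""

AFTER = """<TEI xmlns="http://www.tei-c.org/ns/1.0"><text><body><div>
<l n="236">δην, στόματός τε καλλιπρῴ-
<lb/>ρου φυλακᾷ κατασχεῖν</l>
</div></body></text></TEI>"""


def duplicate_line_numbers(xml_text):
    """Return the sorted @n values that occur on more than one <l>."""
    root = ET.fromstring(xml_text)
    counts = Counter(
        line.get("n") for line in root.iter(TEI + "l") if line.get("n") is not None
    )
    return sorted(n for n, count in counts.items() if count > 1)


print(duplicate_line_numbers(BEFORE))
print(duplicate_line_numbers(AFTER))
```

Note that this counts @n across the whole file, so it would also flag plays whose sections are numbered separately on purpose.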
About two characters speaking on the same line: I like @jduff-chs's solution of adding a and b to the line numbers. If the others are alright with this, I would keep it this way.
This solution is CTS-compliant. The question is whether people will be able to find lines 980a and 980b when trying to cite them using CTS. I think I might lean more towards keeping them in a single <l> tag, something like this:
<l n="980">
<sp>
<speaker>NAME</speaker>
TEXT
</sp>
<sp>
<speaker>NAME</speaker>
TEXT
</sp>
</l>
That may not validate because <sp> is not allowed under <l> or because there can be no TEXT directly under an <sp> tag. The second would be easy to fix by putting the TEXT inside another tag, e.g., <seg>, <p>, or even <l>. The first would be more difficult to solve.
What problems am I not seeing with my solution?
@sonofmun I believe the first issue may be true - based on the way the structure is defined, it was my understanding that all
It is indeed an issue with the way
A note as discussed in PR #271: in some cases, two sections of a drama will be numbered separately (in this instance, the Prologus and the main body of the play). To avoid duplicate line numbers and a failure in HookTest, separate these two sections as a layer of divs below the edition div, then use a refsDecl as follows:
<refsDecl n="CTS">
<cRefPattern n="line" matchPattern="(.+)\.(.+)" replacementPattern="#xpath(/tei:TEI/tei:text/tei:body/tei:div/tei:div[@n='$1']//tei:l[@n='$2'])"/>
<cRefPattern n="section" matchPattern="(.+)" replacementPattern="#xpath(/tei:TEI/tei:text/tei:body/tei:div/tei:div[@n='$1'])"/>
</refsDecl>
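As a rough illustration of the two-level scheme, the sketch below resolves a two-part reference of the form section.line, where the first part selects the inner div and the second the line within it. It assumes a dot as the separator and uses only Python's standard library; the sample document, the @n values `prologus` and `play`, and the `resolve` helper are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Hypothetical document with two separately numbered sections.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text><body>
    <div type="edition">
      <div type="textpart" subtype="section" n="prologus">
        <l n="1">prologue, line 1</l>
      </div>
      <div type="textpart" subtype="section" n="play">
        <l n="1">play, line 1</l>
      </div>
    </div>
  </body></text>
</TEI>"""

# (pattern, path) pairs, most specific first. The paths are written
# relative to the root element because ElementTree has no absolute paths.
PATTERNS = [
    (r"(.+)\.(.+)", "./tei:text/tei:body/tei:div/tei:div[@n='$1']//tei:l[@n='$2']"),
    (r"(.+)", "./tei:text/tei:body/tei:div/tei:div[@n='$1']"),
]


def resolve(root, ref):
    """Resolve 'section.line' or 'section' to the matching elements."""
    for pattern, template in PATTERNS:
        match = re.fullmatch(pattern, ref)
        if match:
            path = template
            for index, value in enumerate(match.groups(), start=1):
                path = path.replace(f"${index}", value)
            return root.findall(path, TEI_NS)
    return []


root = ET.fromstring(SAMPLE)
print([el.text for el in resolve(root, "prologus.1")])
print([el.text for el in resolve(root, "play.1")])
```

Both sections contain a line numbered 1, but each reference resolves to exactly one element because the section part narrows the search first.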
Note: In cases like Lucian Vol. 2, we decided on the following structure:
<div type="textpart" subtype="section" n="1">
<sp>
<speaker>ΧΑΡΩΝ</speaker>
<p>Εἶεν, ὦ Κλωθοῖ, το […] </p>
</sp>
</div>
@sonofmun wrote: "I think we need an <sp> tag to show it is a speech. Then we need the <speaker> tag to show who is speaking. And then we need a <p> tag to contain the actual text of the speech."
FYI, we had a discussion here, but it may or may not be applicable, and not sure we ever got around to documenting this as thoroughly as needed. https://github.com/PerseusDL/canonical-greekLit/pull/129
see #103 and #81