OpenGreekAndLatin / First1KGreek

XML files for the works in the First Thousand Years of Greek Project. Please see our Wiki on how to contribute.
https://opengreekandlatin.github.io/First1KGreek/
Creative Commons Attribution Share Alike 4.0 International

How to handle drama? #210

Open annettegessner opened 8 years ago

annettegessner commented 8 years ago

see #103 and #81

PonteIneptique commented 8 years ago

Drama should be cited by line. That does not mean that choruses and other divs should not be in the XML: we simply do not use them in the final XPath, /TEI/TEXT/BODY/DIV//L[@n="$1"] :)

annettegessner commented 8 years ago

Cool, so this means we can leave the files as they are, as long as the lines are there, and we just have to change the XPath, right? B-)

annettegessner commented 8 years ago

So the refsDecl for all drama should look like this:

<cRefPattern n="line" matchPattern="(.+)" replacementPattern="#xpath(/tei:TEI/tei:text/tei:body/tei:div//tei:l[@n='$1'])"/>

FYI @planatheisa (EDITED according to Thibault's next comment)
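
To illustrate why the existing divs can stay, here is a minimal sketch of a drama fragment that this pattern would resolve against. The speaker names, the text, and the `subtype="chorus"` value are placeholders assumed for illustration, not taken from any file in the repository, and the TEI namespace declaration on the root element is omitted.

```xml
<!-- Hypothetical fragment: names, text, and subtype value are placeholders. -->
<body>
  <div type="edition">
    <sp>
      <speaker>SPEAKER A</speaker>
      <l n="1">TEXT</l>
      <l n="2">TEXT</l>
    </sp>
    <!-- A nested div (e.g. for a chorus) does not affect the citation scheme. -->
    <div type="textpart" subtype="chorus">
      <sp>
        <speaker>CHORUS</speaker>
        <l n="3">TEXT</l>
      </sp>
    </div>
  </div>
</body>
```

Because the XPath uses `//tei:l`, a reference to line 3 should find the `<l n="3">` element regardless of how many divs or `<sp>` elements sit between the top-level div and the line.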

PonteIneptique commented 8 years ago

And maybe you can drop @type="edition" ? :)

PonteIneptique commented 8 years ago

By the way @annettegessner, did you know that putting ```xml on the line above a snippet such as

`<a b="1">`

and ``` on the line below it makes GitHub format it as XML?

<a b="1">

annettegessner commented 8 years ago

Ooops! I am trying this for those two files right now and if it works, I'll do that for ALL our files, opening a new issue.

annettegessner commented 8 years ago

Niiice, thanks for the tip with ```xml!

jduff-chs commented 8 years ago

This may be relevant: there are a handful of instances in drama where two characters speak on the same line. There are two in the Seven Against Thebes, for instance. Because we're nesting every line within a <sp> tag, this needs some workaround. I've chosen to subdivide the line into alphabetical divisions, which Hook and Nemo are okay with. See the example in the screenshots, where I split line 980 so it can have portions spoken by Antigone and Ismene. Is this a suitable practice to solve this problem?

[Two screenshots, dated 2016-06-10]
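
A minimal sketch of the a/b split described above, with placeholder text. The speaker order and the exact markup are assumptions for illustration, not a copy of the actual file.

```xml
<!-- Hypothetical reconstruction: line 980 split into 980a and 980b. -->
<sp>
  <speaker>ΑΝΤΙΓΟΝΗ</speaker>
  <l n="980a">TEXT</l>
</sp>
<sp>
  <speaker>ΙΣΜΗΝΗ</speaker>
  <l n="980b">TEXT</l>
</sp>
```

With a line-level XPath of the form `//tei:l[@n='$1']`, each half would be citable on its own as 980a or 980b, and no two `<l>` elements would share an `@n` value.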

annettegessner commented 8 years ago

About duplicate line numbering: if two lines wrongly have the same line number, merge them into a single <l> and add a line break (<lb/>), e.g. replace

<l n="236">δην, στόματός τε καλλιπρῴ-</l>
<l n="236">ρου φυλακᾷ κατασχεῖν</l>

with

<l n="236">δην, στόματός τε καλλιπρῴ-
  <lb/>ρου φυλακᾷ κατασχεῖν</l>

This way of fixing issues like this has been approved by Thibault.

annettegessner commented 8 years ago

About two characters speaking on the same line: I like @jduff-chs's solution of adding a and b to the line numbers. If the others are alright with this, I would keep it this way.

sonofmun commented 8 years ago

This solution is CTS-compliant. The question is whether people will be able to find lines 980a and 980b when trying to cite them using CTS. I think I might lean more towards keeping them in a single <l> tag, something like this:

<l n="980">
<sp>
    <speaker>NAME</speaker>
    TEXT
</sp>
<sp>
    <speaker>NAME</speaker>
    TEXT
</sp>
</l>

That may not validate because <sp> is not allowed under <l> or because there can be no TEXT directly under an <sp> tag. The second would be easy to fix by putting the TEXT inside another tag, e.g., <seg>, <p>, or even <l>. The first would be more difficult to solve. What problems am I not seeing with my solution?
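
One TEI-native possibility, offered only as an untested sketch: keep each `<sp>` outside the line and mark the two portions with the TEI `part` attribute (`I` for the initial part, `F` for the final part). Speaker names and text are placeholders. Note the assumption that sharing `n="980"` is acceptable: the line-level XPath would then return two `<l>` elements for one reference, which may well run into the same duplicate-line check discussed elsewhere in this thread.

```xml
<!-- Hypothetical alternative using @part; not validated against HookTest. -->
<sp>
  <speaker>NAME</speaker>
  <l n="980" part="I">TEXT</l>
</sp>
<sp>
  <speaker>NAME</speaker>
  <l n="980" part="F">TEXT</l>
</sp>
```

The trade-off is that the citation stays a plain 980, at the cost of one reference pointing at two elements.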

jduff-chs commented 8 years ago

@sonofmun I believe the first issue may be true. Based on the way the structure is defined, it was my understanding that all <l> tags must be underneath an <sp> tag, and can't contain any <sp> tags. I'll test this now just to see what happens.

jduff-chs commented 8 years ago

It is indeed an issue with the way <sp> and <l> are defined in the TEI schema: switching the tags in this way causes a validation error. I do agree, though, that perhaps there is some better way to mark these so that they are more accessible and intuitive. It's also worth noting that, as Lenny pointed out, line names like 980a are typically reserved for specific editorial uses, this not being one of them.

jduff-chs commented 8 years ago

A note as discussed in PR #271: in some cases, two sections of a drama will be numbered separately (in this instance, the Prologus and the main body of the play). To avoid duplicate line numbers and a failure in HookTest, separate these two sections into a layer of divs below the edition div, then use a refsDecl as follows:

<refsDecl n="CTS">
  <cRefPattern n="line" matchPattern="(.+)\.(.+)" replacementPattern="#xpath(/tei:TEI/tei:text/tei:body/tei:div/tei:div[@n='$1']//tei:l[@n='$2'])"/>
  <cRefPattern n="section" matchPattern="(.+)" replacementPattern="#xpath(/tei:TEI/tei:text/tei:body/tei:div/tei:div[@n='$1'])"/>
</refsDecl>
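
A sketch of the body structure this two-level scheme assumes. The `n` values (`prologus`, `1`) and the text are placeholders chosen for illustration. The `type="textpart"` and `subtype="section"` attributes follow the convention used elsewhere in this thread.

```xml
<!-- Hypothetical fragment: two separately numbered sections below the edition div. -->
<div type="edition">
  <div type="textpart" subtype="section" n="prologus">
    <l n="1">TEXT</l>
    <l n="2">TEXT</l>
  </div>
  <div type="textpart" subtype="section" n="1">
    <!-- Line numbering restarts here without clashing with the prologus. -->
    <l n="1">TEXT</l>
    <l n="2">TEXT</l>
  </div>
</div>
```

Under this layout a two-part reference such as `prologus.1` is meant to select the section div first and then the line inside it, so line 1 of the prologus and line 1 of the main body remain distinct citations.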
annettegessner commented 7 years ago

Note: In cases like Lucian Vol. 2, we decided on the following structure:

<div type="textpart" subtype="section" n="1">
  <sp>
    <speaker>ΧΑΡΩΝ</speaker>
    <p>Εἶεν, ὦ Κλωθοῖ, το […] </p>
  </sp>
</div>

@sonofmun wrote: "I think we need an <sp> tag to show it is a speech. Then we need the <speaker> tag to show who is speaking. And then we need a <p> tag to contain the actual text of the speech."

lcerrato commented 7 years ago

FYI, we had a discussion here: https://github.com/PerseusDL/canonical-greekLit/pull/129. It may or may not be applicable, and I am not sure we ever got around to documenting this as thoroughly as needed.