open-editions / corpus-joyce-ulysses-tei

James Joyce's novel Ulysses in TEI XML. Work-in-progress.

<listPerson>ing the speakers #25

Closed: yellwork closed this issue 7 years ago

yellwork commented 7 years ago

I’m reading through the guidelines for <listPerson> and its associated elements and trying to figure out how best to capture the speakers of Ulysses. (Two questions are left unaddressed here: whether we want to include characters who are named in the novel but do not speak and, in turn, whether we plan to encode every character mention in the novel.)

The generic speaker is relatively straightforward:

<person xml:id="db" sex="M" age="mid">
<persName>Davy Byrne</persName>
</person>

But what other information do we want to include? Do we disambiguate the <persName>?

<person xml:id="db" sex="M" age="mid">
  <persName>
    <surname>Byrne</surname>
    <forename>Davy</forename>
  </persName>
  <occupation>publican</occupation>
</person>

Do we want to draw on the impressive scholarship on historical people who appear in Ulysses (like the recent The Real People of Joyce’s “Ulysses”: A Biographical Guide by Vivien Igoe [2016]) and specify that Byrne was born about 1860, for example? Do we record that “Davy” is a nickname for “David”?

<forename type="given">David</forename>
<addName type="nick">Davy</addName>
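To make the questions above concrete, here is a rough sketch of what a fuller entry might look like. The "igoe2016" identifier, the choice of @cert="medium" to flag the approximate date, and the placement of the <bibl> inside <person> are all my assumptions, not something we have agreed on:

```xml
<!-- Sketch only: the igoe2016 identifier and the cert value are assumptions. -->
<person xml:id="db" sex="M" age="mid">
  <persName>
    <forename type="given">David</forename>
    <addName type="nick">Davy</addName>
    <surname>Byrne</surname>
  </persName>
  <!-- "about 1860": @when carries the year, @cert marks it as uncertain -->
  <birth when="1860" cert="medium" source="#igoe2016">about 1860</birth>
  <occupation>publican</occupation>
  <!-- The biographical source named in this thread -->
  <bibl xml:id="igoe2016">Vivien Igoe, <title level="m">The Real People of Joyce's "Ulysses": A Biographical Guide</title> (<date when="2016">2016</date>)</bibl>
</person>
```

If this level of detail is too much for one-line speakers, the <birth> and <bibl> could be dropped and the entry falls back to the simpler form.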

We could also record, in a <listRelation>, the relationship between Davy Byrne and his curate (@xml:id="dbc"):

<listRelation type="social">
  <relation name="employer" active="#db" passive="#dbc"/>
</listRelation>
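For the #db and #dbc pointers to resolve, both people need their own <person> entries. A minimal sketch of how the pieces might sit together, assuming we keep the relations in the same <listPerson>; the type="speakers" value and the wording of the curate's <persName> are placeholders of mine:

```xml
<!-- Sketch only: type="speakers" and the curate's persName are placeholders. -->
<listPerson type="speakers">
  <person xml:id="db" sex="M" age="mid">
    <persName>Davy Byrne</persName>
    <occupation>publican</occupation>
  </person>
  <!-- The curate is unnamed, so a descriptive label stands in for a name -->
  <person xml:id="dbc">
    <persName>Davy Byrne's curate</persName>
  </person>
  <!-- @active points at the employer, @passive at the employee -->
  <listRelation type="social">
    <relation name="employer" active="#db" passive="#dbc"/>
  </listRelation>
</listPerson>
```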

Is it worth indicating, in the <person> description, the episodes of the novel that a given character appears in? Whether as speaker, unspeaking character, or through a mention? (How would we go about doing that?!)
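One possible answer, offered tentatively: a typed <note> inside the <person> holding one <ptr> per episode, with @type on the pointer distinguishing speaker, unspeaking character, and mention. The "ep08" identifier assumes each episode <div> carries an @xml:id of that shape, and the type values are invented for illustration:

```xml
<!-- Sketch only: the ep08 identifier and the type values are assumptions. -->
<person xml:id="db" sex="M" age="mid">
  <persName>Davy Byrne</persName>
  <note type="appearances">
    <!-- Byrne speaks in episode 8, "Lestrygonians" -->
    <ptr type="speaker" target="#ep08"/>
    <!-- further ptr elements with type="present" or type="mentioned"
         would be added here, one per episode -->
  </note>
</person>
```

If our <said> and <sp> elements already point at speakers with @who, the speaker appearances could probably be derived by query instead of being stored, which would leave only the unspeaking and mentioned cases to record by hand.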

I’m going to leave the discussion here for now, until we agree on a baseline level of detail (granularity?) for generic speakers, i.e. those who only manage a line or two. After that, I’ll write up some of the more complicated examples and edge cases.

yellwork commented 7 years ago

Between the eighteen episodes and our <said> and <sp> encodings, there are ~360 speakers in the corpus [list]. The inevitable mistakes and gaps aside, some of these might be flattened to a group speaker:

Burton diner_1
Burton diner_2
Burton diner_3
Burton diner_4
Burton diner_5
Burton diner_6

Others, like the multiple speakers labelled ‘A Voice’ in ‘Circe’, might need further disambiguation. Just FYI when it comes to thinking about our <listPerson>.
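A sketch of how both cases might look in the <listPerson>, assuming we use <personGrp> for the flattened group and separate <person> entries for each distinct ‘A Voice’. All of the @xml:id values here are hypothetical, as is the number of voices shown:

```xml
<!-- Sketch only: every xml:id below is hypothetical. -->
<listPerson>
  <!-- The six Burton diners flattened into one group speaker -->
  <personGrp xml:id="burton_diners" role="speaker" size="6">
    <persName>Burton diners</persName>
  </personGrp>
  <!-- Same label in the text, but distinct speakers with distinct ids -->
  <person xml:id="circe_voice_1">
    <persName>A Voice</persName>
  </person>
  <person xml:id="circe_voice_2">
    <persName>A Voice</persName>
  </person>
</listPerson>
```

A speech would then point at the group or the individual in the usual way, for example who="#burton_diners" on a <said>, or who="#circe_voice_2" on an <sp>.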

yellwork commented 7 years ago

Closing this issue for now. It can always be reopened once we start the <listPerson> conversation again.