Open JonathanReeve opened 6 years ago
This makes sense to me. I can imagine a few different visualizations. Where it gets more interesting is deciding what such ambiguity means for a statistical analysis of speakers and their speech. (I suppose, too, that Joyce’s male characters are too poorly distinguished for such analysis to resolve the ambiguity of who is speaking?)
btw I labelled the xml:id with the line number followed by the word unclear (in case we end up using IDs for other purposes). Happy to revert if that's cumbersome.
<lb n="060004"/><said xml:id="060004_unclear" who="mc">―Come on, Simon.
<certainty target="#060004_unclear" match="@who" […]
How do we attribute dialogue in an exchange between several people? There’s a spot like this in Hades where no speakers are given for several lines of dialogue:
<lb n="060114"/><said who="lb">―I met M'Coy this morning,</said> Mr Bloom said. <said who="lb">He said he'd try to come.</said></p>
<p><lb n="060115"/>The carriage halted short.
<lb n="060116"/><said who="unclear">―What's wrong?</said>
<lb n="060117"/><said who="unclear">―We're stopped.</said>
<lb n="060118"/><said who="unclear">―Where are we?</said></p>
<p><lb n="060119"/>Mr Bloom put his head out of the window.
<lb n="060120"/><said who="lb">―The grand canal,</said> he said.</p>
The unclears can only be Cunningham, Power or Simon Dedalus (with Bloom, perhaps, chiming in at U 6.117). How best would that be encoded?
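To make the effect on speaker statistics concrete, here is a minimal Python sketch that tallies `<said>` elements by their `who` value for the passage quoted above. The wrapping `<div>`, the missing TEI namespace and the variable names are assumptions made so the fragment parses on its own; this is not the project's actual pipeline.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# The Hades passage from this thread, wrapped in a <div> (an assumption)
# so that it is a single well-formed fragment.
fragment = """<div>
<p><lb n="060114"/><said who="lb">―I met M'Coy this morning,</said> Mr Bloom said. <said who="lb">He said he'd try to come.</said></p>
<p><lb n="060115"/>The carriage halted short.
<lb n="060116"/><said who="unclear">―What's wrong?</said>
<lb n="060117"/><said who="unclear">―We're stopped.</said>
<lb n="060118"/><said who="unclear">―Where are we?</said></p>
<p><lb n="060119"/>Mr Bloom put his head out of the window.
<lb n="060120"/><said who="lb">―The grand canal,</said> he said.</p>
</div>"""

root = ET.fromstring(fragment)

# Count utterances per speaker code; "unclear" is kept as its own bucket
# rather than being guessed at or silently dropped.
counts = Counter(said.get("who") for said in root.iter("said"))

# Share of utterances in this passage that have no resolved speaker.
unclear_share = counts["unclear"] / sum(counts.values())

print(counts)
print(f"unclear share: {unclear_share:.2f}")
```

Keeping `unclear` as a separate bucket at least makes the size of the problem visible: any per-speaker statistic for this passage has to say what it does with that share.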
On second thoughts, moving the question of encoding an unattributed exchange to #19.
We're encoding certainty like this:
...but this obviously shouldn't appear in the transformed HTML output.
Ideally, when displaying the dialogue tags for an uncertain attribution, we'd use a dotted line or something similar to indicate the fuzziness.
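One possible shape for that transform, as a rough Python sketch: drop every `<certainty>` element from the output, but remember which `xml:id` each one targeted so the matching `<said>` can carry an extra class. The function names, the `uncertain` class and the `data-who` attribute are all hypothetical. The fragment closes the `<said>` and leaves out any further `<certainty>` attributes so that it parses; the real encoding may differ.

```python
import xml.etree.ElementTree as ET

# ElementTree expands the xml: prefix to this namespace URI.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Simplified fragment (assumption): only @target and @match are shown.
fragment = """<p>
<lb n="060004"/><said xml:id="060004_unclear" who="mc">―Come on, Simon.</said>
<certainty target="#060004_unclear" match="@who"/>
</p>"""


def strip_certainty(root):
    """Remove <certainty> elements and return the xml:ids they pointed at."""
    flagged = set()
    # Materialise the iterator first so removal does not disturb traversal.
    for parent in list(root.iter()):
        for child in list(parent):
            if child.tag == "certainty":
                flagged.add(child.get("target", "").lstrip("#"))
                # Any whitespace following the element is discarded here.
                parent.remove(child)
    return flagged


def said_to_span(said, flagged):
    """Build an HTML <span> for a <said>, marking uncertain attributions."""
    classes = ["said"]
    if said.get(XML_ID) in flagged:
        classes.append("uncertain")
    span = ET.Element(
        "span",
        {"class": " ".join(classes), "data-who": said.get("who", "")},
    )
    span.text = said.text
    return span


root = ET.fromstring(fragment)
flagged = strip_certainty(root)
span = said_to_span(root.find("said"), flagged)
html = ET.tostring(span, encoding="unicode")
print(html)
```

A stylesheet rule such as `.uncertain { border-bottom: 1px dotted; }` could then draw the dotted line, while the `<certainty>` markup itself never reaches the page.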