dracor-org / dracor-schema

ODD and schemas for dracor.org files

https://dracor.org/doc/odd

relate a character to a group of characters #42

Open ingoboerner opened 1 year ago

ingoboerner commented 1 year ago

@lucagiovannini7 (via @peertrilcke) would like to capture the information that a single character (<person>) is actually part of a group (<personGrp>). This could be modelled as a <relation> as well, e.g.

<relation name="isPartOf" active="{characterUri}" passive="{groupUri}"/>

@lucagiovannini7 please provide an example from EPDraCor where this makes sense.

We currently use the element <relation> in <listRelation> at two positions:

Family/Social relations are inside the <particDesc> in the <teiHeader>, see https://github.com/dracor-org/gerdracor/blob/main/tei/lessing-emilia-galotti.xml#L90-L94
We also use <relation> for linking the Work entity to Wikidata, see https://github.com/dracor-org/gerdracor/blob/main/tei/lessing-emilia-galotti.xml#L122-L125

It would also be an option to move all the relations to the <standOff> container, to make it clearer that they are more interpretative annotations and somewhat external to the text. We should then have separate <listRelation> lists for the relation types, e.g. "ExternalReferenceRessources", "SocialRelations", "Parthood", ...
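As a rough illustration of how typed <listRelation> lists in <standOff> could be read back, here is a minimal Python sketch. The TEI fragment, the ids (`#voice_1`, `#voices`, ...), the relation name `parent_of` and the function name `relations_by_type` are made up for this example and are not taken from any DraCor file; only the `type` values and `isPartOf` follow the proposal above.

```python
import xml.etree.ElementTree as ET

TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Hypothetical fragment: ids and relation names are illustrative only.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <standOff>
    <listRelation type="SocialRelations">
      <relation name="parent_of" active="#odoardo" passive="#emilia"/>
    </listRelation>
    <listRelation type="Parthood">
      <relation name="isPartOf" active="#voice_1" passive="#voices"/>
      <relation name="isPartOf" active="#voice_2" passive="#voices"/>
    </listRelation>
  </standOff>
</TEI>"""


def relations_by_type(tei_xml):
    """Group (name, active, passive) triples by the @type of their list."""
    root = ET.fromstring(tei_xml)
    grouped = {}
    for lst in root.findall(".//tei:standOff/tei:listRelation", TEI_NS):
        triples = [
            (rel.get("name"), rel.get("active"), rel.get("passive"))
            for rel in lst.findall("tei:relation", TEI_NS)
        ]
        # Lists without @type are collected under a fallback key.
        grouped.setdefault(lst.get("type", "untyped"), []).extend(triples)
    return grouped


rels = relations_by_type(SAMPLE)
for triple in rels["Parthood"]:
    print(triple)
```

Keeping one list per relation type means a consumer can pick up, say, only the "Parthood" relations without inspecting every relation name.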

lucagiovannini7 commented 1 year ago

I don't have examples from EPDraCor (I've just been working on speaker disambiguation so far), but the issue sometimes comes up in my annotation of other Baroque plays, and I believe it's the same for @DanilSko in UDraCor. Which benefits would this extra annotation in standOff entail?

Actually, the problem for us was not "how to link characters back to a group", but rather the fact that some characters barely qualify for a who-tag and therefore for an encoding as autonomous entities. My example would be offstage voices uttering single lines in different acts, sometimes with slightly different speaker names, which eventually end up being tagged as voice_1, voice_2, voice_inside_1, voices_1, etc., even though one could probably consider them (parts of) the same entity, or see them as a by-product of some composition norms for drama in that period. @DanilSko will provide some better examples from UDraCor.

I would restrict this (still speculative) argument to my Baroque corpus, though, since things are surely different for other ages and authors. As discussed yesterday with @peertrilcke, it is more of a (tricky) methodological issue, shaped by critical reading of the plays and understanding of their production context, than an annotation one.

In sum: it seems there's no other way than to keep encoding everyone/everything which cannot be linked back to another who-tag as a new entity (and perhaps exclude them from later analyses if needed).

DanilSko commented 1 year ago

Yes, I would second Luca's point: the main issue is not the group relation, but those 'speaker' tags behind which there is actually no substantial character. Example 1 (from pushkin-boris-godunov):

Один [One]: Will the tsar come out of the cathedral soon?
Другой [Another]: The mass is over; now the prayer service is under way.
<...>
Третий [A third]: Hark! A noise. Is it the tsar?
Четвертый [A fourth]: No, it is the holy fool.
<...>
Один из них [One of them]: Greetings, holy fool; why don't you take off your cap? (Flicks his iron cap.) Hear how it rings!

This Один из них ('One of them') is clearly not a new character, but one (and we do not know which) of the previous four, and yet it is a separate network node.

Example 2 (from UDraCor):

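Годований [The well-fed one]: And why, may I ask, is it standing?
Дід з ціпком [Old man with a stick]: Ask him!
Хтось із людей [One of the people]: They are taking away the chalice, the church's sacred relic!.. The gilding, the silver...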

Хтось із людей means 'One of the people'. UDraCor is full of cases like this. And yes, sometimes it is 'Some of them', which adds the grouping complexity to it, but the main point is not about grouping: it is about voices which formally have their own speaker tags but are not separate characters in the fictional realm.

peertrilcke commented 1 year ago

Ok, your example @DanilSko is not about grouping, but about a "group - individual" relation. This could be modelled the way @ingoboerner suggested.

The question of "what is a character" as raised by @lucagiovannini7 is in my opinion (we talked about that yesterday) nothing that has to be answered/solved while modelling a play in TEI. If there is an "entity" speaking that cannot be identified with another speaking "entity", then it has to be modelled as an individual "entity". Note that not every "entity" has to be a character in an anthropological sense.

DanilSko commented 1 year ago

Thanks, Peer! Yes, I guess it could be modelled that way (so thanks @ingoboerner for the hint). But there are two things I do not yet understand:

Do I have to create some artificial group entity then? To link 'One of them' to this group?
Would that modelling by attaching them to a group also affect the network model (which I think is the main abstraction we use for analysis at the moment)?

This whole discussion basically originates from our (mine and @lucagiovannini7's) feeling that the current network models are sometimes affected too strongly by these non-characters. So for me a possible solution would be to be able to press some switch and somehow exclude such non-characters from the networks of character interaction if needed. Basically, it is a step towards having a less solid and more fuzzy & probabilistic representation of the character network in a play, in which several configurations are available depending on the conceptualization one chooses. A kind of 'quantum dracor' in superposition 😉 which can be resolved according to the researcher's will.

cmil commented 1 year ago

How about marking those elements in the particDesc that refer to somewhat ambiguous or makeshift characters with some sort of type attribute that allows them to be filtered out at some point when doing network analysis? Operationally that might be easier than modelling relationships to groups which in some cases may not be easy to define clearly.
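A minimal Python sketch of what such filtering could look like. Which attribute would carry the marker is left open above; purely as an assumption, this example uses the value `#makeshift` in the global `ana` attribute and makes both attribute name and value parameters. The TEI fragment, the ids and the function name `split_cast` are hypothetical.

```python
import xml.etree.ElementTree as ET

TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Hypothetical cast list; the ana="#makeshift" marker is an assumption,
# not an existing DraCor convention.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader><profileDesc><particDesc><listPerson>
    <person xml:id="boris"><persName>Boris</persName></person>
    <person xml:id="yurodivy"><persName>Holy fool</persName></person>
    <person xml:id="one_of_them" ana="#makeshift">
      <persName>One of them</persName>
    </person>
  </listPerson></particDesc></profileDesc></teiHeader>
</TEI>"""


def split_cast(tei_xml, marker_attr="ana", marker_value="#makeshift"):
    """Return (kept_ids, filtered_ids) for the entries of <listPerson>."""
    root = ET.fromstring(tei_xml)
    kept, filtered = [], []
    for el in root.findall(".//tei:particDesc/tei:listPerson/*", TEI_NS):
        # The attribute may hold several whitespace-separated pointers.
        values = el.get(marker_attr, "").split()
        target = filtered if marker_value in values else kept
        target.append(el.get(XML_ID))
    return kept, filtered


kept, filtered = split_cast(SAMPLE)
print("kept:", kept)
print("filtered:", filtered)
```

The ids in `filtered` could then be dropped from the node list before a network is built, while the TEI itself keeps every speaker.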

DanilSko commented 1 year ago

How about marking those elements in the particDesc that refer to somewhat ambiguous or makeshift characters with some sort of type attribute that allows them to be filtered out

Yes, I'm very much in favour of this solution. I had something similar in mind. We could mark these 'voice-only' characters somehow, just like we can now identify the mythical creatures in GreekDraCor after they were linked to their Wikidata IDs in the ana attribute.

peertrilcke commented 1 year ago

I must confess, in some cases I don't see the problem.

Do I have to create some artificial group entity then?

Please don't do that.

Either you have a group performing a speech act, AND you have in addition "One of them" performing a speech act. Then you can mark both "Group" and "One of them" as speakers and you could store a relation in the xenoData.

Or you have only one speaker called "One of them". Then you mark only one speaker "one of them". And that's it. I see no reason to create an artificial group or something similar. That would be an inadmissible mixing of work on the structuring of text documents and a semantic interpretation of a fictional world from my point of view.

And if you want to add an additional typification of characters (or better say: speaking/appearing entities), do this by adding this information in an external file (your own annotation file) or at least by putting this information in xenoData. Please don't add it to particDesc.

And: please don't approach this too much from the perspective of network visualization, and certainly not from the network visualization that appears in the current DraCor frontend. There are very different ways to model networks from the structural data in the TEIs and even more ways to visualize them on this basis. You definitely cannot take the current visualization, which is one of very many possibilities, as a basis for making decisions about working on the XML.

So for me a possible solution would be to be able to press some switch and somehow exclude such non-characters from the networks of character interaction if needed. Basically, it is a step towards having a less solid and more fuzzy & probabilistic representation of the character network in a play, in which several configurations are available depending on the conceptualization one chooses. A kind of 'quantum dracor' in superposition 😉 which can be resolved according to the researcher's will.

This is definitely a cool idea. I wouldn't implement it in the frontend, but maybe we'll make a notebook for it (as a kind of microservice). After Henny has implemented the API wrapper at the beginning of next year, we can think about whether this will be a next project.
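To make the "switch" idea a bit more concrete, here is a small Python sketch of the kind of thing such a notebook could do. It assumes a simple co-occurrence model (two speakers are linked if they speak in the same scene) and an external annotation, kept outside the TEI, that lists the ids to exclude. The scene data, the ids and the function name `cooccurrence_edges` are hypothetical.

```python
from itertools import combinations

# Hypothetical speaker ids per scene, as they might be collected from @who.
SCENES = [
    ["one", "another", "third", "fourth"],
    ["fourth", "yurodivy", "one_of_them"],
    ["yurodivy", "boris"],
]

# Hypothetical external annotation: ids a researcher chooses to exclude.
NON_CHARACTERS = {"one_of_them"}


def cooccurrence_edges(scenes, exclude=frozenset()):
    """Build undirected edges between speakers who share a scene."""
    edges = set()
    for speakers in scenes:
        # Sorting makes each undirected edge a canonical (a, b) pair.
        present = sorted(set(speakers) - set(exclude))
        edges.update(combinations(present, 2))
    return edges


full = cooccurrence_edges(SCENES)
reduced = cooccurrence_edges(SCENES, exclude=NON_CHARACTERS)
print(len(full), "edges with all speakers")
print(len(reduced), "edges without the excluded ids")
```

Because the exclusion list is just a parameter, the same TEI data can yield several network configurations without any change to the XML.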