chin-rcip / collections-model

Linked Open Data Development at the Canadian Heritage Information Network - Développement en données ouvertes et liées au Réseau canadien d'information sur le patrimoine
Creative Commons Zero v1.0 Universal

Is an appellation unique to its bearer? #25

Open stephenhart8 opened 4 years ago

stephenhart8 commented 4 years ago

There are two ways of understanding the appellations of actors:

  1. On the one hand, we could see the name of someone as unique to that person
  2. On the other hand, this name can be shared between homonyms, and is therefore not unique.

To illustrate, take two homonyms: Person A, called John Doe and born in Montreal, and Person B, also called John Doe and born in Ottawa. In the first conceptualization, each person's appellation is unique to him, so each John Doe appellation has its own URI (mic.ca/uri/appellation/1234 and mic.ca/uri/appellation/6573). In the second conceptualization, because Persons A and B have the same first and last name, they share a single appellation with a single URI (mic.ca/uri/appellation/john_doe).

CiC_Issue25-example
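
To make the two conceptualizations concrete, here is a minimal sketch in Python that uses plain tuples as triples. The actor URIs (`mic.ca/uri/actor/A` and `mic.ca/uri/actor/B`) and the helper `bearers_of` are hypothetical and only serve the illustration; the appellation URIs are the ones from the example above, and `P1_is_identified_by` is the CIDOC CRM property that links an entity to its appellation.

```python
# Triples are (subject, predicate, object) tuples.
# The actor URIs are hypothetical; the appellation URIs come from the example.
P1 = "crm:P1_is_identified_by"

# Option 1: one appellation node per bearer.
option_1 = {
    ("mic.ca/uri/actor/A", P1, "mic.ca/uri/appellation/1234"),
    ("mic.ca/uri/actor/B", P1, "mic.ca/uri/appellation/6573"),
}

# Option 2: homonyms share a single appellation node.
option_2 = {
    ("mic.ca/uri/actor/A", P1, "mic.ca/uri/appellation/john_doe"),
    ("mic.ca/uri/actor/B", P1, "mic.ca/uri/appellation/john_doe"),
}


def bearers_of(triples, appellation):
    """Return the sorted actors identified by the given appellation node."""
    return sorted(s for s, p, o in triples if p == P1 and o == appellation)


print(bearers_of(option_1, "mic.ca/uri/appellation/1234"))
print(bearers_of(option_2, "mic.ca/uri/appellation/john_doe"))
```

With option 1, an appellation node leads back to exactly one actor; with option 2, the same lookup returns every homonym.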

What are the benefits of linking people by their name? I'm afraid it would create meaningless links. And what are the benefits of not linking people by their name?

KarineLeonardBrouillet commented 4 years ago

I think from a social standpoint we do share names. For example, even though parents individually name their children, first names have trends and are linked to socio-historical elements such as celebrities or rulers naming their own children, beloved characters in works of fiction being used as inspiration, etc. As such, humans are assigning names en masse in a way and this act is closely tied to their place in society.

Cognitively speaking, we use names uniquely I believe. We might know many Johns, but if we do not specify which one we are talking about what comes to mind is an aggregation of all the Johns we know, not a class of Johns with the single characteristic of sharing a name.

So, generally speaking, both conceptualizations are accurate, with the individualized URI being closer to cognitive reality and the shared one closer to social reality. However, I am not sure what the implications of each pattern would be from an efficiency point of view.

All this is to give the background to my view: if linking people by their name is mostly a social evaluation of trends and patterns, it seems more statistical in nature than semantic. Moreover, I believe the semantics more closely resemble cognition than social patterns (from a functional standpoint). As such, and I have very limited knowledge on this, it would seem to me that using individualized URIs would be closer to how we function in everyday life. In other words, the absence of a link between two Johns illustrates that they do not put forth an "intentional sharing" of their name but an incidental one.

stephenhart8 commented 4 years ago

After discussing with Ludovic, we realized that the E41 Appellation of an Actor receives a specific E55 Type, either preferred or non-preferred. Therefore, this E41 Appellation has to be specific to one actor, as the same name could be preferred for Person A and non-preferred for Person B. If I'm not mistaken, that means that we have to go with option 1.

Habennin commented 4 years ago

Interesting discussion. The correct pattern is 2. There would be no point in generating an appellation node if you did not say that one name was one name. The point of an appellation node is to identify the name in itself, independent of the entity it names. If you land on a name appellation node/record, then you want to know the things that it names. You would lose that functionality by making my 'George Bruseker' different from that of the other 'George Bruseker'. It is true that you sometimes put a preferred identifier type on a name. Since the data will be loaded in a named graph, you can see whose preference it is via the named graph. To be even more specific, you should even create an attribution node so that you can say who says it is preferred. If you are not interested in the name itself, then you can just put an rdfs:label on the actor/object itself and save yourself trouble (though I wouldn't recommend it).

stephenhart8 commented 4 years ago

@Habennin indeed you are quite right, having a separate appellation for each actor goes against LOD logic. I think both solutions, the named graph and the E13 Attribute Assignment, should be tested to see which one is easier to implement and gives the best answers to queries.

We would still need to be careful, as some names could have different parts. In the case of @KarineLeonardBrouillet, "Léonard Brouillet" is her last name, but another Karine Léonard Brouillet could have "Léonard" as a middle name, which means these would be two separate appellations.

Habennin commented 4 years ago

I think you'll find that in a lot of data about names and identifiers, your source data specifically documents who said it. In that case it is a case of 'quoted speech', so you would most easily use E13.
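
As a rough illustration of the attribution idea, the sketch below records each preference statement as an E13 Attribute Assignment node, using the CIDOC CRM properties `P140_assigned_attribute_to`, `P141_assigned` and `P14_carried_out_by`. All URIs (`assignment/1`, `institution/museum_x`, and so on) and the helper `who_says` are hypothetical, and this is only one possible reading of the suggestion, not a tested pattern.

```python
# Triples are (subject, predicate, object) tuples; every URI is hypothetical.
P140 = "crm:P140_assigned_attribute_to"
P141 = "crm:P141_assigned"
P14 = "crm:P14_carried_out_by"

triples = {
    # Museum X states that "John Doe" is a preferred name.
    ("assignment/1", P140, "appellation/john_doe"),
    ("assignment/1", P141, "type/preferred"),
    ("assignment/1", P14, "institution/museum_x"),
    # Museum Y states that "John Doe" is a non-preferred name.
    ("assignment/2", P140, "appellation/john_doe"),
    ("assignment/2", P141, "type/non-preferred"),
    ("assignment/2", P14, "institution/museum_y"),
}


def who_says(triples, appellation, assigned_type):
    """Return the sorted actors who assigned this type to this appellation."""
    assignments = {s for s, p, o in triples if p == P140 and o == appellation}
    assignments &= {s for s, p, o in triples if p == P141 and o == assigned_type}
    return sorted(o for s, p, o in triples if s in assignments and p == P14)


print(who_says(triples, "appellation/john_doe", "type/preferred"))
```

Note that this records who asserts the preference; it does not by itself say which bearer of the name the preference applies to.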

stephenhart8 commented 4 years ago

Following the discussion at the meeting of the 20th of December, we've decided to choose option 1 for the TM 2.0, where the appellation of an actor is unique to its bearer, as it seems easier to manage at first. But we will leave this discussion open and think more about this issue for the next version of the TM.

illip commented 4 years ago

I read the thread once again more carefully and I would like to clarify something. For the moment, we are keeping option 1 because it was easier to implement given our deadline. That said, I strongly recommend going with option 2 for the next version. Some reasons:

  1. We want to describe the appellation. At least, we need to state the preference and the type (e.g. "preferred" and "last name"). So the rdfs:label is clearly not enough here.

  2. In my opinion, the social reality vs cognitive reality question raised by @KarineLeonardBrouillet is really interesting, and I would probably argue that social reality "includes" the cognitive reality in a way. Even if someone uses a name to designate a unique person and is not aware of all the social constructs behind it, the social reality is still valid. So your parents can name you "John" just because they like the name, but there is still a social construct behind it, and I think it's mandatory to keep this visible in our model.

  3. For the appellation type that can be confusing, like "Léonard", I think our model is able to handle the different options. Also, we could reconcile those names with other controlled vocabularies. For instance, Wikidata offers two different URIs for "Léonard": male given name and family name.

  4. I also agree with @Habennin, we should see the graph as the "validity context" of the embedded data. So the "preferred" type is in the context of the graph (the institution) and not the whole universe. I think we should also use E13_Attribute Assignment, since it doesn't seem so easy to retrieve data at the graph level (see #38).

VladimirAlexiev commented 4 years ago

So I think option 1 is better for CIC

illip commented 4 years ago

Thanks @VladimirAlexiev,

For the moment, our model allows an appellation to be broken down into different parts. However, the part type is managed using a vocabulary. Thank you for the advice about first/last name, we will take care to recommend a proper set of terms.

For the moment we will go with option 1. I don't think we are going against the scope note of E41_Appellation by doing so. However, since @Habennin has mentioned that we should go with option 2, this is something we will explore in the next version.

VladimirAlexiev commented 4 years ago

I also think option 1 doesn't make E41_Appellation useless and its use pointless. You often need to record title details (eg official vs nickname, first vs last name, preferred vs non-preferred name) and you need a node for that.

I believe E13 and Named graphs are a distraction in this case because often preferred/non-preferred is "universal knowledge" and you don't know or need to record who expressed that preference. And just because a museum recorded it, doesn't necessarily mean they prefer it. I believe this is unlike AAT label props contributor, contributorPreferred, contributorNonPreferred etc because those are used for concepts having many terms, and often the preference of one or another term is a matter of historic tradition or "personal preference". I believe it's not so about person names.

To record more details about an Appellation, you can use frbroo's Name Use class.

stephenhart8 commented 3 years ago

Rereading this issue prior to the Semantic Committee meeting, it seems that the shared appellation is the most semantic option. But two issues seem important:

What to do with the Preference Type?

As stated above, some appellations are preferred for some actors, and some are non-preferred. In the TM 2.0, the preference is linked directly to the appellation.

If appellations are shared by actors, that means that some appellations would have multiple preference types. How do we differentiate which actor each preference type applies to?

Let's take an example to illustrate. Actor number 64862 is called John Doe. This appellation is therefore the preferred appellation for that person. Actor number 11603 is called John William Doe, but can sometimes be referred to as John Doe. Therefore, the appellation John Doe is a non-preferred one for that other person.

In this example, the appellation John Doe will have two E55 Types linked to it: one preferred, the other non-preferred.

The question is: how do we specify that the Preferred Type is for Person 64862 and the Non-Preferred Type is for Person 11603?
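
The ambiguity can be reproduced with a small sketch in Python, using plain tuples as triples. The short URIs (`actor/64862`, `appellation/john_doe`, `type/preferred`, and so on) and the helper `preference_types` are hypothetical; `P1_is_identified_by` and `P2_has_type` are the CIDOC CRM properties, with the preference attached directly to the shared appellation.

```python
# Triples are (subject, predicate, object) tuples; every URI is hypothetical.
P1 = "crm:P1_is_identified_by"
P2 = "crm:P2_has_type"

triples = {
    ("actor/64862", P1, "appellation/john_doe"),
    ("actor/11603", P1, "appellation/john_doe"),
    ("actor/11603", P1, "appellation/john_william_doe"),
    # The shared appellation carries both preference types.
    ("appellation/john_doe", P2, "type/preferred"),
    ("appellation/john_doe", P2, "type/non-preferred"),
}


def preference_types(triples, actor):
    """Map each appellation of the actor to the sorted types found on it."""
    names = {o for s, p, o in triples if s == actor and p == P1}
    return {
        name: sorted(o for s, p, o in triples if s == name and p == P2)
        for name in names
    }


print(preference_types(triples, "actor/64862"))
```

Both actors get the same two types back for John Doe, so nothing in the graph says which preference belongs to whom.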

I see two solutions for that:

Another solution, of course, would be to NOT share the appellations between actors.

Link the Name Use Activity to the Actor

In the TM 2.0, the F52 Name Use Activity is linked to the appellation with the property R64 used name, but there is no link between the F52 Name Use Activity and the E39 Actor, as you can see here: 020_Pattern_IdentifiersAppellations_p

If we use shared appellations between actors, nothing links the E39 Actor to its F52 Name Use Activity (which is not shared). It seems that the property R63 named is necessary between the two entities. The pattern would therefore be the following: Issue_GitHub_Appellation-Page-1
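
A minimal sketch of this idea in Python, with triples as plain tuples: each actor gets its own F52 Name Use Activity, linked to the actor with R63 named and to the shared appellation with R64 used name. Attaching the preference type to the Name Use Activity with `P2_has_type` is an assumption made for this illustration and is not necessarily what the pattern diagram shows; all URIs and the helper `preference` are hypothetical.

```python
# Triples are (subject, predicate, object) tuples; every URI is hypothetical.
R63 = "frbroo:R63_named"
R64 = "frbroo:R64_used_name"
P2 = "crm:P2_has_type"

triples = {
    # Name use for actor 64862: "John Doe" is preferred.
    ("name_use/1", R63, "actor/64862"),
    ("name_use/1", R64, "appellation/john_doe"),
    ("name_use/1", P2, "type/preferred"),  # assumption: type sits on the name use
    # Name use for actor 11603: the same appellation is non-preferred.
    ("name_use/2", R63, "actor/11603"),
    ("name_use/2", R64, "appellation/john_doe"),
    ("name_use/2", P2, "type/non-preferred"),
}


def preference(triples, actor, appellation):
    """Return the sorted types of the name uses joining actor and appellation."""
    uses = {s for s, p, o in triples if p == R63 and o == actor}
    uses &= {s for s, p, o in triples if p == R64 and o == appellation}
    return sorted(o for s, p, o in triples if s in uses and p == P2)


print(preference(triples, "actor/64862", "appellation/john_doe"))
print(preference(triples, "actor/11603", "appellation/john_doe"))
```

Because the Name Use Activity is not shared, the appellation node can stay shared while each actor keeps its own preference.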

KarineLeonardBrouillet commented 3 years ago

After discussing this with members of CHIN's semantic committee, we have come to the conclusion that there is no immediate or foreseeable use for a "shared appellation" approach. That said, it might become relevant in the future for certain disciplines such as genealogy. For the moment, CHIN will continue implementing the "individual appellation" approach, although this position might be reassessed if relevant use cases establish the need to do so.

VladimirAlexiev commented 3 years ago

@KarineLeonardBrouillet glad you took that decision!