FamilySearch / gedcomx

An open data model and an open serialization format for exchanging genealogical data.
http://www.gedcomx.org
Apache License 2.0

Can we have a patronymic name-part? #205

Closed nilsbrummond closed 12 years ago

nilsbrummond commented 12 years ago

Proposal:

Add a new NamePart known type of:

http://gedcomx.org/Patronymic

Support for the pre-surname era of Scandinavia and other places using patronymic names. In the Scandinavian case the "last name" takes the form "<father's name's>-son" or "<father's name's>-daughter". Therefore, unless the son has the same name as the father, the "last name" changes every generation.

References: http://en.wikipedia.org/wiki/Patronymic http://www.sorensenfamilyhistory.org/genealogy/danish_names_genealogy.htm

stoicflame commented 12 years ago

+1

jralls commented 12 years ago

+1

But see my comment in #161 about removing the lists of known Foo Types from the spec.

EssyGreen commented 12 years ago

I presume we'll also be supporting Matronymic too then pls?

nilsbrummond commented 12 years ago

> But see my comment in #161 about removing the lists of known Foo Types from the spec.

Not a bad idea, considering the potential number of name types. Example: pull request #201

Supporting all cultures could be complex...

Separate specs for type enumerations, which could move faster, could be a good thing.

nilsbrummond commented 12 years ago

> I presume we'll also be supporting Matronymic too then pls?

+1

Yup I need these too. Less common for me but I have a few.

daveyse commented 12 years ago

To extend this concept further, should we then also include patronymic-surname and matronymic-surname types to separate the surname parts for Spanish and Portuguese names?

What about maiden-surname for females that use their maiden name as their "middle" name after marriage?

From a research perspective, it would be quite helpful to identify the etymology of a name part. However, if a name part can only have one "name part type" assigned to it, I worry about the impact of the _mis_use (not abuse) of these sub-types. I also am concerned about misinterpretation of automatic mappings in and out of systems that do not preserve these sub-types.

For instance, consider Jane Weston Johnson. The name Weston could be her

That name part can be stored as a GIVEN name part in one system and a SURNAME in another system if these systems do not acknowledge the nuances of the sub-types.

Sorry I've rambled so much.

nilsbrummond commented 12 years ago

My main desire for the patronymic is to have a last name type that the software knows it does not make sense to group people by. A last name that changes every generation is just noise if put in a surname list.

To me, whether a name part is groupable or not is really what is important. There are a lot of cultural name rules and types out there. Enumerating them all is probably not feasible. Perhaps a property set could be applied to name-part types as they are defined, when needed.
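A minimal sketch of what such a property set might look like, assuming a single hypothetical `groupable` property per name-part type. The type names, the property name, and the sample people are illustrative only and are not taken from the GEDCOM X spec.

```python
from collections import defaultdict

# Hypothetical property set: each name-part type declares whether it makes
# sense to group people by it. None of these names come from the spec.
NAME_PART_TYPE_PROPERTIES = {
    "Given": {"groupable": False},
    "Surname": {"groupable": True},
    "Patronymic": {"groupable": False},
    "Matronymic": {"groupable": False},
}


def group_by_name_part(people):
    """Index people under every name part whose type is marked groupable."""
    groups = defaultdict(list)
    for display_name, parts in people:
        for part_type, value in parts:
            # Unknown types are treated as not groupable.
            props = NAME_PART_TYPE_PROPERTIES.get(part_type, {})
            if props.get("groupable", False):
                groups[value].append(display_name)
    return dict(groups)


# Illustrative (made-up) father and son: the patronymic changes each
# generation, while the inherited surname stays the same.
people = [
    ("Niels Andersen Holm",
     [("Given", "Niels"), ("Patronymic", "Andersen"), ("Surname", "Holm")]),
    ("Anders Nielsen Holm",
     [("Given", "Anders"), ("Patronymic", "Nielsen"), ("Surname", "Holm")]),
]

surname_index = group_by_name_part(people)
```

With this data only "Holm" ends up in the index; the two patronymics are skipped because their type is flagged as not groupable.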

EssyGreen commented 12 years ago

> My main desire for the patronymic is to have a last name type that the software knows it does not make sense to group people by. A last name that changes every generation is just noise if put in a surname list.

Turning this around is it not more important to identify the name parts which can be sorted by ... in other words instead of going into the intricacies of patronymic, matronymic etc etc just have a flag attribute on the name part for "include in index" or some-such?

EssyGreen commented 12 years ago

Or thinking again maybe just have a text property of the Name (not Name part) which is simply "Index As"

thomast73 commented 12 years ago

Okay. It's my turn to jump in on this thread.

First, I believe the requirement for "name parts" comes from the fact that much of the genealogical data we deal with was collected via name part fields -- e.g., a death record with "Given name(s)" and "Surname" fields, or an Ancestry.com "Add..." form with "First and Middle Name", "Last Name" and "Suffix" fields -- and that it might be best to exchange that information in the form that it was collected rather than reformatting it. The reason the data was collected in parts was likely to facilitate sorting and searching by asking the informant to identify those parts, but sorting and searching are really outside the domain of data exchange and sort/search algorithm requirements should not be principal drivers in this conversation.

As I have considered the various ideas expressed here and elsewhere, I am starting to come to an idea that is different from any I have heard or seen expressed before. As Inigo Montoya says: "Let me 'splain."

It seems to me that it is useful to designate a name part as a "surname" even if it happens to be "patronymic" in construction. It is useful to designate a name part as a "given name" even if it happens to be the "religious" part of his given name. There seems to be a type hierarchy with the root of the hierarchy being very general with a desire to add a bit more detail.

Most applications define and use the notions represented by the most general designations (i.e., given name, surname). The additional qualifiers are rarer, but potentially useful. However, it does not seem to me that these qualifiers fit best as siblings to the most general designations. I do not think adding a "Patronymic" type as sibling to the "Surname" type is an improvement. Instead, it seems that "Patronymic" further describes or qualifies the type of "Surname" and that a name part could be both a surname and a patronymic name.

All of this suggests an implementation something like the following:

NamePart

NamePartType

NamePartTypeQualifiers

I have been somewhat arbitrary in my selection of qualifiers, but wanted to give a sense of the possibilities. I think there is quite a bit of value in thinking of things in this way. Most existing data could be represented without a qualifier, but as people and algorithms became more sophisticated, these qualifications could be added and used and shared.
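To make the shape of this idea concrete, here is a minimal sketch assuming a name part carries one general type plus at most one optional qualifier. The class, field, and function names are hypothetical and do not reflect the model that was eventually merged.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class NamePart:
    """Hypothetical name part: a general type plus an optional refinement."""
    text: str
    type: str                        # general designation, e.g. "Given" or "Surname"
    qualifier: Optional[str] = None  # optional detail, e.g. "Patronymic"


def without_qualifier(part: NamePart) -> NamePart:
    """Return the same part with only its general type, as a simpler system might store it."""
    return NamePart(text=part.text, type=part.type)


qualified = NamePart("Christensen", "Surname", "Patronymic")
plain = without_qualifier(qualified)
```

An application that only understands the general types can still read `qualified` as a Surname, while a more sophisticated one can use the qualifier.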

So what do you think?

nilsbrummond commented 12 years ago

Seems a good approach. Along the lines of Dublin Core's dumb-down principle.

The only problem I have is the specific case of the Patronymic, assuming Patronymic would be one of the NamePartTypeQualifiers applied to a Surname field. By default a Surname is an inherited family name, and names are generally grouped by Surname in software. If you dumb down from Patronymic to Surname, you have the problem that the name goes from being logically ungroupable to groupable.

People with the same Surname are generally related in the same family tree. People with the same Patronymic name are not generally related. I.e., please don't sort Patronymic names into the Surname list.

Perhaps modify as such:

NamePartType

stoicflame commented 12 years ago

> People with the same Surname are generally related in the same family tree. People with the same Patronymic name are not generally related. I.e., please don't sort Patronymic names into the Surname list.

So I think your proposal is to rename Surname to be InheritedName? I'd have concerns with that because Surname is just so universally used. I understand that you can't assume a relationship between people with the same Patronymic name, but that doesn't mean it's not a surname, does it? I think you've got a valid concern about the behavior of products, but I think those concerns fall outside the scope of this project. We're just trying to provide a way to exchange these kinds of data, not define how products behave.

daveyse commented 12 years ago

Let me quote from that "universally accepted authority", <grin>, Wikipedia (Surname):

> A surname is a name added to a given name and is part of a personal name. In many cases, a surname is a family name. Many dictionaries define "surname" as a synonym of "family name". ... The Icelandic system, formerly used in much of Scandinavia, does not use family names. A person's surname indicates the first name of the person's father (patronymic) or in some cases mother (matronymic).

[bold added]

Although one may usually consider a surname to be a family name, shared by the group members, it is not always the case. Patronymic and matronymic surnames are still surnames, and are used to affiliate the individual with a parent. It just so happens that their surnames are [usually] not the same.

I believe @thomast73 is proposing a very viable solution that addresses the aggregation of NamePartTypes, which allows the generic grouping of Surname [in its various forms and meanings] and allows further clarification as to how that `NamePart` may affiliate the individual with others of a group.

From a potential processing perspective, this also has the benefit of providing "fuzzy logic" and probabilistic weight for those ambiguous Surname terms that may be interpreted as a 'Middle' or 'Family' term in the GivenName by one contributor and as a 'Maiden' term in the Surname by a different contributor. Acknowledged, processing is out of the scope of GEDCOM X's charter.

nilsbrummond commented 12 years ago

> So I think your proposal is to rename Surname to be InheritedName?

Yes it was, as well as add "Other".

> I'd have concerns with that because Surname is just so universally used. I understand that you can't assume a relationship between people with the same Patronymic name, but that doesn't mean it's not a surname, does it?

Okay, I just read http://en.wikipedia.org/wiki/Surname and now agree with you. In my mind a Surname was an inherited Family name. According to the wiki page, Surname is a generic category covering last name, family name, Patronymic, etc. So I was wrong.

I revoke my proposal for Inherited and Other NamePartTypes.

> I think you've got a valid concern about the behavior of products, but I think those concerns fall outside the scope of this project. We're just trying to provide a way to exchange these kinds of data, not define how products behave.

I agree to some extent, as long as the data needed for a product to behave well is included. I think we are close now that I understand Surname better.

From http://en.wikipedia.org/wiki/Surname :

> In Spain and in most Spanish-speaking countries, the practice is for people to have two surnames. Usually, the first surname comes from the father and the second from the mother...

So be sure multiple name-parts with the type=Surname and TypeQualifiers=Family are allowed. I don't think they are prohibited.

Just thinking we may need a cookbook or a good set of examples for names to ensure good data exchange.

Edit: @Daveyse - Didn't see your post till after I wrote mine...

thomast73 commented 12 years ago

> From http://en.wikipedia.org/wiki/Surname :
>
> In Spain and in most Spanish-speaking countries, the practice is for people to have two surnames. Usually, the first surname comes from the father and the second from the mother...
>
> So be sure multiple name-parts with the type=Surname and TypeQualifiers=Family are allowed. I don't think they are prohibited.

This brings up the one other question I have in my mind.

Given the name "Diego Acuña Y Romero" and a cultural context of "es-Latn", we might consider a couple of different configurations for modeling it.

Method 1 (would require a change to my initial proposal -- qualifier[0..*]):

I think something like Method 1 would be a pretty good rendition -- very few assumptions need to be made.

Method 2:

The problem with Method 2 is that without the second qualifier, I cannot identify which name came from the mother and which came from the father. I can (using the information specified by the cultural context) make an assumption about it and be right a high percentage of the time; but it might be better if we could be a bit more explicit.

Method 3:

In Method 3, I would like to establish a qualifier for NamePart2, but what should it be? It is a family name, but really it is two family names. Assigning a qualifier using the above list seems to be a bit more sketchy.
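As a rough illustration of the multi-qualifier variant (qualifier[0..*]), the sketch below models "Diego Acuña Y Romero" with each surname carrying more than one qualifier. The qualifier names "Paternal" and "Maternal" are hypothetical, as is leaving "Y" unclassified; only "Family" is mentioned in this thread.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class NamePart:
    """Hypothetical name part allowing zero or more qualifiers."""
    text: str
    type: Optional[str] = None        # None means the part is unclassified
    qualifiers: Tuple[str, ...] = ()  # qualifier[0..*]


# "Paternal" and "Maternal" are made-up qualifier names for illustration.
diego = [
    NamePart("Diego", "Given"),
    NamePart("Acuña", "Surname", ("Family", "Paternal")),
    NamePart("Y"),
    NamePart("Romero", "Surname", ("Family", "Maternal")),
]


def surnames_with(parts: List[NamePart], qualifier: str) -> List[str]:
    """Return the text of every surname part that carries the given qualifier."""
    return [
        p.text
        for p in parts
        if p.type == "Surname" and qualifier in p.qualifiers
    ]


paternal = surnames_with(diego, "Paternal")
family = surnames_with(diego, "Family")
```

With the second qualifier present, a consumer no longer has to infer from the cultural context which surname came from which parent.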

Thoughts?

thomast73 commented 12 years ago

Oops! I must have hit the wrong button. I did not intend to close this.

mikkelee commented 12 years ago

I like the Type/Qualifier idea a lot. Some thoughts on how I would categorize names from my own tree:

  • "Bette" Niels Christensen Dalsgaard (1831 - 1904):
    • "Bette" (Prefix, Nickname) or? "Bette Niels" (Given, Nickname) -- "bette" meaning "li'l", as his older brother had the same given name, they were named after their grandfathers.
    • Niels (Given).
    • Christensen (Surname, Patronymic) -- after his father Christen Nielsen Dahlsgaard.
    • Dalsgaard (Surname, Geographic) -- the farm he resided on; this is very common in older records, and many current family-surnames are derived from these.

thomast73 commented 12 years ago

How about "Bette" (Prefix, Diminutive) "Niels" (Given)

nilsbrummond commented 12 years ago

Method 1 - MODIFIED.

How about just deleting NamePart3 as below, if unclassified? The full name text is still in the NameForm...

I think it would also be valid to include just the Surname parts, if so desired, to give minimal information for surname indexing, for the case where the researcher only cares about identifying the Surname.

thomast73 commented 12 years ago

> How about just deleting NamePart3 as below, if unclassified? The full name text is still in the NameForm...
>
> I think it would also be valid to include just the Surname parts, if so desired, to give minimal information for surname indexing, for the case where the researcher only cares about identifying the Surname.

If you assumed that a full rendering of the name (with the 'Y' included) existed in the NameForm.fullText value, it would be acceptable to only include in parts the name parts that the application had classified and leave out any unclassified portions. But it may also be acceptable to build a parts list that includes all terms in the full rendering of the name, including unclassified parts. Or it would also be acceptable to express the name by only providing parts. It might not be such a good idea to only provide parts and leave out unclassified portions, however.

mikkelee commented 12 years ago

> How about "Bette" (Prefix, Diminutive) "Niels" (Given)

Well, it's just like saying "Little Peter" and "Big Peter" to differentiate two men called Peter. "Diminutive" seems more fitting for qualifying "Betty" (as d. for "Elizabeth").

nilsbrummond commented 12 years ago

So rules for getting a full name would be as such?:

  1. If the NameForm.fullText field is not NULL then it is used.
  2. Else concatenate the NameParts in the order provided.

Only problem I see is multiple ways to present the same data. I personally prefer tightening the spec down to one way. Keeps things simple. Requiring NameForm.fullText to not be null is one way.
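The two rules could be sketched roughly as follows. The `NameForm` class here is a hypothetical stand-in with just the two fields under discussion, and joining the parts with a single space is an assumption the thread does not specify.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class NameForm:
    """Hypothetical stand-in with only the fields discussed here."""
    full_text: Optional[str] = None
    parts: List[str] = field(default_factory=list)


def full_name(form: NameForm) -> str:
    """Rule 1: use the full text if present. Rule 2: else join parts in order."""
    if form.full_text is not None:
        return form.full_text
    # Assumption: parts are separated by a single space.
    return " ".join(form.parts)


with_text = NameForm(
    full_text="Diego Acuña Y Romero",
    parts=["Diego", "Acuña", "Romero"],
)
parts_only = NameForm(parts=["Diego", "Acuña", "Romero"])
```

Here `full_name(with_text)` keeps the "Y" because the full text wins, while `full_name(parts_only)` can only return what the parts contain. That is the "multiple ways to present the same data" concern in miniature.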

thomast73 commented 12 years ago

> So rules for getting a full name would be as such?:
>
>   1. If the NameForm.fullText field is not NULL then it is used.
>   2. Else concatenate the NameParts in the order provided.
>
> Only problem I see is multiple ways to present the same data. I personally prefer tightening the spec down to one way. Keeps things simple. Requiring NameForm.fullText to not be null is one way.

We are in the process of updating the specification. This comment would be better discussed on pull request #155. Would you please review the updates being proposed there and then re-comment as necessary? Thanks.

stoicflame commented 12 years ago

Please note that the proposal to address this request is attached to #155 and summarized by my comment. We'll leave the proposal uncommitted for a few days to allow people to comment before we merge.

thomast73 commented 12 years ago

Thanks, everyone, for your input on this issue. The changes we have made as a result of your feedback were merged into the master branch as part of pull request #155. We appreciate your time and thoughts and we feel that your feedback has helped to improve the GEDCOM X model.