FamilySearch / GEDCOM

Add something for Dutch "Roepnaam" (CALLNAME) #473

Open mother10 opened 3 months ago

mother10 commented 3 months ago

As JustCarmen explains here: https://www.webtrees.net/index.php/forum/2-open-discussion/37395-gedcom-7-personal-name-structure, Dutch people have trouble entering a "Roepnaam". That is certainly not a nickname but a "call name": the name by which someone is known and addressed in daily life.

I think "Roepnaam" is only used in the Netherlands, but I am not sure of that.

As stated in that link, and I will use myself as example now, at birth I was given the Firstnames "Catharina, Johanna, Elisabeth", and together with that my parents gave me the "Roepnaam": "Tineke". So everyone knows me as Tineke and those 3 "Firstnames" are only used on official records.

So GIVN is "Catharina, Johanna, Elisabeth", NOT Tineke. But Tineke is NOT my Nickname, my Nickname is Mother10.

As JustCarmen said in that link, maybe there should be a CALL for these cases.

Or maybe another NameType. In that case it might be possible to add it in the same minor release as the other NameTypes.

Hope I made myself clear.

Norwegian-Sardines commented 3 months ago

> @Norwegian-Sardines I think I understand what you say, and I agree that big changes should not be made now.
>
> Did I understand correctly that you would agree with a NAME.TYPE DAILYNAME? That would be fine with me, as I already said in the post just before yours. And if that were indeed DAILYNAME, I think it could be used for others too, like RUFNAME.

I use the term “preferred name”. For me “Daily Name” has no meaning without a written definition, but that could be just a difference in our use of language, just like nickname never includes screen names like “mother10”.

tychonievich commented 3 months ago

Sorry to have been away for several days.

I was not trying to say that we need to limit GEDCOM to ideas with English terms. Rather, I was trying to observe that NICK is very broadly defined: its definition hinges on the English word "nickname", which includes many concepts, including sobriquets (affectionate, humorous, or fanciful nicknames), epithets (descriptive or insulting nicknames), diminutives (nicknames derived from names, often via a culturally-predictable child-ifying process), and many others. The uniting principle of nickname is that it is not the official, legal name.

That said, I'm convinced by the arguments about adding more name types instead of (or in addition to) more name part types. Here's an initial stab at some NAME.TYPEs to add; two from this conversation and two from GEDCOM-X. Using the following prefixes

| Short Prefix | URI Prefix |
| --- | --- |
| g7 | https://gedcom.io/terms/v7/ |
| gx | http://gedcomx.org/ |
| ext | some other prefix if an extension, or g7 if standardized |

I propose the following additions:

| Value | Meaning |
| --- | --- |
| gx:Familiar | A name used (primarily) by people familiar with the named individual, not used in official or formal situations. |
| gx:Religious | A name given for religious purposes, usually as part of a religious ceremony or rite. |
| ext:Roepnaam | A name given in Holland at birth for daily use in all circumstances except those requiring a formal legal name. |
| ext:Legal | A name recognized by a government and used in official government documents. |

Any others we should add? And changes we should make to these definitions?

I also think we want to extend NAME.TYPE to have {0:M}, not {0:1} cardinality.

Norwegian-Sardines commented 3 months ago

As I indicated above, I see the following name types; “preferred name”, “nick name”, “Roepnaam”, “Rufname”, “farm”, “patronymic”, “generational”.

tychonievich commented 3 months ago

I thought of several of those (perhaps incorrectly) as name part types, not name types. Rufname is a good example; as I understand it, it is required by law to be just one of several parts in a name. Using the example from Wikipedia, we'd have

name Amalie Emmy /Noether/
Vorname (GIVN) Amalie
Vorname and Rufname Emmy
Nachname (SURN) Noether

and saying her rufname is "Emmy Noether" is incorrect. But I could be wrong; if someone with more German cultural awareness than me wants to comment, I welcome that.

Similarly, I'd assume that farm, patronymic, and generational (and possibly preferred?) apply to name parts, not entire names.

One option for handling this with NAME.TYPE would be to have a type that means "The GIVN in this name is a rufname". We could do that, but I wonder if a better approach is to add a NAME.part.TYPE enumerated field instead, where we can mark values like rufname, preferred, primary, and patronymic. For example,

1 NAME Amalie Emmy /Noether/
2 GIVN Amalie
2 GIVN Emmy
3 _TYPE _RUFNAME
2 SURN Noether
3 _TYPE _FAMILY

or

1 NAME Catharina Johanna Elisabeth /Xxx/
2 GIVN Catharina Johanna Elisabeth
2 SURN Xxx
2 TYPE BIRTH
2 _ANOTHER_TYPE _LEGAL
1 NAME Tineke /Xxx/
2 GIVN Tineke
3 _TYPE _ROEPNAAM
2 SURN Xxx
2 TYPE _PREFERRED

Norwegian-Sardines commented 3 months ago

My TYPE solutions are probably not the best, however... Your solution depends on the receiving program to read and use the NAME subtags as well as the custom _TYPE tag and that a tag exists for some part.

So here is a set of real names from my database to work with:

The first individual could be entered in the database as:

1 NAME Olaf
2 TYPE BIRTH
2 GIVN Olaf
2 NSFX Olssen
3 _TYPE PATRONYMIC
2 NSFX Hafstad
3 _TYPE FARM

Name indexing would have to be expanded to the NSFX tag! And the "Display Name" would have to be built somehow!

The second as:

1 NAME Ludvig Andreas /Feyer/
2 TYPE BIRTH
2 GIVN Ludvig 
2 GIVN Andreas 
3 _TYPE PREFERRED       (could also be "RUFNAME")
2 SURN Feyer

The "Display Name" would have to be built to show the preferred name underlined somehow! In the software I use, the "preferred name" in the NAME tag is followed by an asterisk, Ludvig Andreas* /Feyer/, so a display value can be produced.

The third as:

1 NAME Christian Jon-Erik /Feyer/
2 TYPE BIRTH
2 GIVN Christian 
2 GIVN Jon-Erik
2 ???? Erik
3 _TYPE PREFERRED
2 SURN Feyer

Somehow we would need to display "Erik" as his preferred name. This could be done with a second NAME statement.

1 NAME Erik /Feyer/
2 TYPE AKA                     <== I'd rather use "PREFERRED" here

Chinese and Korean names use a "generational name" that is part of the "given name" in their "Western" versions (they have different characters for the two parts). We have been asked in software where to put generational names, and have had no solid answer.

Li Wenfeng where Li is the family name, Wen is the generational value of their given name Wenfeng

In many naming customs individuals can have multiple surname parts. For example: José Santos Almeida

1 NAME José /Santos/ /Almeida/
2 TYPE BIRTH
2 GIVN José
2 SURN Santos
3 _TYPE MATERNAL 
2 SURN Almeida
3 _TYPE PATERNAL

Some individuals have a place name (these are like my farm name example), for example: Leonardo di ser Piero da Vinci

1 NAME Leonardo di ser Piero da Vinci
2 TYPE BIRTH
2 GIVN Leonardo 
2 NSFX Piero
3 _TYPE PATRONYMIC
2 NSFX Vinci
3 _TYPE LOCATION

mother10 commented 3 months ago

First: What happens to the proposed Name Types of #353 ?

> Any others we should add? And changes we should make to these definitions?

I think that's what I just wrote. So DIVORCED, ESTATE (or FARM), RELIGIOUS, UNIFIED, VARIANT, and the others @Norwegian-Sardines mentioned.

And extending Name Types, yes, I think that should be {0:M}, not {0:1} cardinality.

@tychonievich I get a bit confused about the _TYPE, the underscore I mean. Should official GEDCOM have tags defined with an underscore, or is that just because there is no name for it yet?

Maybe name that other type EXTYPE instead of _TYPE, where EX stands for Extended? Or maybe TYPEX?

But when I look at that, it is confusing: we have a TYPE in a TYPE. English is not my mother tongue so I have to guess a bit, but what about USED or INTERPRET or CONSIDER or VIEW (or another synonym), because that is what is done with the name: it is used in certain cases, or meant for certain cases. Would that be better? It's less confusing. But you might know better words for it.

@Norwegian-Sardines

> 1 NAME Olaf
> 2 TYPE BIRTH
> 2 GIVN Olaf
> 2 NSFX Olssen
> 3 _TYPE PATRONYMIC
> 2 NSFX Hafstad
> 3 _TYPE FARM
>
> Name indexing would have to be expanded to the NSFX tag! And the "Display Name" would have to be built some how!

Maybe:

1 NAME Olaf
2 TYPE BIRTH
2 GIVN Olaf
2 NSFX Olssen
3 _TYPE 1, PATRONYMIC
2 NSFX Hafstad
3 _TYPE 2, FARM

where 1 and 2 denote the position in the total name?

And the underlined name, maybe not as

GIVN Andreas

but as

GIVN Andreas*

so that it would be possible to give the star in the GIVN tag?

albertemmerich commented 3 months ago

> I thought of several of those (perhaps incorrectly) as name part types, not name types. Rufname is a good example; as I understand it, it is required by law to be just one of several parts in a name. Using the example from Wikipedia, we'd have
>
> name Amalie Emmy /Noether/
> Vorname (GIVN) Amalie
> Vorname and Rufname Emmy
> Nachname (SURN) Noether
>
> and saying her rufname is "Emmy Noether" is incorrect. But I could be wrong; if someone with more German cultural awareness than me wants to comment, I welcome that.

Correct. In this example Rufname is "Emmy".

Until the 1960s, the Rufname was underlined in official German birth records. It therefore had to be one of the given names, and it could not be modified later. For this situation the GEDCOM-L group defined the _RUFNAME tag, whose payload shows only the Rufname:

2 _RUFNAME Emmy

I agree with Luther that in a future GEDCOM spec the Rufname should be marked as a name part type, i.e.

2 GIVN Amalie
2 GIVN Emmy
3 TYPE XXXX

Whether XXXX will be RUFNAME or we have more situations matching the conditions:

and find a new type XXXX (including RUFNAME) for this, we have to check.

Until 31 OCT 2015 in Germany the Rufname was documented when registering your residence. If you had more than one given name but none was marked as Rufname in your birth record, you could choose which given name should be the Rufname. It was possible to choose another Rufname at the next registration. So the "Rufname" at registration is not the same as the Rufname at birth; it became a preferred given name, no longer fixed for one's whole life. For this, the _RUFNAME tag of GEDCOM-L is not the correct way to document it.

Then German law was modified again: there is no longer a Rufname data field at registration, but a "gebräuchlicher Vorname" instead. Again it must be one of the given names shown in the birth record, but you can choose which one. So it is much the same as before, only called "gebräuchlicher Vorname" and no longer "Rufname". I think this is what was discussed as preferred type:

As it must be one of the given names, it again differs from Roepnaam in Netherlands:

So far we have 3 different situations which should be clearly separated in future GEDCOM specs.

tychonievich commented 3 months ago

Thanks, @mother10, for the reminder about #353 and #335. I thought we had discussed more name types in the past, but forgot where they had been documented.

I used _TYPE instead of TYPE because I was hoping we could arrive at an extension we could register with 7.0 prior to the release of 7.1. An extension allows it to be used in practice by those wishing to experiment, tried out to see if it works, prior to being standardized and given a non-underscore tag in 7.1. But you're right, that's probably distracting during discussion.

Thanks @Norwegian-Sardines, @mother10, and @albertemmerich for the additional real examples and discussions of them.


Trying to summarize the general observations in this thread:

  1. A person can have any number of names
  2. Each name can have any number of name types
  3. Each name consists of one or more name parts
  4. Each name part can have any number of name part types
  5. Name parts have (in general) two orders: spoken order like "Luther Tychonievich" and indexing order like "Tychonievich, Luther." In some cultures these are the same. I'm not aware of a third order, but would not be surprised to discover it. I'm also not aware of cultures with unordered name parts, but again would not be surprised to discover it.
  6. Many names have a hierarchy of part omissions and shortening rules. For example, I have one name (Luther Allen Tychonievich) which in professional communications is progressively shortened to "Luther A. Tychonievich", "Luther Tychonievich", "L. Tychonievich", "Tychonievich"; and in one source I was "L. A. Tychonievich" because I was co-authoring a paper with my father, another L. Tychonievich. In social communications the same name is progressively shortened to "Luther Tychonievich", "Luther." In some cases these forms may not be readily derived from the list of parts alone.
  7. Some types can apply to either a name or a name part; others have narrower definition and only apply to one of the two.
  8. Types have a semantic hierarchy: both rufname and roepnaam are (distinct) subtypes of a given name, for example. Albert's last post suggests rufname might be a subtype of preferred given name, itself a subtype of given name, and other multi-layer types seem likely to recur in other settings too.
  9. Some multiple-type situations are outside the hierarchy; for example, some professional/stage/pen names are also familiar/affectionate/nicknames, while others are not.

What have I missed or mischaracterized?


To try to turn this into a future-GEDCOM-version proposal, would the following work?

n @XREF:INDI@ INDI              {1:1}
  +1 NAME                       {0:M}
     +2 TYPE <Enum>             {0:M}
     +2 PART <Text>             {0:M}
        +3 TYPE <Enum>          {0:M}
     +2 FORM <Text>             {1:M}
        +3 TYPE <Enum>          {0:M}
        +3 DATE <DateValue>     {0:M}
        +3 <<SOURCE_CITATION>>  {0:M}
     +2 DATE <DateValue>        {0:M}
     +2 <<SOURCE_CITATION>>     {0:M}

For example

1 NAME
2 TYPE BIRTH
2 TYPE LEGAL
2 PART Luther
3 TYPE GIVN
3 TYPE PRIMARY
2 PART Allen
3 TYPE GIVN
3 TYPE SECONDARY
2 PART Tychonievich
3 TYPE SURN
3 TYPE FAMILY
2 FORM Luther Tychonievich
3 TYPE PREFERRED
2 FORM Luther Allen Tychonievich
3 TYPE FULL
2 FORM Luther A. Tychonievich
2 FORM L. A. Tychonievich
3 SOUR @S1@
3 DATE 2009
2 FORM Luther
3 TYPE FAMILIAR
2 FORM Tychonievich
3 TYPE FORMAL
2 FORM Tychonievich, Luther Allen
3 TYPE INDEXING
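
To experiment with a structure like the one above, a small reader is enough. The following is a minimal sketch in Python, assuming only the line shape "level, tag, optional payload" used in the example; the `Node` class and `parse` function are hypothetical names for illustration, not part of any GEDCOM library, and cross-reference identifiers and line continuation are deliberately ignored.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One structure: a tag, an optional payload, and its substructures."""
    tag: str
    payload: str = ""
    children: list = field(default_factory=list)


def parse(text: str) -> list:
    """Parse level-numbered lines into a list of top-level Node trees."""
    roots = []
    stack = []   # stack[i] is the currently open node at relative depth i
    base = None  # level number of the first line (e.g. 1 for "1 NAME")
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        level_str, _, rest = line.partition(" ")
        tag, _, payload = rest.partition(" ")
        level = int(level_str)
        if base is None:
            base = level
        depth = level - base
        if depth < 0 or depth > len(stack):
            raise ValueError(f"unexpected level in line: {line!r}")
        node = Node(tag, payload)
        del stack[depth:]  # close every structure at this depth or deeper
        if stack:
            stack[-1].children.append(node)
        else:
            roots.append(node)
        stack.append(node)
    return roots


def types_of(node: Node) -> list:
    """Collect the payloads of all TYPE substructures of a node."""
    return [child.payload for child in node.children if child.tag == "TYPE"]
```

With the example above, `parse` yields one NAME node whose PART and FORM children each carry their own list of TYPE values, which is the shape the proposal describes.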

Some notes:

I didn't include event-based names because I haven't thought through how to do that yet.

I've no-doubt missed many things; I welcome corrections, counter-proposals, etc.

Norwegian-Sardines commented 3 months ago

@tychonievich I'm still digesting your comment above.

I like the idea of creating a "generic part" and then telling the reader what that part is by using a "type". This reduces the number of culture-specific terms and additions. For example, in Korean the term Surname is not used or understood; the term "Family Name" is more appropriate.

How does this proposal provide the system with a display value? The display value would be used in reports, charts, website page headers and finding aids. Would the NAME.TYPE include a value of "primary" or "display" or some other key marking this name for display use, or would being first in the list of NAME structures suggest this? How can the FORM tag be used for display, and if there are multiple FORM tags within the NAME structure, is there a rule or interpretation to note?

Norwegian-Sardines commented 3 months ago

@tychonievich

Also, how do you propose to support multiple surnames for indexing and display, or non-name parts between indexed name parts, or for display?

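For example:

- Leonardo di ser Piero da Vinci (No surname) Should be indexed with the Piero family and Vinci location, but needs to have a name part for "di" (of) and "ser" (a title for Piero)
- José Santos Almeida (two surnames) should be indexed with the Santos family and the Almeida family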

tychonievich commented 3 months ago

@Norwegian-Sardines wrote

> How does this proposal provide the system with a display value? The display value would be used in reports, charts, website page headers and finding aids. Would the NAME.TYPE include a value of "primary" or "display" or some other key marking this name for display use, or would being first in the list of NAME structures suggest this?

I'm open to either of these solutions. Given the generic rule from 1.2

> The order of substructures of a single type indicates user preference, with the first substructure being the most-preferred value, unless a different meaning is explicitly indicated in the structure’s definition.

That suggests that the display would be the first (most user-preferred) FORM of the first (most user-preferred) NAME. We could also use order for some other purpose and define a DISPLAY type, or even a per-user PREFERRED_BY @<XREF:SUBM>@, but I tend to side with the simplest order-based rule unless I'm missing a case where something else would be better.
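
The order-based rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions: each NAME is represented as a plain dict with a hypothetical `"forms"` list kept in file order, and falling through to the next NAME when one has no FORM is an assumption of this sketch, not something the proposal states.

```python
from typing import Optional


def display_name(names: list) -> Optional[str]:
    """Return the first FORM of the first NAME, following file order.

    `names` is a list of dicts such as {"forms": ["Luther Tychonievich", ...]}.
    A NAME without any FORM is skipped (an assumption of this sketch).
    """
    for name in names:
        forms = name.get("forms", [])
        if forms:
            return forms[0]
    return None


names = [
    {"forms": ["Luther Tychonievich", "Luther Allen Tychonievich"]},
    {"forms": ["L. A. Tychonievich"]},
]
print(display_name(names))  # Luther Tychonievich
```

The appeal of this rule is that it needs no extra enumeration value: reordering the structures in the file is enough to change what is displayed.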

> How can the FORM tag be used for display, and if there are multiple FORM tags within the NAME structure, is there a rule or interpretation to note?

My thought was just "my name has more forms than I can derive from a simple rule, so why not just list them all?" Not sure it was the right thought. Did you have rules or interpretations in mind?

tychonievich commented 3 months ago

> Leonardo di ser Piero da Vinci (No surname) Should be indexed with the Piero family and Vinci location, but needs to have a name part for "di" (of) and "ser" (a title for Piero)

I think the following would be sufficient for the cases you mention:

1 NAME
2 FORM Leonardo di ser Piero da Vinci
2 PART Piero
3 TYPE FAMILY
2 PART Vinci
3 TYPE LOCATION

If we want all parts in the form you listed to have PARTs and be in that order, we could augment this with

1 NAME
2 FORM Leonardo di ser Piero da Vinci
2 PART Leonardo
3 TYPE GIVN
2 PART di ser
3 TYPE PARTICLE
2 PART Piero
3 TYPE FAMILY
2 PART da
3 TYPE PARTICLE
2 PART Vinci
3 TYPE LOCATION

I'm not sure if there should be a 2 PART di ser Piero as well or not, or otherwise have a way of indicating that those particles attach to Piero and not to different name parts.

> José Santos Almeida (two surnames) should be indexed with the Santos family and the Almeida family

I think this would capture that name (including my understanding of Spanish family name traditions)

1 NAME
2 PART José
2 PART Santos
3 TYPE FAMILY
3 TYPE PATRILINEAL
2 PART Almeida
3 TYPE FAMILY
3 TYPE MOTHERS_PATRILINEAL
2 FORM José Santos Almeida

If we defined PATRILINEAL as a subtype of FAMILY we could omit the TYPE FAMILY lines, but my gut is to handle subtypes with multiple TYPEs instead of with a documented type hierarchy, mostly because I think the multiple types will be less work for software implementers to handle.
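
To make the trade-off concrete, here is a small Python sketch of both lookups. It assumes name parts are held as (text, set of types) pairs; `PARENT` is a hypothetical hierarchy table invented for the comparison, not a proposed registry.

```python
# Multiple TYPEs per part: every applicable type is listed explicitly.
PARTS = [
    ("José", set()),
    ("Santos", {"FAMILY", "PATRILINEAL"}),
    ("Almeida", {"FAMILY", "MOTHERS_PATRILINEAL"}),
]


def parts_with_type(parts, wanted):
    """Multiple-TYPE approach: a plain membership test is enough."""
    return [text for text, types in parts if wanted in types]


# Hierarchy approach: only the most specific type is stored, and software
# must know (and walk) a documented parent table. Hypothetical table.
PARENT = {"PATRILINEAL": "FAMILY", "MOTHERS_PATRILINEAL": "FAMILY"}


def has_type(types, wanted, parent=PARENT):
    """Hierarchy approach: follow parent links from each stored type."""
    for current in types:
        while current is not None:
            if current == wanted:
                return True
            current = parent.get(current)
    return False


print(parts_with_type(PARTS, "FAMILY"))  # ['Santos', 'Almeida']
```

The first function works for any type value, including ones the reading software has never seen; the second only works when the reader ships an up-to-date copy of the hierarchy.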

mother10 commented 3 months ago

A lot to think about. :)

You now have FORM {1:M}. We describe individuals for a tree, but often we don't know someone's name right away. Should we have:

1 NAME
2 FORM xxx
3 TYPE UNKNOWN

where xxx is maybe empty, for those INDIs we have no name for yet?

I like the FORM and PART. But how do we deal with prefixes and suffixes?

Changed the depths and put DATE where it belongs, at the "event" BIRTH or ADOP etc.

1 NAME
2 TYPE BIRTH, LEGAL
3 DATE 2005
3 PART Luther
4 TYPE GIVN, PRIMARY
3 PART Allen
4 TYPE GIVN, SECONDARY
3 PART Tychonievich
4 TYPE SURN, FAMILY
3 FORM Luther Tychonievich
4 TYPE PREFERRED
3 FORM Luther Allen Tychonievich
4 TYPE FULL
3 FORM Luther A. Tychonievich
3 FORM L. A. Tychonievich
4 SOUR @S1@
3 FORM Luther
4 TYPE FAMILIAR
3 FORM Tychonievich
4 TYPE FORMAL
3 FORM Tychonievich, Luther Allen
4 TYPE INDEXING
2 TYPE ADOP, LEGAL
3 DATE 2004
etc.

Would this look like something?

tychonievich commented 3 months ago

> You now have FORM {1:M}. We describe individuals for a tree, but often we don't know someone's name right away. Should we have:
>
> 1 NAME
> 2 FORM xxx
> 3 TYPE UNKNOWN
>
> where xxx is maybe empty, for those INDIs we have no name for yet?

NAME is still {0:M} so I'd just not include the NAME if I didn't know it.

That said, I can see cases for a NAME with no FORM:

My gut is that a name with PARTs but not FORM should be given a FORM anyway to prevent the software from having to (incorrectly) guess it; maybe add a FORM.TYPE GUESS for name forms that were guessed, not found in a source. For a name with a TYPE but no FORM or PART, I think a NOTE may be a better place to share the information. But I've not spent much time thinking about this, so I may well change my mind.

tychonievich commented 3 months ago

> But how do we deal with prefixes and suffixes?

I think they're just more name parts, right?

1 NAME
2 FORM Luther Tychonievich
2 FORM Dr. Luther Tychonievich, Ph.D.
2 PART Dr.
3 TYPE PREFIX, HONORIFIC
2 PART Luther
3 TYPE GIVN, PRIMARY
2 PART Tychonievich
3 TYPE SURN, FAMILY, PATRILINEAL
2 PART Ph.D.
3 TYPE SUFFIX, DEGREE

mother10 commented 3 months ago

@tychonievich I see you liked my combining-the-types thing. :) You are right, those are PARTs too.

BTW how do you get that coloring? I looked but could not find it

mother10 commented 3 months ago

About DATE: I would say DATE belongs to the TYPEs that are in fact "events", like BIRTH, ADOP, FARM, etcetera. That's why I put it below TYPE.

tychonievich commented 3 months ago

> Changed the depths and put DATE where it belongs, at the "event" BIRTH or ADOP etc.

I like your combining the TYPEs in a list instead of several structures, at least for readability.

A date might be associated with an event, but it might not. To my knowledge, my name appeared in print in the form "L. A. Tychonievich" only once (before this GitHub comment thread), in a paper I published in 2009; but perhaps the source's date is sufficient in that case? But I also know people who used one form of their name for a time, then switched to another form later, without any associated events.

> BTW how do you get that coloring? I looked but could not find it

Preface the block with ```gedcom and follow it by ```

mother10 commented 3 months ago

There is not always a source, so in that case it is not possible to have a date either. So I think (certain) types can have a date. And in the case you mention, where someone changes their name or uses another FORM, there should be a type to denote that, and that type can have a date added. More people were asking for a date, so if all this is changed I would say: also make it possible to have a separate date (as {0:1}). Make it flexible.

mother10 commented 3 months ago

Order of (sub)structures: Here I think of what @tychonievich introduced in a recent post about using PART and FORM, where things like SPFX and NSFX etcetera were removed. I would say that if a user wants a preferred sequence, the sequence should be numbered in some way. Do not rely on the sequence in the GEDCOM file itself. Numbering parts possibly also fixes the problem of cultures where name parts have a different sequence than what we are used to here.

Now this is a really long thread so far, so I thought to take a step back again, and go over it and put all NAME Type's we discussed or mentioned so far in a list. That gives us the possibility to see which ones we should keep. So first the list I came up with:


GEDCOM 7.0.14

AKA Also known as, alias, etc.
BIRTH Name given at or near birth.
IMMIGRANT Name assumed at the time of immigration.
MAIDEN Maiden name, name before first marriage.
MARRIED Married name, assumed as part of marriage.
PROFESSIONAL Name used professionally (pen, screen, stage name).
OTHER A value not listed here; should have a PHRASE substructure

#335

ADOPTED (could not find description)

#353

DIVORCED (Name after a divorce)
ESTATE (House name, farm name, name after moving into or marrying into a house/farm)
RELIGIOUS (Religious name, name adopted after joining a religious order)
UNIFIED (Unified spelling for a family name)
VARIANT (Different spelling for a name, also spellings based on other languages such as Latin, French)

#473 (this one)

DAILYUSE
LEGAL
OFFICIAL
INFORMAL
FAMILIAR
FORMAL
DAILY
CASUAL
ROEPNAAM
RUFNAME
DAILYNAME
??? are these for NAME?
FIRMUNG
KONFIRMATION
???
PREFERRED
NICK
ROEPNAAM
RUFNAME
FARM
PATRONYMIC
GENERATIONAL
FAMILIAR
RELIGIOUS
ROEPNAAM
LEGAL
BIRTH
PATRONYMIC
FARM
PREFERRED
AKA
MATERNAL
PATERNAL
LOCATION (This could also be used for people who move to another country and change their name accordingly)

--------------------------------------

When I look at this list and think about adding DATE, you could maybe define two TYPE enumerations: one with TYPEs that are in fact "events" and so allow a DATE, and one with TYPEs that do not have a date.

"Event" Name TYPEs could be:

ADOPTED
BIRTH
DIVORCED
ESTATE (or is this the same as FARM?)
FARM
IMMIGRANT
LOCATION
MAIDEN (after a divorce you can go back to your maiden name)
MARRIED
RELIGIOUS

"Normal" Name TYPEs:

AKA Also known as, alias, etc.
CASUAL
DAILYUSE / DAILYNAME / DAILY
FAMILIAR
FORMAL
GENERATIONAL
INFORMAL
LEGAL
LOCATION
MATERNAL
NICK
OFFICIAL
OTHER A value not listed here; should have a PHRASE substructure
PATERNAL
PREFERRED
PROFESSIONAL Name used professionally (pen, screen, stage name).
ROEPNAAM
RUFNAME

I left out PATRONYMIC, because I think that's covered by PATERNAL and MATERNAL.

Are these what is needed? Any missing?

Would it be wise not to have a 7.1 but to go directly to an 8.0 instead, because of the need for important changes? So only a 7.0.15 before 8.0?

Norwegian-Sardines commented 3 months ago

I think that “LOCATION” should not include the concept of name when they moved to a new place! This is a different use and is covered by “IMMIGRANT”.

I also believe that PATRONYMIC is a little different than PATERNAL. A patronymic name would be Olssen, Olafsdotter made from the father’s given name plus a gender based suffix and is used as a personal identifier, while a paternal/maternal surname would take the father’s/mother’s surname and use it as the individual’s actual surname.

mother10 commented 2 months ago

@Norwegian-Sardines Ok I agree with both.

tychonievich commented 1 month ago

Accidentally closed based on one quite-small PR

jkr-wrk commented 2 weeks ago

I tried to follow the discussion but might have skipped a few posts. As a Dutch person myself, I understand that Mother10 sees her 'online' name as a nickname, and her day-to-day name Tineke as a Roepnaam or call name.

But what the label is, is probably not the main issue. It's what we want to achieve. I tried to find a good example in English but it seems to be very uncommon.

Let's take both Richard Herbert Cheney and James Paul McCartney as examples and combine the cases. This is very common in Dutch. So let's assume the president is called Herbert Richard Cheney and the whole world knows him as Dick Cheney. Even his mother instructed everybody to call him Dick. Mail would be sent to Dick Cheney, etc. Only his passport and bank account would mention Herbert Richard or H.R. Cheney, but never Herbert Cheney, Herbert R. Cheney or H. Cheney.

In Dutch trees it is common to show his name in the textual representation as Herbert Richard (Dick) Cheney; an alternative could be Dick Cheney, Herbert Richard Cheney or Dick (Herbert Richard) Cheney, indicating both his official name and his day-to-day name. But this is way too large to show in a graphical representation. There it would be common to use just Dick Cheney. Writing down H.R. Cheney or Herbert Cheney would confuse everybody. This is why NICK would contain the call name in a Dutch GEDCOM.

Then take the case of Elton John. A textual representation would show something like: Elton John (born as Reginald Kenneth Dwight) or Reginald Kenneth Dwight (better known as Elton John). The graphical representation would probably show Elton John or Reggie Dwight, depending on whether the audience is the public or his family. I mention this because this case could overlap with Mother10 using that name as a nickname, but I would consider it more of a stage name.

jkr-wrk commented 1 week ago

Adding to this: https://www.dutchgenealogy.nl/no-middle-names/

It was common to give someone 3 or 4 names, pointing to multiple important family members, in order of importance or not. Take Cornelia Wilhelmina Aleida Caterina Voetjes. Everyone would call her Mien (derived from Wilhelmina), because the parents like that name, or later Tina (derived from Caterina), because the child does not like to be called Mien Voetjes (translated: My Feet). We even see official documents where the birth certificate says Cornelia Wilhelmina Aleida Caterina Voetjes but the birth certificate of her child calls the mother Tina Voetjes.

We refer to Cornelia Wilhelmina Aleida Caterina as Doopnamen -> baptismal names (sometimes even if children aren't baptized) or Volledige voornamen -> full first names, and to Mien as Roepnaam -> the name used to address you in day-to-day conversation.

A Catholic could call their son Otto Diederik Maria van Amsterdam (Rick), with the Maria part indicating that he is Catholic, the Otto part pointing to the father's father and the Diederik part pointing to the mother's father. But it would be silly to call him Diederik Otto Maria -> D.O.M., because that would be "dom" (stupid).

It is also possible to call someone: Jan-Willem Gerrit van Amsterdam -> would be called Jan-Willem (the dash indicating it is one name, most of the time)

Important to indicate that we don't register middle names. Everything is part of the first name, just not the Roepnaam.

And one last step: parts like "van", "van de", "de", "het" are part of the last name, but they are "tussenvoegsels" and are not part of the sorting order. When sorting we would write [Amsterdam, Jan-Willem Gerrit van], but when listing the last name it would be [van Amsterdam] or [Amsterdam, van].
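
The sorting convention just described can be sketched in Python. The prefix list below contains only the four tussenvoegsels named above and is not a complete list; `split_surname` and `index_form` are hypothetical helper names, and matching is case-insensitive by assumption.

```python
# Only the prefixes mentioned above; longer prefixes come first so that
# "van de" is tried before "van". Not a complete list of tussenvoegsels.
PREFIXES = ("van de", "van", "de", "het")


def split_surname(surname: str) -> tuple:
    """Split "van Amsterdam" into ("van", "Amsterdam"); no prefix gives ("", surname)."""
    lowered = surname.lower()
    for prefix in PREFIXES:
        if lowered.startswith(prefix + " "):
            return surname[: len(prefix)], surname[len(prefix) + 1 :]
    return "", surname


def index_form(given: str, surname: str) -> str:
    """Build the index entry, e.g. "Amsterdam, Jan-Willem Gerrit van"."""
    prefix, core = split_surname(surname)
    tail = f"{given} {prefix}".strip()
    return f"{core}, {tail}"


surnames = ["van Nassau", "de Vries", "Cheney"]
# Sort on the surname without its prefix, but keep the prefix for display.
print(sorted(surnames, key=lambda s: split_surname(s)[1].lower()))
# ['Cheney', 'van Nassau', 'de Vries']
```

A guess like this is exactly what an importing program has to do when the prefix is not labelled in the file, and it fails for surnames that merely start with one of these words in another culture.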

But all in all it is still the same as Herbert Richard (Dick) Cheney, with the only difference that it is very uncommon in English to have the nickname derived from the second name. And it is hard to register the full name and the nickname and display both in the correct order.

FIRSTNAMES: Cornelia Wilhelmina
MIDDLENAMES: Aleida Caterina
NICKNAME: Mien
LASTNAMEPREFIX: van
LASTNAMES: Amsterdam
TEXTDISPLAYFORMAT: %FIRSTNAMES (%NICKNAME) %MIDDLENAMES %LASTNAMEPREFIX %LASTNAME %LASTNAMESUFFIX
GRAPHICDISPLAYFORMAT: %NICKNAME %LASTNAMEPREFIX %LASTNAME %LASTNAMESUFFIX
TABLEVIEWFORMAT: %LASTNAME %LASTNAMESUFFIX, %FIRSTNAMES (%NICKNAME) %MIDDLENAMES %LASTNAMEPREFIX

would not be completely right, because we lack middle names, but at least in reporting we could have control over what is displayed, store this as default and even overwrite when someone marries someone from a different culture.

FIRSTNAMES: Yoko
MIDDLENAMES:
NICKNAME:
LASTNAMEPREFIX:
LASTNAMES: Ono
TEXTDISPLAYFORMAT: %LASTNAMES %FIRSTNAMES
GRAPHICDISPLAYFORMAT: %LASTNAMES %FIRSTNAMES
TABLEVIEWFORMAT: %LASTNAMES %FIRSTNAMES

This would fix the cultural fault of writing Yoko Ono instead of the correct Ono Yoko.
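
How such a display format could be expanded is sketched below in Python. The `render` function is a hypothetical helper; the sketch assumes `%NAME` placeholders made of capital letters, treats missing fields as empty, and uses the field name LASTNAMES throughout because the templates above mix %LASTNAME and %LASTNAMES.

```python
import re


def render(template: str, fields: dict) -> str:
    """Expand %FIELD placeholders and tidy up gaps left by empty fields."""
    # Replace each %FIELD with its value, or with nothing if it is absent.
    text = re.sub(r"%([A-Z]+)", lambda m: fields.get(m.group(1), ""), template)
    text = text.replace("()", "")             # drop an empty "(%NICKNAME)"
    text = re.sub(r"\s+", " ", text).strip()  # collapse leftover spaces
    text = re.sub(r"\s+,", ",", text)         # no space before a comma
    return text


mien = {
    "FIRSTNAMES": "Cornelia Wilhelmina",
    "MIDDLENAMES": "Aleida Caterina",
    "NICKNAME": "Mien",
    "LASTNAMEPREFIX": "van",
    "LASTNAMES": "Amsterdam",
}
yoko = {"FIRSTNAMES": "Yoko", "LASTNAMES": "Ono"}

print(render("%FIRSTNAMES (%NICKNAME) %MIDDLENAMES %LASTNAMEPREFIX %LASTNAMES", mien))
# Cornelia Wilhelmina (Mien) Aleida Caterina van Amsterdam
print(render("%LASTNAMES %FIRSTNAMES", yoko))
# Ono Yoko
```

Storing the template next to the name pieces keeps the cultural ordering decision with the data instead of with each reporting program.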

albertemmerich commented 1 week ago

Here we have a proposal for a new structure of NAME, using the substructures NAME_PIECES and a new substructure "TEXTDISPLAYFORMAT" to build the NAME payload from the NAME_PIECES. This might be a good way to cover very different naming conventions in different cultures while still making it possible to know which part is the surname, the given name, and so on.

fisharebest commented 1 week ago

I think there is a distinction between "knowing which part of the name is a surname/prefix/suffix/whatever" and "re-constructing a full name from its component parts".

What problem are we trying to solve? Individuals have long names (Richard Herbert Cheney) and shorter versions (Dick Cheney). In some contexts, we'd want the long one. In others, we'd want the shorter one.

In this case, would it be simpler to just store both long and short forms?

1 NAME Richard Herbert /Cheney/
2 TYPE birth,legal,etc.
1 NAME Dick /Cheney/
2 TYPE aka,short,etc.

jkr-wrk commented 1 week ago

True.

But in my culture it is normal to:

- split the last name in two parts to fix the name index: van Nassau -> Nassau, van
- not abbreviate all first names except the first (it's C.A.M. de Vries or Cor de Vries or C. de Vries, never Cor A.M. de Vries)
- have a common name derived from any of the official names, not just the first (Cornelis Antonius de Vries -> Ton de Vries)

In other cultures it is normal to:

- change the last name depending on the sex (Russia/Iceland)
- change the last name depending on the first name of the father (Iceland)
- write the last name first in every report (Asia/Hungary)

So many reports try to correct the name to fit the situation, depending on the culture of the writer of the report. But this is impossible if parts of the name are not labeled correctly.

So yes, there should be multiple types of names saved in GEDCOM; that way reports can show names without making assumptions. There are some conventions that try to fix this, like James Paul* /McCartney/ jr. or Richard (Dick) Herbert /Cheney/ or /Ono/ Yoko, where slashes and other markers indicate which part is the family name, which part is the common name, and which part is derived from another part. A Dutch solution could be to write Beatrix /Nassau, van/, indicating that the part behind the comma should be in front of the comma, except in the index. Maybe there is a convention to tell what part of the last name is derived from the sex?

And applications could/should make modular input forms to be able to switch between cultural differences. That way, if I send my GEDCOM from one application to another, the output will still show:
- Nassau, van (in the last name index)
- Dick Cheney (in the visual tree)
- Herbert Richard (Dick) Cheney jr. (in the personal description)
- Herbert Richard Cheney (on the official person card)
- Cheney Herbert Richard (in a Chinese situation)

All versions could be made with the appropriate cultural form and stored within the GEDCOM without guesswork from the export and import and/or report. Because all the guesswork breaks the culture.

Edit: typo

Norwegian-Sardines commented 1 week ago

In cultures where the surname is written first (Asia/Hungary), we enter the name like this:

1 NAME /surname/ given name(s)

In applications where you want to display the name “Jon van Nassau” and index the value Nassau, we enter it as follows:

1 NAME Jon /van Nassau/
2 SURN Nassau
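
A reader could picture an application handling these two cases roughly as follows. This is a minimal sketch, assuming the surname is whatever sits between the first pair of slashes in the NAME payload and that an explicit SURN value, when present, is preferred for indexing; the function names are hypothetical and not taken from any GEDCOM library.

```python
from typing import Optional, Tuple


def split_personal_name(payload: str) -> Tuple[str, str, str]:
    """Split a NAME payload into (before, surname, after) using the slashes."""
    before, slash, rest = payload.partition("/")
    if not slash:
        # No slashes at all: the payload carries no marked surname.
        return payload.strip(), "", ""
    surname, _, after = rest.partition("/")
    return before.strip(), surname.strip(), after.strip()


def index_key(payload: str, surn: Optional[str] = None) -> str:
    """Prefer an explicit SURN value; otherwise use the slash-delimited part."""
    return surn if surn else split_personal_name(payload)[1]


print(split_personal_name("/Ono/ Yoko"))       # ('', 'Ono', 'Yoko')
print(index_key("Jon /van Nassau/", "Nassau"))  # Nassau
print(index_key("Jon /van Nassau/"))            # van Nassau
```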

Norwegian-Sardines commented 1 week ago

In cultures that have surnames with gender-based endings, such as in Eastern Europe, we know from Wikipedia:

The -ski ending and similar adjectival endings (-cki, -dzki, -ny, -ty) are the only ones in Polish that have feminine forms, where women have the feminine version ending in -ska (-cka, -dzka, -na, -ta) instead. Historically, female versions of surnames were more complex, often formed by adding the suffix -owa for married women and -ówna or -wianka for unmarried women.

Therefore:

Male:
1 NAME Jan /Smolenski/
2 SURN Smolen

Female:
1 NAME Maria /Smolenska/
2 SURN Smolen
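
The grouping this usage is meant to enable can be sketched as below. It illustrates this commenter's convention only (a root form in SURN); whether a SURN value that differs from the NAME payload is appropriate is disputed later in the thread. The record layout and the function name `group_by_surn` are assumptions made for the example.

```python
from collections import defaultdict

# Simplified stand-ins for parsed records; a real parser would supply these.
records = [
    {"NAME": "Jan /Smolenski/", "SURN": "Smolen"},
    {"NAME": "Maria /Smolenska/", "SURN": "Smolen"},
]


def group_by_surn(people):
    """Group NAME payloads under their SURN value."""
    groups = defaultdict(list)
    for person in people:
        groups[person["SURN"]].append(person["NAME"])
    return dict(groups)


print(group_by_surn(records))
# {'Smolen': ['Jan /Smolenski/', 'Maria /Smolenska/']}
```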

albertemmerich commented 1 week ago
1 NAME Maria /Smolenska/
2 SURN Smolen

My application does not offer this, as the payload of NAME always is put together by the NAME_PIECES which the user entered. So I have

1 NAME Maria /Smolenska/
2 SURN Smolenska

jkr-wrk commented 1 week ago

I think a lot of applications look at the SURN tag and think it should contain the surname.

In these examples it looks like you specifically use the SURN tag for function, because the surname is "van Nassau" and I think Smolenska? If it is used for function it should contain "Nassau, van" because the name differs from "Nassau".

Some reports might think showing a short version of the name could be NICK SURN. But that would be wrong.

Norwegian-Sardines commented 1 week ago

I think a lot of applications look at the SURN tag and think it should contain the surname.

In these examples it looks like you specifically use the SURN tag for function, because the surname is "van Nassau" and I think Smolenska? If it is used for function it should contain "Nassau, van" because the name differs from "Nassau".

Some reports might think showing a short version of the name could be NICK SURN. But that would be wrong.

We have discussed (and I believe it has been added to the specification) that the SURN tag can be used to support indexing a name that is gender neutral, as the program I use already does. We had not discussed the case of Nassau vs "Nassau, van" being a different name. At present the comma would be viewed as separating names in the SURN tag, but if we were to adopt the ability to use multiple SURN tags (as also discussed) then the comma would be OK.

Note: crossed out. The current specification does allow the use of multiple SURN tags and does not use a comma delimited list any more, so this would be valid:

1 NAME Jon /van Nassau/
2 SURN Nassau, van

albertemmerich commented 1 week ago

I do not think "van" should be part of the SURN payload. I have (in GEDCOM 5.5.1 or GEDCOM 7.0)

1 NAME Jon /van Nassau/
2 GIVN Jon
2 SPFX van
2 SURN Nassau

and the application can do all things wanted by users.
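
As a rough illustration of what "the application can do all things" might look like, the sketch below derives both the display form and the index form from separate SPFX and SURN pieces. The helper names and the dictionary layout are hypothetical; the "surname, prefix" index form follows the Dutch convention described earlier in this thread, not anything in the GEDCOM spec.

```python
def display_surname(pieces: dict) -> str:
    """Join prefix and surname for display, e.g. 'van Nassau'."""
    return " ".join(p for p in (pieces.get("SPFX"), pieces.get("SURN")) if p)


def index_surname(pieces: dict) -> str:
    """Put the surname first for sorting, e.g. 'Nassau, van'."""
    surn, spfx = pieces.get("SURN", ""), pieces.get("SPFX", "")
    return f"{surn}, {spfx}" if spfx else surn


jon = {"GIVN": "Jon", "SPFX": "van", "SURN": "Nassau"}
print(display_surname(jon))  # van Nassau
print(index_surname(jon))    # Nassau, van
```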

Norwegian-Sardines commented 1 week ago
1 NAME Maria /Smolenska/
2 SURN Smolen

My application does not offer this, as the payload of NAME always is put together by the NAME_PIECES which the user entered. So I have

1 NAME Maria /Smolenska/
2 SURN Smolenska

Albert, you must have seen the discussion we had two years ago about the use of the SURN tag as a true indexing value rather than just a duplicate of the NAME tag! The GEDCOM specification already says:

The Personal Name payload shall be seen as the primary name representation, with name pieces as optional auxiliary information; in particular it is recommended that all name parts in PERSONAL_NAME_PIECES appear within the PersonalName payload in some form, possibly adjusted for gender specific suffixes or the like.

The last part of the quote pertains to this discussion. My application allows for indexing and grouping based on the SURN tag that would allow the root name to be entered rather than the gender specific name.
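
One naive way to test the recommendation quoted above is a substring check, sketched here. It is an assumption of this example that "appear within the payload in some form" can be approximated by a case-insensitive substring match after removing the slashes; the spec text quoted in this thread does not define the test, and the function name is hypothetical.

```python
def pieces_not_in_payload(payload, pieces):
    """Return the (tag, value) pieces whose value is not a substring of NAME."""
    flat = payload.replace("/", "").casefold()
    return [(tag, value) for tag, value in pieces if value.casefold() not in flat]


# A root form that is a prefix of the gendered surname passes this check.
print(pieces_not_in_payload("Maria /Smolenska/", [("SURN", "Smolen")]))
# []

# A reordered "surname, prefix" value does not.
print(pieces_not_in_payload("Jon /van Nassau/", [("SURN", "Nassau, van")]))
# [('SURN', 'Nassau, van')]
```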

Norwegian-Sardines commented 1 week ago

I do not think "van" should be part of the SURN payload. I have (in GEDCOM 5.5.1 or GEDCOM 7.0)

1 NAME Jon /van Nassau/
2 GIVN Jon
2 SPFX van
2 SURN Nassau

and the application can do all things wanted by users.

I agree 100% with your indication that van is not in this context (but in others) a Surname Prefix. I corrected my comment above and indicate the following would be ok in v7.0 of GEDCOM.

1 NAME Jon /van Nassau/
2 SURN Nassau, van

The index and grouping would be on "Nassau, van" which would be different than just "Nassau".

tychonievich commented 1 week ago

Discussed in steering committee

The spec currently defines name in terms of splitting the name and structuring its information. It allows "for the payload to contain information not present in any name piece substructure" but does not speak either way to the inverse (a name part that is not part of the name payload). We think that having a SURN that is not part of the name is not appropriate in 7.0, but the spec does not fully define "part of". Elsewhere in this issue's discussion we propose adding not-part-of use in 7.1 and/or 8.0.

g7:SURN is defined as "A family name passed on or used by members of a family." That doesn't appear to us to be aligned with SURN Nassau, van, but that decision would ultimately be up to the person entering the data. Some applications could try to anticipate uses of commas like this, but that is a culturally-specific and localization-specific usage and not covered by the GEDCOM spec itself.