FamilySearch / GEDCOM

Use of “*” to indicate preferred name in NAME tag #150

Open Norwegian-Sardines opened 2 years ago

Norwegian-Sardines commented 2 years ago

In many customs a person is given a series of given names, for example: Wilhelm Andreas Johanas …

Where Wilhelm is a name of some ancestor or family member of prominence given to this person to remember that ancestor by and Andreas is the given name that the individual will commonly be known by.

It is very possible for two boys (sometimes girls) in a family to have the same name of Wilhelm but with different additional names, for example:

Wilhelm Andreas Johanas Simonsen
Wilhelm Casparsen Johanas Simonsen

So the GEDCOM would look like this:

0 INDI
1 NAME Wilhelm Andreas* Johanas Simonsen
2 TYPE birth
2 GIVN Wilhelm Andreas Johanas

0 INDI
1 NAME Wilhelm Casparsen* Johanas Simonsen
2 TYPE birth
2 GIVN Wilhelm Casparsen Johanas

I have been active in genealogy and studying under genealogists and historians since the early 1980s, and learned from them that placing an asterisk after the preferred given name was the sanctioned way to denote a preferred name.

Will this be supported under GEDCOM v7.0?

While not exactly the same, this concept is also similar to the German Rufname!

ghost commented 2 years ago

This came up in the early discussions for GEDCOM v7. The official way of denoting Rufnamen was using underscore, and it was proposed not only that this could be denoted on the NAME with a lightweight markup (e.g. _Johanas_) but that other forms of markup could be used there, e.g. "alias". The existing /surname/ was already a form of markup in use there.

Interestingly, the period after an initial is effectively a form of markup as it distinguishes an initial from a single-character name token (as in J Harlen Bretz, or Malcolm X).

Tony Proctor

Norwegian-Sardines commented 2 years ago

Do we still have access to these early discussions regarding Rufname?

Will the convention of putting an underscore before and after a Rufname (for example: _Johanas_) be coded into The Standard, so that application programmers don't ignore the information and/or some other pseudo-standards body tries to make its own "Standard"?

While my understanding of the "Rufname" is limited it does not sound exactly like the use case I'm talking about!

I understand the point that the existence (or non-existence) of a period after a single letter has different meanings. These types of "understandings", just like the "/" surrounding the surname, need to be codified in some way in The Standard, or they will be lost to time!

ghost commented 2 years ago

I'm not sure how much of those discussions was kept, or whether they would ever be made public. The markup suggestion was eventually rejected in favor of a RUFNAME item type, but then that was dropped prior to release due to pushback from stakeholders (too big a change for them at the moment).

Tony Proctor

Norwegian-Sardines commented 2 years ago

too big a change for them at the moment

In other words, "kick the can down the road" and maybe no one will complain. Sounds like the same discussion I read on the listserv back in the 1980s!

dthaler commented 2 years ago

Will this be supported under GEDCOM v7.0?

I think the closest reliable thing you could do now (in 7.0 and in 5.5.1 for that matter) is have a second NAME structure that contains only the preferred name, such as:

0 INDI
1 NAME Andreas Simonsen
2 GIVN Andreas
1 NAME Wilhelm Andreas Johanas Simonsen
2 TYPE birth
2 GIVN Wilhelm Andreas Johanas

Punctuation cannot be reliably used since it could mean other things than a preference.
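As a rough illustration of how a reader might consume the two-NAME layout above, here is a minimal Python sketch. The function name `name_structures` is hypothetical, the line splitting is deliberately simplistic (no cross-references, no CONT handling), and treating the first NAME structure as the one to display is an assumption about the reading application rather than something stated in this thread.

```python
def name_structures(record_lines):
    """Collect each level-1 NAME payload together with its optional level-2 TYPE."""
    names = []
    in_name = False  # True while we are inside a level-1 NAME structure
    for line in record_lines:
        level, _, rest = line.strip().partition(" ")
        tag, _, payload = rest.partition(" ")
        if level in ("0", "1"):
            # Any new level-0 or level-1 line ends the previous NAME structure.
            in_name = level == "1" and tag == "NAME"
            if in_name:
                names.append({"name": payload, "type": None})
        elif level == "2" and tag == "TYPE" and in_name:
            names[-1]["type"] = payload
    return names


record = [
    "0 INDI",
    "1 NAME Andreas Simonsen",
    "2 GIVN Andreas",
    "1 NAME Wilhelm Andreas Johanas Simonsen",
    "2 TYPE birth",
    "2 GIVN Wilhelm Andreas Johanas",
]

names = name_structures(record)
# Assumption: the application displays the first NAME structure by default.
preferred = names[0]["name"]
birth_names = [n["name"] for n in names if n["type"] == "birth"]
```

With this layout no punctuation needs to be interpreted: the preferred form and the birth form are simply two separate entries.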

Norwegian-Sardines commented 2 years ago

Then I would suspect that this is the same solution for Rufname!

This would also be a reason to remove the NICK tag altogether, since it adds no value and a separate NAME structure could be used instead:

0 INDI
1 NAME John /Wayne/
1 NAME Marion Robert /Morrison/
2 TYPE birth
1 NAME Duke /Wayne/
2 TYPE aka

dthaler commented 2 years ago

Then I would suspect that this is the same solution for Rufname!

This would also be a reason to remove the NICK tag altogether, since it adds no value and a separate NAME structure could be used instead:

0 INDI
1 NAME John /Wayne/
1 NAME Marion Robert /Morrison/
2 TYPE birth
1 NAME Duke /Wayne/
2 TYPE aka

In issue #134, I asked a very similar question and the answer was that NICK and TYPE AKA have different meanings.

Norwegian-Sardines commented 2 years ago

Or then:

0 INDI
1 NAME John /Wayne/
2 TYPE preferred
1 NAME Marion Robert /Morrison/
2 TYPE birth
1 NAME Duke /Wayne/
2 TYPE nickname

But not:

1 NAME Marion "Duke" Robert /Wayne/

The definition of the NAME tag (like all fields in any database) should be kept simple and be either:

  1. The Birth Name (with or without a TYPE)
  2. An alternate name defined by the TYPE tag

I realize these are not v7.0 constructs. "John Wayne" could also be a "2 TYPE stage name".

the answer was that NICK and TYPE AKA have different meanings.

I would love to understand how a nickname is different from an "Also Known As" name! The Urban Dictionary says:

[Also Known As] This is a phrase used to introduce aliases, nicknames, working names, legalised names, author’s pen names and so on. Identical in meaning to the old English word Yclept it is often abbreviated to AKA.

tychonievich commented 2 years ago

Tony is correct that we had a representation for rufnames (https://en.wikipedia.org/wiki/German_name#Forenames) using markup during pre-7.0 planning. The markup was then replaced by a RUFNAME name part. The entire NAME structure was then replaced with a more flexible GEDCOM-X-like sequence (https://github.com/FamilySearch/gedcomx/blob/master/specifications/conceptual-model-specification.md#name-form) of typed parts (https://github.com/FamilySearch/gedcomx/blob/master/specifications/name-part-qualifiers-specification.md). We circulated this design to the companies listed in the contributors section of the 7.0 specification and got strong pushback: evidently many of them had 5.5.1's NAME design fairly deep in their codebase, and our proposed change caused some to say they'd not support 7.0 and some others to say it would delay support significantly. We thus rolled 7.0's names back to 5.5.1 before releasing 7.0.

In 7.0, the only special characters in name payloads are slashes. Quotes, asterisks, underscores, and the like can be entered by users, of course, as can any other character they wish to type; but these are likely to be treated like letters (including changing sorting and matching results) by some applications, so while technically permitted by the 7.0 spec, that kind of ad-hoc markup is not encouraged.
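To make the point about ad-hoc markup concrete, here is a small Python sketch. `split_name_payload` is a hypothetical helper, not part of any GEDCOM library; it only honours the slash delimiters described above and assumes at most one slash-delimited surname in the payload.

```python
def split_name_payload(payload):
    """Split a NAME payload into (before, surname, after) using only the slashes."""
    before, slash, rest = payload.partition("/")
    if not slash:
        # No slashes at all: the whole payload is undifferentiated name text.
        return payload.strip(), "", ""
    surname, _, after = rest.partition("/")
    return before.strip(), surname.strip(), after.strip()


parts = split_name_payload("Wilhelm Andreas* Johanas /Simonsen/")
given_tokens = parts[0].split()

# The asterisk is carried along as an ordinary character, so an exact
# match on the given name "Andreas" fails for this record.
matches_andreas = "Andreas" in given_tokens
```

A reader that only knows about slashes ends up with the token `Andreas*`, which is the kind of changed matching result the paragraph above warns about.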

Having multiple NAME structures as @dthaler suggested is certainly an option now. We also know of some 7.0 applications that are using the GEDCOM-L group's 5.5.1 RUFNAME extension (https://genealogy.net/GEDCOM/GEDCOM551%20GEDCOM-L%20Addendum-R2.pdf) to indicate rufname as an extension-tagged name part.

RUFNAME, PATRONYMIC, MATRONYMIC, and other new name parts might be added in a future minor release. At present, the steering committee does not expect to release 7.1 until 2023 at the earliest, and we have not yet identified the specific additions it will contain.

I currently expect 8.0 will go back to the list-of-parts name structure we originally proposed for 7.0. Because part of the value of standards is the interoperability created by their stability, 8.0 is not likely to be created in the next few years, which reduces my confidence in what it will contain.

tychonievich commented 2 years ago

In re AKA vs NICK, the semantics of these were unclear in 5.5.1, which means they are used in multiple ways in extant GEDCOM files, and trying to define them more clearly will also incorrectly reinterpret some existing files. That said, we generally see

ghost commented 2 years ago

I was never entirely happy with the RUFNAME name-part, Luther, but this may have been symptomatic of something bigger. A rufname is still a given-name and so you can argue that you should be qualifying it rather than identifying it. That is, it's still a given-name but used in a special way, and that would require two bits of meta-data per name-part: the fundamental type (GIVN) and some supplementary properties (such as RUFNAME). I was wondering if that might be applicable to surnames as well.

[Apologies to Albert for shaking the tree here -- I know they already do it a different way in Germany]

Tony

albertemmerich commented 2 years ago

Tony, I have no problem with any solution for RUFNAME that is better than the one we decided on in GEDCOM-L. Our goal was to stay within the 5.5.1 standard, and to "mark" one of the given names as RUFNAME (as has been done in official German documents by underlining the RUFNAME within the given names).

Example: Given names Albert Wilhelm. RUFNAME Albert - this is underlined in the document.

Solution 1:

2 GIVN _Albert_ Wilhelm

Problem in 5.5.1 (and 7.0): _ is a character which may be put in by the user in any other context, not marking RUFNAME. If we want to have a markup by _ (or * for preferred name parts...), the characters _ (and *) must be reserved for this markup to uniquely identify it as markup, and not part of the NAME itself.

Solution 2: (decided by GEDCOM-L in 5.5.1, and again used in 7.0):

2 GIVN Albert Wilhelm
3 _RUFNAME Albert

The user-defined tag _RUFNAME marks the given name that is underlined in documents (reports of the genealogical data can reproduce the underlining using the _RUFNAME information). Our problem is: the application has to ensure that the payload of _RUFNAME is one of the GIVN names. It is NOT allowed to export:

2 GIVN Albert Wilhelm
3 _RUFNAME Bert
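The constraint described here can be checked mechanically before export. The following Python sketch is only an illustration: `rufname_is_valid` is a hypothetical name, and it assumes given names are separated by whitespace, so a hyphenated name such as Hans-Peter counts as a single token.

```python
def rufname_is_valid(givn, rufname):
    """Return True if the _RUFNAME payload is exactly one of the GIVN tokens."""
    return rufname in givn.split()


ok = rufname_is_valid("Albert Wilhelm", "Albert")
not_ok = rufname_is_valid("Albert Wilhelm", "Bert")
```

An exporter could refuse to write the _RUFNAME line (or warn the user) whenever this check fails.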

So if any future standard version will reserve any mark up characters for this purpose, I would be happy to have

2 GIVN _Albert_ Wilhelm

Some applications allow this as "shortcut" user input and convert it for export to the _RUFNAME structure...
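A conversion of that kind might look like the Python sketch below. It is not taken from any particular application: `expand_rufname_shortcut` is a hypothetical helper, and it assumes the shortcut wraps the Rufname in a leading and a trailing underscore, with given names separated by whitespace.

```python
def expand_rufname_shortcut(givn_input):
    """Turn user input with an underscore-wrapped Rufname into (GIVN payload, Rufname).

    Returns the given names with the underscore markup removed, plus the marked
    name, or None if nothing was marked. If several tokens are marked, all are
    unwrapped and the first one is reported as the Rufname.
    """
    rufname = None
    tokens = []
    for token in givn_input.split():
        if len(token) > 2 and token.startswith("_") and token.endswith("_"):
            token = token[1:-1]
            if rufname is None:
                rufname = token
        tokens.append(token)
    return " ".join(tokens), rufname


givn, rufname = expand_rufname_shortcut("_Albert_ Wilhelm")
plain = expand_rufname_shortcut("Albert Wilhelm")
```

The two returned values map directly onto the GIVN payload and the _RUFNAME payload, and by construction the Rufname is always one of the given names.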

Albert

albertemmerich commented 2 years ago

In 7.x we could have an alternate solution (as we allowed NAME_PIECES as {0:M} where 5.5.1 had {0:1}):

2 GIVN Albert
3 TYPE RUFNAME
2 GIVN Wilhelm

At the time we postponed the RUFNAME problem to a later GEDCOM version, we had the {0:1} definition in mind!

I think it would be a better solution to have TYPE on level 3 for this: We do not need a second NAME structure with a TYPE on level 2 ...

ghost commented 2 years ago

That's the sort of solution I was hoping for. Thanks Albert

Tony

clarkegj commented 2 years ago

Norwegian-Sardines, we appreciate your comments and contributions to Github.com/Familysearch/GEDCOM. Please check out the Project Teams for future versions of FamilySearch GEDCOM that we are organizing (see https://gedcom.io/community). I recommend that you apply to participate in the Sources Team by sending an email to GEDCOM@familysearch.org with "Sources" in the subject.

Norwegian-Sardines commented 2 years ago

In re AKA vs NICK, the semantics of these were unclear in 5.5.1, which means they are used in multiple ways in extant GEDCOM files, and trying to define them more clearly will also incorrectly reinterpret some existing files. That said, we generally see

Thank you for this explanation. However, I’m not convinced that they (NICK vs AKA) are really any different. Rather, I think this is a holdover from the PAF days and a very early GEDCOM, when multiple NAME tags were not supported and they needed a place (and a definition) for this common data point to be stored.

As a former database designer I can relate to the older concept of one name per individual. But in a normalized database, where the NAME field usually represents one concept of a name, including a “nickname” as part of the NAME breaks this design.

My point is, where does the definition of a nickname start and stop? You indicate “Peggy” for Margaret, but are nicknames such as “Junior”, “Red”, “Big John”, “Shorty”, etc. still part of the main NAME entry? What happens when a person has multiple “nicknames” in their life? What about indexing or name-use statistics? What about data entry persons (family historians) who are not familiar with a specific cultural tradition where a name feels like an “also known as” but is really a nickname?

I understand that the usage above wants to divide nickname and AKA by saying NICK is for “familiar” names while AKA is for aliases and names that depart drastically from the birth name, and I understand that some want to make this distinction. But from a database design standpoint these should be treated as “other name entries”. Margaret Hansen would never call herself Margaret “Peg” Hansen, but would call herself either Margaret Hansen OR Peg Hansen.

From my point of view a nickname is just a fancy way of saying “also known as”. The NICK tag and its participation as a subtag of NAME should be eliminated, and a separate NAME entry should be generated. This has been my standard since the early 1980s.

dthaler commented 2 years ago

Discussion 14 JUN 2022: @dthaler to create a pull request to add an entry to the Technical FAQ with two examples, including:

2 GIVN Albert Wilhelm
2 _RUFNAME Albert

and

0 INDI
1 NAME Andreas Simonsen
2 GIVN Andreas
1 NAME Wilhelm Andreas Johanas Simonsen
2 TYPE birth
2 GIVN Wilhelm Andreas Johanas

but ideally use the same name in both examples.

albertemmerich commented 2 years ago

As discussed 14 JUN 2022, the Gedcom-L group is using _RUFNAME on level 2, not on level 3 as shown in my comment last week. So the correct version is (as shown by Dave in his last comment):

Solution 2: (decided by GEDCOM-L in 5.5.1, and again used in 7.0):

2 GIVN Albert Wilhelm
2 _RUFNAME Albert

see our ADDENDUM on page 24 with another example.

Gedcom-L discussed the topic how to handle the German "Rufname" with 7.0 and decided to use solution 2 with 7.0, too. We will not use the possibility of cardinality {0:M} for the tag GIVN to implement another solution for Rufname, and will wait for an official solution in a GEDCOM 7.x or 8.x version of the standard. By this we want to avoid modifying the RUFNAME model twice within a short period. We are open for a solution in 7.x as proposed

2 GIVN Albert
3 TYPE RUFNAME
2 GIVN Wilhelm

but will wait for the NAME team and the decisions for FamilySearch GEDCOM.

dthaler commented 2 years ago

This is now addressed in https://gedcom.io/techfaqs/ which is the best we can do with 5.5.1 and 7.0.

In the future we can consider an update for, say, 7.1 to provide a better way to denote this.