FamilySearch / GEDCOM

Apache License 2.0

Some background for Names, and future considerations. #190

Open Norwegian-Sardines opened 1 year ago

Norwegian-Sardines commented 1 year ago

What is the time frame for v7.1 recommendations regarding new name parts?

Obviously, a person’s identity begins with their name and how they introduce themselves. I would include identity items such as:

Rufname, Patronymic, Farm/Location, Clan, House, and Tribe, to name a few.

- Rufname has been discussed here by the German delegation.
- Patronymic is common in many cultures and is still used in Iceland and probably elsewhere.
- Farm/Location is also historically common.
- Clan, House, and Tribe are similar. They are used in European and Asian cultures (probably in all cultures) and represent individuals who participate in a grouping that is part of their name and identity.

An example of both Patronymic and Location in one famous name is Leonardo di ser Piero da Vinci (aka Leonardo da Vinci). Leonardo is his given name, di ser Piero is a patronymic reference to his father, Ser Piero, and da Vinci is an indicator of his birthplace, the city of Vinci (not a surname).

Korea has approximately 288 family names, and almost 50% of individuals have one of 3 family names. To distinguish unrelated individuals with the same family name, Koreans use a clan village name (called Jipseongchon) that identifies the origin of the family name by village. This has some parallels in ancient times, when a census was based on town of birth and people had to return to their place of birth to be counted.

In my specific Norwegian examples, individuals are referred to by their given name, patronymic name, and their farm location in the community. Their farm location changes as they move from farm to farm. Inheritable surnames were not required by law until 1924, and many times members of the same family (parents, grandparents, children) could all take different surnames, e.g. two brothers could take different surnames.

I am sure others with specific cultural backgrounds can expand on these concepts!

I bring this up as food for thought for any future revisions of naming.

Norwegian-Sardines commented 1 year ago

I’d also like to see the name parts be required in transmissions, so that receiving programs have a chance to understand how each value in the NAME is actually used, rather than assuming a three-position name split around a surname delimited by ‘/’.

For example, not everything before the “surname” part is a given name, and not everything after it is a suffix. Asian names put the family name first and the given name second. Patronymic names don’t have surnames, and the given name may not be just the first word in the string.
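To make the limitation concrete, here is a minimal Python sketch of the naive three-position split that a receiving program might apply to a NAME payload. The function name `split_name_payload` is hypothetical and the regular expression is only an assumption about how such a parser could look, not how any particular application works.

```python
import re

def split_name_payload(payload: str) -> tuple:
    """Split a NAME payload into (before, surname, after) around the /.../ markers.

    Hypothetical helper: it only knows about the slash delimiters and
    nothing about what each word actually means.
    """
    match = re.fullmatch(r"([^/]*)(?:/([^/]*)/)?([^/]*)", payload)
    if match is None:
        raise ValueError(f"unexpected NAME payload: {payload!r}")
    before, surname, after = match.groups()
    return before.strip(), (surname or "").strip(), after.strip()

# A payload with a delimited surname splits cleanly.
print(split_name_payload("Yarl Brett /Olsen/"))

# A patronymic plus farm name has no slashes, so everything lands in
# the first position and the parser cannot tell the parts apart.
print(split_name_payload("Olaf Yarlsen Bruflott"))
```

With only the payload, the second name comes back as one undifferentiated string, which is the information loss described above.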

fisharebest commented 1 year ago

As a software developer, I don't really care too much about _RUFNAME, patronyms, and other specialised name subtags. I care about:

  1. How to display names. For example, use small capitals for the surname, underline the preferred given name, put quotation marks around a nickname, etc.
  2. How to sort names. e.g. does 'John Smith jr.' sort before or after 'John Smith sr.'
  3. How to group individuals by name. e.g. 'Kowalski' and 'Kowalska' are inflections of the same name.
  4. How to search for names. e.g. a search for McDonald should find Macdonald.

For example, if I have the name Saleh ibn Tariq ibn Khalid al-Fulan, I don't really care that it means Saleh, son of Tariq, son of Khalid, of the family Fulan. I just need to know that it sorts as "Fulan, Saleh, Tariq, Khalid" and groups with other "Fulan"s, etc.
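As a rough illustration of that point, the Python sketch below sorts and groups names using explicit strings carried alongside each name, without interpreting the name parts at all. The `sort_as` and `group_as` fields are hypothetical (nothing like them is defined in the current specification), and the given names attached to Kowalski and Kowalska are invented for the example.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass(frozen=True)
class NameRecord:
    display: str   # the name as written
    sort_as: str   # hypothetical explicit sort string
    group_as: str  # hypothetical explicit grouping key

records = [
    NameRecord("Jan Kowalski", "Kowalski, Jan", "Kowalski"),
    NameRecord("Saleh ibn Tariq ibn Khalid al-Fulan", "Fulan, Saleh, Tariq, Khalid", "Fulan"),
    NameRecord("Anna Kowalska", "Kowalska, Anna", "Kowalski"),
]

def sort_names(items):
    """Order records by the supplied sort string, ignoring case."""
    return sorted(items, key=lambda r: r.sort_as.casefold())

def group_names(items):
    """Collect display names under the supplied grouping key."""
    groups = defaultdict(list)
    for record in items:
        groups[record.group_as].append(record.display)
    return dict(groups)

for record in sort_names(records):
    print(record.display)
print(group_names(records))
```

The software never needs to know whether "Fulan" is a family name, a clan, or a place; it only needs the two extra strings.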

Norwegian-Sardines commented 1 year ago

So what tells you how to sort and group this brother and sister: Olaf Yarlsen Bruflott and Maria Yarlsdotter Naustdal

or that Olaf Yarlsen Bruflott and Borghild Casparsen /Feyer/

are married with a son, Yarl Brett /Olsen/

and that Olena Yarlsdotter Naustdal

is not related in any way to any person above, if all you have is a NAME payload and no surname designation?

How do you pick out a given name for display without a given-name designation, or tell that Olaf and Maria are of the same family without an indication of the father's name?

I would think that without separate payloads indicating their use, parsing the NAME payload would require an indicator directing the parser to use a particular structure, or that some hidden indicator in the NAME payload pointed out what value in the payload was a “preferred” name or what part referred to a father or a location!

fisharebest commented 1 year ago

So what tells you how to sort and group this brother and sister:

Nothing. That's what I'm saying. I need subtags that provide this information. I don't care whether these are surnames, patronyms, farm-names, etc. I just need information that lets me display/sort/group/search them correctly.

tychonievich commented 1 year ago

The use cases identified by @fisharebest and the additional parts requested by @Norwegian-Sardines appear to be largely independent.

  1. Display -- I think this is covered by the current name payload.

  2. Sort -- partially covered by the / in the name payload, but not in all cases; for example, it will not help sort by root name instead of name form.

    Proposal: add INDI.NAME.SORT_AS in 7.1, or a parallel SNAME structure (like SDATE), though the fact that an individual may have many NAMEs but events may only have one DATE makes the parallel to SDATE less than perfect.

  3. Group -- I'm not familiar with this use case. Is it enough to identify which name part types are "grouping" (e.g. SURN, CLAN, HOUSE, TRIBE) and which are not (e.g. GIVN, PATRONYMIC, PLACE)?

  4. Search folding -- I think this might be subsumed by SORT_AS?

  5. Many more name parts

    GEDCOM-X has a solution to this that we might be able to copy. They have just four or five name types (prefix, suffix, given, surname; implicitly also a "none of the above" as these four are all optional) but a large and extensible set of name qualifiers.

    Could we do something like that ourselves? Instead of adding name part tags, add just one more (OTHER or PART or something like that) and give all name parts a {0:M} enumeration value substructure with all of GEDCOM-X's name part qualifiers as permitted values?

  6. "Invisible" names -- some name parts are not included in the written or spoken form of the name, and which ones varies by person not by name part type. See #169 for example place names that are (da Vinci) and are not (Bruflott) written in the name.

    Proposal: add INDI.NAME.(part).HIDDEN Y substructure, or a HIDDEN qualifier if we decide to add qualifiers per the previous item.
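The qualifier and HIDDEN ideas in items 3, 5, and 6 can be sketched as a small Python data model. This is only an assumption about how such parts might be represented in memory: the `NamePart` class, the generic "PART" kind, and the qualifier strings are illustrative placeholders, not the actual GEDCOM-X qualifier list or any agreed GEDCOM tags.

```python
from dataclasses import dataclass, field

@dataclass
class NamePart:
    value: str
    kind: str                                       # e.g. "GIVN", "SURN", or a generic "PART"
    qualifiers: list = field(default_factory=list)  # zero or more qualifier values
    hidden: bool = False                            # True if not written or spoken in the name

# Part types treated as "grouping" in item 3 above.
GROUPING_KINDS = {"SURN", "CLAN", "HOUSE", "TRIBE"}

def display_name(parts):
    """Join only the parts that appear in the written form of the name."""
    return " ".join(p.value for p in parts if not p.hidden)

def grouping_values(parts):
    """Return the values of parts whose type is a grouping type."""
    return [p.value for p in parts if p.kind in GROUPING_KINDS]

leonardo = [
    NamePart("Leonardo", "GIVN"),
    NamePart("di ser Piero", "PART", ["PATRONYMIC"]),
    NamePart("da Vinci", "PART", ["PLACE"]),              # place name that is written
]
olaf = [
    NamePart("Olaf", "GIVN"),
    NamePart("Yarlsen", "PART", ["PATRONYMIC"]),
    NamePart("Bruflott", "PART", ["PLACE"], hidden=True),  # place name that is not written
]

print(display_name(leonardo))
print(display_name(olaf))
print(grouping_values(olaf))
```

Keeping the hidden flag on the individual part, not on the part type, matches the observation that visibility varies by person.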

Norwegian-Sardines commented 1 year ago
  1. Display - The NAME tag is the VALUE of the display but not HOW the name is to be displayed. This is the classic "Model" vs "View" discussion. The model or data can be found in the NAME tag but the view or display needs some additional information to indicate the "how". So I don't think that the NAME tag is sufficient.

For example: Johan Casper /Freyer/ and Johan Andreas /Freyer/ are brothers. In the Freyer family every generation has one son named Johan because it is an important name. If a son named Johan dies very young, a newborn son is named Johan to carry on the tradition, with a different second name that is used as a "call name". Nowhere in the NAME tag is a preferred name (or Rufname) indicated, so some other "required" indicator must be transmitted.

Before computers and in early GEDCOM (pre v5.4), many of us entered additional characters after a name part to indicate that the part was a preferred name, rufname, nickname, or maiden name; these indicators still show up in genealogies today. When I was taught genealogy back in the early 1980s, we used an asterisk after a preferred name, underlined a rufname, quoted a nickname, and wrote "nee" (female) or "ne" (male) before a birth name. When newer GEDCOM specifications allowed multiple NAME tags and some subtags, these indicators became less popular.

  1. Sorting and Grouping are both functions of the allowed NAME:. These subtags should be required in all transmissions. Some of these sorting and grouping values are based on SURN, HOUSE, TRIBE, CLAN (which should be included in the NAME:) and could also include GIVN and others, for applications that would like to provide statistical information about the usage of a data part. I for one would love to have a count of everyone who was named after Johan (male, or female 'Johanna'), a "Namesake".

  2. GEDCOM-X

    Could we do something like that ourselves? Instead of adding name part tags, add just one more (OTHER or PART or something like that) and give all name parts a {0:M} enumeration value substructure with all of GEDCOM-X's name part qualifiers as permitted values?

Yes, that is a possibility, except I would make these parts required rather than optional. Transmitting just a NAME, or not expecting to have subtags on import, provides no value to GEDCOM as a way to transfer data between applications. If an application drops important display and sorting/grouping data and other name parts, or fails to transmit them, data will be lost. Many applications that do not support the current v5.5.1 specification already drop or mishandle too many values to make GEDCOM a viable transmitter of data between applications.

fisharebest commented 1 year ago
  1. Display -- I think this is covered by the current name payload.

We cannot underline the preferred name without additional info. I use an asterisk suffix for this - John Peter* /Smith/.

As I said earlier, I don't care if it is a rufname, family tradition, personal preference or otherwise. If we are going to use subtags, then this suggests that instead of 2 _RUFNAME, it might be better to have 2 PREFERED_GIVN xx/3 TYPE RUFNAME.
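For illustration, the asterisk-suffix convention could be read back out of a payload roughly as follows. This Python sketch assumes the asterisk only ever marks a word before the surname delimiter; the function names are hypothetical and the convention itself is not part of the GEDCOM specification.

```python
from typing import Optional

def preferred_given_name(payload: str) -> Optional[str]:
    """Return the given name marked with a trailing asterisk, if any."""
    given_part = payload.split("/", 1)[0]   # text before the surname delimiter
    for token in given_part.split():
        if token.endswith("*") and len(token) > 1:
            return token[:-1]
    return None

def strip_markers(payload: str) -> str:
    """Remove trailing asterisk markers so the name can be displayed plainly."""
    return " ".join(token.rstrip("*") for token in payload.split())

print(preferred_given_name("John Peter* /Smith/"))
print(strip_markers("John Peter* /Smith/"))
print(preferred_given_name("Johan Casper /Freyer/"))
```

A dedicated subtag would carry the same information without overloading the payload text.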

clarkegj commented 1 year ago

Norwegian-Sardines, fisharebest,

I would like to schedule a Zoom meeting next week to discuss various naming issues with a member of our team who is in Europe. What times would work for you?

Monday: 6:00am, 6:30am, 7:00am, 7:30am

Tuesday: 6:00am, 6:30am, 7:00am, 7:30am

Please let me know what times will work ASAP.

Thanks

Gordon Clarke, GEDCOM Developer Manager, FamilySearch