Open tychonievich opened 3 months ago
Thanks Luther for creating this issue. I think I am the one that may have started the conversation to move to the use of <fact>.TYPE to begin managing the proliferation of new event types. With the addition of the KIND tag to enumerate various “like” events as an alternative to using TYPE, I personally think we have a winner!
I’ll let others weigh in with their comments, but I’m on board to discuss this option above others going forward!
When I brought up adding more event and attribute types in #117 the primary reason for doing so was related to information context. Having a far richer set of specific enumerated types helps preserve context when data is shared between people from different countries.
Think about it in terms of tagging data for machine learning: you want and need the tags to be applied consistently across languages and cultures. And the finer and more detailed the tagging, the more context is preserved and the more value can be extracted.
Should that be accomplished with a flat namespace or a hierarchical namespace? Both have benefits and drawbacks. After some consideration I think the latter, if well thought out, will have more long-term benefits. However, as almost every genealogical site and application today uses a flat namespace, I think changing that should be an 8.0 item. Ideally I would like to see shared events and groups in 8.0 as well, but I know that is wishful thinking.
In the end the primary responsibility of Gedcom is to serve as a data transmission envelope. It will always be a lossy envelope, but each iteration should strive to further improve fidelity.
This is not really a new issue; it is an effort to collect topics from several other issues and discussions and add context for those who haven't followed those conversations. Those other issues are scattered and I'm not confident that I found them all; if I've missed something, please add it here!
If you want to propose specific new events or attributes, see the event and attribute proposal tracker. If you want to see a large list of possible new attributes and events, see #117. There are also various issues discussing specific new events or substructures thereof. This issue is instead about reorganizing the entire event/attribute system.
The challenge/situation
GEDCOM currently has 47 event/attribute types (32 event types, 15 attribute types), not counting the generic EVEN and FACT structures. More than 200 additional types have been proposed in various issues here (notably #117).
Long lists of options make user interfaces challenging to create and user decisions hard to guide. For some applications they may also make code lengthy with increased chance of accidentally omitting or cross-coding some component.
The current set has some quite broad types, like DESC, which subsumes multiple extension types some applications support (such as _COLO, _EYEC, _HAIR, _HEIG, _WEIG); and some surprisingly narrow ones, like BARM and BASM being distinct. This inconsistency in specificity helps fuel discontent, with those who like specificity wondering why the level of specificity they find in one place is not present in another, and those who like generic flexibility wondering the opposite.
Any type with a high degree of specificity makes translation and multicultural communication challenging.
The current set is not uniformly understood. Some structures have definitions that do not map well to non-English languages or non-Christian-European cultures. Some users apply the closest available structure to each situation not formally covered in the specification, such as using MARB for any marriage-announcement-like event even if it is informal and not a bann, while other users do not do this.
These topics are not restricted to events and attributes; calendars and name parts have both had similar discussions, but with fewer types in the 7.0 specification.
Five Proposed Solutions
Add all the events and attributes that come to mind.
This approach was rejected by the GEDCOM steering committee in April 2023, at which time the "valuable, absent, and used" criteria were introduced for discussing new event proposals. But that doesn't mean it couldn't be revisited.
Add all the events and attributes that multiple applications support.
This approach is implicitly the intent of the event and attribute proposal tracker and the "awaiting use" label in the issue tracker.
Use only a small number of types; any additional clarification goes in a free-text field like TYPE or NOTE.
While this option has been mentioned in passing, I'm not aware of any serious proposal along these lines.
Create a type hierarchy. For example a Marriage Bann ⇒ Marriage Announcement ⇒ Pre-Marriage Event ⇒ Marriage Event ⇒ Family Event ⇒ Event, where "⇒" means "is a subtype of" or "implies" or "is subsumed by".
This approach is used in some peer specifications, notably schema.org, but has not received much discussion here.
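A minimal sketch of how an application might represent such a hierarchy, assuming each type has exactly one parent. The parent links below are copied from the example chain above; nothing here is part of any GEDCOM release, and the function names are hypothetical.

```python
# Parent links copied from the example chain above (illustrative only,
# not part of any GEDCOM release). Each type has at most one parent.
PARENT = {
    "Marriage Bann": "Marriage Announcement",
    "Marriage Announcement": "Pre-Marriage Event",
    "Pre-Marriage Event": "Marriage Event",
    "Marriage Event": "Family Event",
    "Family Event": "Event",
}


def ancestors(event_type):
    """Return the chain from event_type up to the root, most specific first."""
    chain = [event_type]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


def is_subtype_of(event_type, candidate):
    """True if candidate is event_type itself or any type it is subsumed by."""
    return candidate in ancestors(event_type)
```

With this shape, an application that only understands "Family Event" could still accept a "Marriage Bann" by walking up the chain until it reaches a type it knows.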
Create a smaller set of broad types, with optional enumerated-value subtypes in a KIND substructure.
This has two parts:
defining the smaller set of broader types. Several have been proposed: #290, #303, #315
adding the KIND substructure. A concrete proposal can be found in #322.
An additional open question is whether the enumerations would be singular or plural. We could do any of the following:
This proposal has received the most discussion, but also has the most open questions.
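To make the broad-type-plus-KIND idea concrete, here is a rough sketch of how an application might validate and write such a structure. The broad tag `_PREM` and the KIND values are invented for illustration and are not taken from #322 or any of the proposals above; the output is simplified GEDCOM-style lines with no other substructures.

```python
# Hypothetical enumeration: which KIND values each broad event tag accepts.
# "_PREM" (a made-up "pre-marriage event") and its values are placeholders,
# not taken from #322 or any other proposal.
ALLOWED_KINDS = {
    "_PREM": {"BANN", "ENGAGEMENT"},
}


def event_lines(tag, kind=None, level=1):
    """Render an event as GEDCOM-style lines, with an optional KIND substructure."""
    lines = [f"{level} {tag}"]
    if kind is not None:
        # KIND is an enumerated value, so reject anything outside the known set.
        if kind not in ALLOWED_KINDS.get(tag, set()):
            raise ValueError(f"{kind!r} is not an enumerated KIND for {tag}")
        lines.append(f"{level + 1} KIND {kind}")
    return lines
```

Because KIND is optional in this sketch, a user interface could offer only the short list of broad types first and ask for a KIND only when the user wants more precision.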
Solution implementation options
Assuming we converge on a solution that we like, we could do any of the following:
In 7.1 we could add KIND to the existing event types.
In 8.0 we could replace and refactor as much as we wish.
No matter what we do, it is likely that applications will wish to support exporting new data in 7.0 and earlier formats. For example, if we deprecate or remove the MARB in favor of some broader structure with a KIND, we should be clear what KIND values imply this can be exported as a MARB.
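One way an exporter might handle this is a lookup table from (broad tag, KIND) pairs to 7.0 tags, with a generic fallback. The sketch below assumes a made-up broad tag `_PREM` and made-up KIND values; only MARB, EVEN, and TYPE are existing 7.0 tags. Dates, places, and other substructures are left out for brevity.

```python
# Hypothetical table: (broad tag, KIND value) -> tag to emit in a 7.0 export.
# "_PREM" and the KIND values are invented for illustration only.
LEGACY_TAG = {
    ("_PREM", "BANN"): "MARB",
}


def export_7_0(tag, kind, level=1):
    """Down-convert a broad event with a KIND into lines for a 7.0 export."""
    legacy = LEGACY_TAG.get((tag, kind))
    if legacy is not None:
        # An exact 7.0 equivalent exists, so emit it directly.
        return [f"{level} {legacy}"]
    # No exact equivalent: fall back to the generic EVEN structure and
    # carry the KIND value as free text in TYPE.
    return [f"{level} EVEN", f"{level + 1} TYPE {kind}"]
```

Keeping the table explicit means the specification, rather than each application, would decide which KIND values round-trip to MARB and which degrade to a generic event.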