FamilySearch / GEDCOM

Apache License 2.0
Allow FAMC.PEDI to indicate which spouse it refers to #338

Open tychonievich opened 1 year ago

tychonievich commented 1 year ago

The INDI.FAMC structure allows several substructures to clarify the relationship between parents and child: PEDI to note its kind, STAT to note its confidence, and NOTE/SNOTE for notes. However, it does not provide a standard way to indicate which parent is being discussed.

For example, we can indicate both birth and foster families as such

0 @I1@ INDI
1 FAMC @F1@
2 PEDI BIRTH
1 FAMC @F2@
2 PEDI FOSTER

but we cannot indicate with this structure that one partner of the FAM is the birth parent and the other is not.

For adoption we can use the ADOP.FAMC.ADOP:

0 @I2@ INDI
1 ADOP Y
2 FAMC @F3@
3 ADOP HUSB

but we don't have a similar tool for other cases.

 

I propose we add an ADOP.FAMC.ADOP-like structure with its same four options to go under PEDI, FAMC-STAT, BIRT.FAMC, and CHR.FAMC. I tentatively propose the tag PARENT, though I'm not crazy about that tag; alternatives welcome. Combined with the change in cardinality of PEDI from #256/#274, this would let us do things like

0 @I3@ INDI
1 FAMC @F4@
2 PEDI BIRTH
3 PARENT WIFE
1 FAMC @F5@
2 PEDI FOSTER
3 PARENT BOTH
2 PEDI ADOPTED
3 PARENT HUSB

This may also help resolve #334, which requests per-parent PEDI-like information.
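A minimal sketch of how a reader might pull the proposed information out of a record, assuming the tentative PARENT tag from this proposal (it is not part of any published spec). The `Node`, `parse_gedcom`, and `pedigree_claims` names are hypothetical, and the parser is deliberately simplified (no CONT handling, no record payloads).

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    level: int
    tag: str
    payload: str = ""
    children: list = field(default_factory=list)


def parse_gedcom(text):
    """Parse simple GEDCOM lines into a forest of Nodes (simplified sketch)."""
    roots, stack = [], []
    for raw in text.strip().splitlines():
        parts = raw.strip().split(" ", 2)
        level = int(parts[0])
        if level == 0 and len(parts) == 3 and parts[1].startswith("@"):
            # Record line such as "0 @I3@ INDI": the xref precedes the tag.
            node = Node(level, parts[2], parts[1])
        else:
            node = Node(level, parts[1], parts[2] if len(parts) > 2 else "")
        # Pop until the top of the stack is this node's parent.
        while stack and stack[-1].level >= level:
            stack.pop()
        (stack[-1].children if stack else roots).append(node)
        stack.append(node)
    return roots


def pedigree_claims(indi):
    """Return (family xref, PEDI value, PARENT value or None) triples."""
    claims = []
    for famc in (c for c in indi.children if c.tag == "FAMC"):
        for pedi in (c for c in famc.children if c.tag == "PEDI"):
            parents = [c.payload for c in pedi.children if c.tag == "PARENT"]
            # No PARENT substructure: which partner is meant is unspecified.
            for parent in parents or [None]:
                claims.append((famc.payload, pedi.payload, parent))
    return claims


SAMPLE = """
0 @I3@ INDI
1 FAMC @F4@
2 PEDI BIRTH
3 PARENT WIFE
1 FAMC @F5@
2 PEDI FOSTER
3 PARENT BOTH
2 PEDI ADOPTED
3 PARENT HUSB
"""

claims = pedigree_claims(parse_gedcom(SAMPLE)[0])
```

Each triple keeps the family pointer next to the pedigree kind, so a consumer can tell that only the WIFE of @F4@ is claimed as a birth parent.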

dthaler commented 1 year ago

My opinion is that the PARENT.BOTH model is too limited because it doesn't allow a per-parent substructure. For comparison, https://www.familysearch.org/developers/docs/api/types/xml_fs_ChildAndParentsRelationship shows that FamilySearch FamilyTree (and similarly GEDCOM X) does allow a per-parent substructure. That allows specifying a different source, or other information, per parent.

dthaler commented 1 year ago

Discussion 8/17/2023: PEDI tags are now {0:M} in 7.1 in #274 so it can now be used with parent-specific substructures, which resolves my concern.

tychonievich commented 12 months ago

Discussed in steering committee

We want to proceed with this to fill the need currently motivating the _FREL and _MREL extensions. _FREL and _MREL are currently not the ideal solution because they overlap in meaning with PEDI.

There is some potential overlap in meaning between one PEDI with BOTH and two, one with HUSB and one with WIFE. In particular, consider

1 FAMC @F1@
2 PEDI FOSTER

This option is what 5.5.1 and 7.0 have and does not specify which parents are meant. We expect many of them implied both parents, but can't assert that in general.

1 FAMC @F1@
2 PEDI FOSTER
3 PARENT BOTH

This option is more compact

1 FAMC @F1@
2 PEDI FOSTER
3 PARENT HUSB
2 PEDI FOSTER
3 PARENT WIFE

This option is more versatile

To resolve this, we propose not including the BOTH value, only HUSB and WIFE, and recommending including two PEDI if a user wishes to explicitly indicate that both partners are meant.
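To make the trade-off concrete, here is a rough sketch of rewriting a `PARENT BOTH` form into the two-PEDI form. It assumes the proposed PARENT tag and works on plain line strings; `expand_both` is a hypothetical helper, not part of any GEDCOM library.

```python
def expand_both(lines):
    """Rewrite each PEDI that has a 'PARENT BOTH' child as two PEDI structures."""
    out, i = [], 0
    while i < len(lines):
        level_text, tag = lines[i].split(" ", 2)[:2]
        level = int(level_text)
        # A structure spans its own line plus all following deeper lines.
        j = i + 1
        while j < len(lines) and int(lines[j].split(" ", 1)[0]) > level:
            j += 1
        block = lines[i:j]
        both = f"{level + 1} PARENT BOTH"
        if tag == "PEDI" and both in block:
            # Emit the whole PEDI block once per partner.
            for partner in ("HUSB", "WIFE"):
                swapped = f"{level + 1} PARENT {partner}"
                out.extend(swapped if line == both else line for line in block)
            i = j
        else:
            out.append(lines[i])
            i += 1
    return out


expanded = expand_both([
    "1 FAMC @F1@",
    "2 PEDI FOSTER",
    "3 PARENT BOTH",
    "3 DATE FROM 1997 TO 1999",
])
```

Any other substructure of the PEDI (the DATE here) is copied into both output structures, which is the cost of the two-PEDI form.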

tychonievich commented 11 months ago

As I drafted a PR for this, I can no longer follow our rationale against BOTH. As far as I can tell, we were opposed to having the ability to split a structure into two structures that jointly provide the same information. But that is something done all over GEDCOM: any time there's a {0:M} cardinality structure with a {0:M} cardinality substructure, or a <List:Enum> payload with {0:M} cardinality, or the g7:ALIA structure in any form we are faced with the same potential for someone to decide to split a structure into pieces or not. I don't see how this is different, and am not happy with requiring duplicate information (repeated DATE and STAT) if we want to assert the same PEDI applies to both parents.

In case I have misunderstood/misremembered the rationale against BOTH, there are two other approaches we could take to avoid duplicating information: make PEDI.PARENT either a <List:Enum> or have {0:M} cardinality.

1 FAMC @F1@
2 PEDI FOSTER
3 PARENT HUSB, WIFE

1 FAMC @F1@
2 PEDI FOSTER
3 PARENT HUSB
3 PARENT WIFE

dthaler commented 11 months ago

As I drafted a PR for this, I can no longer follow our rationale against BOTH. As far as I can tell, we were opposed to having the ability to split a structure into two structures that jointly provide the same information. But that is something done all over GEDCOM: any time there's a {0:M} cardinality structure with a {0:M} cardinality substructure, or a <List:Enum> payload with {0:M} cardinality, or the g7:ALIA structure in any form we are faced with the same potential for someone to decide to split a structure into pieces or not. I don't see how this is different, and am not happy with requiring duplicate information (repeated DATE and STAT) if we want to assert the same PEDI applies to both parents.

In case I have misunderstood/misremembered the rationale against BOTH, there are two other approaches we could take to avoid duplicating information: make PEDI.PARENT either a <List:Enum> or have {0:M} cardinality.

1 FAMC @F1@
2 PEDI FOSTER
3 PARENT HUSB, WIFE

1 FAMC @F1@
2 PEDI FOSTER
3 PARENT HUSB
3 PARENT WIFE

The rationale was just to remove multiple ways of doing the same thing. So BOTH vs separate info for each, are two ways of doing the same thing, and we already have to support separate info for each in the case where the info is different for each. I agree that either of the two approaches quoted above are better than BOTH. The first one (using a list) seems overkill for a list whose max size is 2. The second one seems fine to me, and supports the case where extensions put substructures under the PARENT tag and want different info per parent.
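For a reader, the three spellings discussed here (BOTH, a list payload, and repeated PARENT structures) can collapse to the same set of partners. The sketch below assumes the proposed PARENT tag and its HUSB/WIFE/BOTH values; `parents_meant` is a hypothetical helper.

```python
def parents_meant(parent_payloads):
    """Return the partners named by the PARENT payloads under one PEDI.

    Returns None when no PARENT is present, because the data then does not
    say which partner is meant.
    """
    if not parent_payloads:
        return None
    meant = set()
    for payload in parent_payloads:
        # A <List:Enum> payload separates its values with commas.
        for value in payload.split(","):
            value = value.strip()
            if value == "BOTH":
                meant |= {"HUSB", "WIFE"}
            elif value in ("HUSB", "WIFE"):
                meant.add(value)
            else:
                raise ValueError(f"unexpected PARENT value: {value!r}")
    return frozenset(meant)


as_both = parents_meant(["BOTH"])              # 3 PARENT BOTH
as_list = parents_meant(["HUSB, WIFE"])        # 3 PARENT HUSB, WIFE
as_repeated = parents_meant(["HUSB", "WIFE"])  # two PARENT structures
```

All three calls yield the same set, which is why the choice between them is mostly about writer convenience and about where per-parent substructures could attach.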

Norwegian-Sardines commented 11 months ago

Off the top of my head, I question the removal of “BOTH” in favor of the two proposed options, as they relate to creating SQL queries against the column FAMC.PEDI.PARENT:

1 FAMC @F1@
2 PEDI FOSTER
3 PARENT HUSB, WIFE

And

1 FAMC @F1@
2 PEDI FOSTER
3 PARENT HUSB
3 PARENT WIFE

Whereas:

1 ADOP Y
2 FAMC @F3@
3 ADOP HUSB

Provides a single column for this data! I would rather see:

1 FAMC @F1@
2 PEDI [ADOP | FOSTER | BIRTH | … ]
3 PARENT BOTH

elyoh commented 11 months ago

Parent-child links

The original proposal seeks to allow the INDI.FAMC structure to describe the pedigree of the associated parent-child links separately, along with any changes over time (https://github.com/FamilySearch/GEDCOM/issues/256, https://github.com/FamilySearch/GEDCOM/issues/339).

Therefore it no longer makes sense to begin by stating a PEDI and whether this applies to the FAM.HUSB, FAM.WIFE, or both. Instead, the pedigree history for each parent should be clearly and separately described. A suggested approach is:

+1 FAMC @<XREF:FAM>@                     {0:M}    g7:INDI-FAMC
     +2 HUSB                             {0:1}
        +3 PEDI <Enum>                   {0:M}    g7:PEDI
           +4 STAT                       {0:1}    g7:FAMC-STAT
           +4 DATE <DateValue>           {0:1}    g7:DATE
     +2 WIFE                             {0:1}
        +3 PEDI <Enum>                   {0:M}    g7:PEDI
           +4 STAT                       {0:1}    g7:FAMC-STAT
           +4 DATE <DateValue>           {0:1}    g7:DATE

In the simplest form, this is no more verbose than the PEDI per parent in the original proposal.

Example:

1 FAMC @F1@
2 HUSB
3 PEDI BIRTH
2 WIFE
3 PEDI BIRTH

More complex cases:

1 FAMC @F1@
2 HUSB
3 PEDI FOSTER
4 DATE FROM 1997 TO 1999
3 PEDI ADOPTED
4 DATE FROM 2000
2 WIFE
3 PEDI BIRTH

The ordering of the PEDI or dates could be used to infer the latest / preferred / current status and may benefit applications where only one pedigree status may be stored / described at a time.
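As a rough illustration of the ordering idea, the sketch below treats the last-listed PEDI under each parent as the current one. It assumes the suggested FAMC.HUSB/FAMC.WIFE layout (which is only a proposal), a FAMC at level 1, and the PEDI enumeration values BIRTH and ADOPTED; `current_pedigree` is a hypothetical helper.

```python
def current_pedigree(famc_lines):
    """Map each parent tag (HUSB/WIFE) to its last-listed PEDI payload."""
    current, parent = {}, None
    for line in famc_lines:
        parts = line.split(" ", 2)
        level, tag = int(parts[0]), parts[1]
        payload = parts[2] if len(parts) > 2 else ""
        if level <= 2:
            # Entering a new level-1 or level-2 structure resets the parent.
            parent = tag if level == 2 and tag in ("HUSB", "WIFE") else None
        elif level == 3 and tag == "PEDI" and parent:
            current[parent] = payload  # a later PEDI overwrites an earlier one
    return current


latest = current_pedigree([
    "1 FAMC @F1@",
    "2 HUSB",
    "3 PEDI FOSTER",
    "4 DATE FROM 1997 TO 1999",
    "3 PEDI ADOPTED",
    "4 DATE FROM 2000",
    "2 WIFE",
    "3 PEDI BIRTH",
])
```

Relying on order alone is the simplest rule; comparing the DATE payloads instead would need a date parser and a policy for undated entries.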

elyoh commented 11 months ago

Family-event links

With any of the proposed changes to the INDI.FAMC structure, it is not necessary for ADOP.FAMC, BIRT.FAMC, CHR.FAMC and SLGC.FAMC to carry full information about pedigree as this simply ends up facilitating two ways to describe the pedigree. The INDI.FAMC structure is the natural place for pedigree rather than in any events which happen to be associated with that.

Moreover, whilst it may be desirable to link certain events to the associated FAM, this option is not available for generic events (e.g. an event corresponding with PEDI FOSTER). In such a case, the links to the parents involved would need to be made using the ASSO with ROLE. This may be a better candidate for linking individuals involved in these 'child-family events' rather than FAMC.

Norwegian-Sardines commented 11 months ago

While I agree that the informational representation makes sense, from a data normalization standpoint it is incorrect!

In this model, the subtags for FAMC should/must be broken out into a new record type to provide for the M-M relationship. This is why the ADOP event tag with its underlying subtags must be maintained and can provide for multiple adoption events for an individual. The addition of a FOST (foster) event tag with similar subtags as ADOP would also be warranted!

I would recommend against the proposed changes to the FAMC to include unnormalized subtags.

Norwegian-Sardines commented 11 months ago

ADOP.FAMC, BIRT.FAMC, CHR.FAMC and SLGC.FAMC are all events and event based data (date, place, source information, notes, restriction, religious affiliation, etc) should be maintained only with the event. Splitting information as suggested would be the wrong approach!

elyoh commented 11 months ago

There is no suggestion of splitting information from an event. Events are permitted to include one or more ASSO structures to define the role of other individuals in the event. It can already be exploited to link any pedigree related event to one or both of the associated parents without the need for the use of pointers to a given FAM record.

1 EVEN
2 TYPE Foster
2 ASSO @I2@
3 ROLE FATH

1 ADOP
2 ASSO @I2@
3 ROLE PARENT
2 ASSO @I3@
3 ROLE MOTH

There is also nothing in these suggestions preventing multiple adoptions or other pedigree related events, if this were appropriate.

The current state is that a GEDCOM reader needs to interpret all ADOP.FAMC, SLGC.FAMC, BIRT.FAMC, CHR.FAMC and INDI.FAMC structures, just to be sure it can determine the correct pedigree for a given parent-child relationship. This is not desirable and must be considered in the solution to this issue.
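To show what "interpret all of these structures" amounts to in practice, here is a small sketch that lists every FAMC pointer in one INDI record together with the path it was found at. `famc_links` is a hypothetical helper working on plain line strings.

```python
def famc_links(indi_lines):
    """Return (path, family xref) for every FAMC found under one INDI record."""
    links, path = [], []
    for line in indi_lines:
        parts = line.split(" ", 2)
        level, tag = int(parts[0]), parts[1]
        if level == 0:
            path = ["INDI"]  # record line, e.g. "0 @I1@ INDI"
            continue
        # Truncate the path to this line's depth, then add its tag.
        del path[level:]
        path.append(tag)
        if tag == "FAMC":
            links.append((".".join(path), parts[2]))
    return links


links = famc_links([
    "0 @I1@ INDI",
    "1 BIRT",
    "2 FAMC @F1@",
    "1 ADOP",
    "2 FAMC @F2@",
    "3 ADOP HUSB",
    "1 FAMC @F1@",
    "2 PEDI BIRTH",
    "1 FAMC @F2@",
    "2 PEDI ADOPTED",
])
```

A reader then has to reconcile the entries that point at the same family, for example INDI.ADOP.FAMC and INDI.FAMC both naming @F2@.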

Norwegian-Sardines commented 11 months ago

The current state is that a GEDCOM reader needs to interpret all ADOP.FAMC, SLGC.FAMC, BIRT.FAMC, CHR.FAMC and INDI.FAMC structures, just to be sure it can determine the correct pedigree for a given parent-child relationship. This is not desirable and must be considered in the solution to this issue.

What is wrong with checking and actually reading the information already found in the ADOP, SLGC, … events where the information should be contained? My software already does this when building the relationships between the INDI and FAM record.

You must go to the ADOP event or any other EVEN.FAMC to check if the event is restricted before making the connection between the individual and the family record when displaying the connection, so if you go there for that, then just go there for who did the adoption, when and where!

Norwegian-Sardines commented 11 months ago

And why must you indicate FATH or MOTH? That can be found when going to the family record and checking the link for SEX, or alternatively a Gender if added to the GEDCOM Standard. If you indicate MOTH, which mother in a same sex adoption?

elyoh commented 11 months ago

What is wrong with checking and actually reading the information already found in the ADOP, SLGC, … events where the information should be contained? My software already does this when building the relationships between the INDI and FAM record.

It is assumed that all contributors are capable of writing code which can interpret GEDCOM which is compliant with the specifications as written (as indeed all my applications do). This does not imply the current design is desirable or should be retained. The requirement to query multiple structures needlessly increases the overhead to accurately determine the pedigree of a given parent-child link.

GEDCOM 7.0 did not make any substantive changes to FAMC related structures, retaining the GEDCOM 5.4+ behaviour. It is perfectly reasonable to question aspects of a 27 year old design which have not seen consistent or robust implementation by developers (as exemplified by prolific use of _FREL and _MREL tags).

You must go to the ADOP event or any other EVEN.FAMC to check if the event is restricted

No version of GEDCOM to date has provided a mechanism to assert a RESN relating to the pedigree status between a child and their parent.

And why must you indicate FATH or MOTH? That can be found when going to the family record and checking the link for SEX, or alternatively a Gender if added to the GEDCOM Standard.

A description of 'mother' or 'father' may not be indicated by the biological sex of a parent. The examples given show the optional use of ASSO with ROLEs of FATH, MOTH, and PARENT, thus supporting any 'biological sex' or 'gender identity' pertaining to the associated parents, independently of their 'INDI.SEX' value.

If you indicate MOTH, which mother in a same sex adoption?

This mother is clearly indicated by the individual pointed to by the parent ASSO. Nothing prevents two ASSO structures, pointing to different individuals, with ROLE of MOTH. Nothing suggested precludes adoption by a same sex couple.

Norwegian-Sardines commented 11 months ago

No version of GEDCOM to date has provided a mechanism to assert a RESN relating to the pedigree status between a child and their parent.

So your interpretation is that the ADOP.RESN tag in conjunction with an ADOP.FAMC and ADOP.FAMC.ADOP does not provide a mechanism?

I use the following successfully to set a restriction on the information connecting the adoption of this individual to the member of the family indicated (wife individual, husband individual, or both individuals):

1 ADOP
2 DATE ...
2 RESN ...
2 FAMC @...@
3 ADOP [HUSB | WIFE | BOTH ]

We could update the ADOP tag in the following way to add precision, allowing either the adoption event itself or the indication of who adopted the individual to be restricted:

n ADOP [Y|<NULL>] 
+1 TYPE <Text>                                    
+1 <<INDIVIDUAL_EVENT_DETAIL>>  
+1 FAMC @<XREF:FAM>@                  
+2 ADOP <Enum>                               
+3 PHRASE <Text>                               
+3 RESN <List:Enum>  

The following would indicate that the adoption event (date, place) has no restriction but that the link to the individual indicated by the HUSB relationship is restricted. This is valuable when a same-sex couple does not want to indicate who is the potential biological parent and who is the non-biological parent.

1 ADOP
2 DATE ...
2 FAMC @...@
3 ADOP HUSB
3 RESN ...

This mother is clearly indicated by the individual pointed to by the parent ASSO. Nothing prevents two ASSO structures, pointing to different individuals, with ROLE of MOTH. Nothing suggested precludes adoption by a same sex couple.

An ADOP.ROLE indicating "Mother" is not necessary since this indicates a SEX that may not be the individual's current Gender and should be left to the actual indication within the INDI.SEX tag for that person.

GEDCOM 7.0 did not make any substantive changes to FAMC related structures, retaining the GEDCOM 5.4+ behaviour. It is perfectly reasonable to question aspects of a 27 year old design which have not seen consistent or robust implementation by developers (as exemplified by prolific use of _FREL and _MREL tags).

While I agree that it is perfectly reasonable to question any structure regardless of its age (27 years or 1 day), I'm not convinced that _FREL and _MREL as part of the family record or implemented in other ways is the correct answer! As I've indicated, the violation of normalization design is not desirable either. The overhead of looking at a different structure may or may not be increased; this all depends on how the back-end database is designed and loaded. In my case the entire Individual Record (INDI) is in memory so the overhead is almost zero. On the other hand, if the load uses the FAMC level 1 tag to generate a relationship and no other data from other Events (ADOP.FAMC, BIRT.FAMC, CHR.FAMC and SLGC.FAMC) is considered, then the load should be changed so runtime does not have the overhead.

The removal of the EVENT.FAMC has other implications as well. Conversion of the current EVENT.FAMC in v7.13 and earlier to an EVENT.ASSO in the future cannot be done directly: it requires loading the FAM record, extracting the INDI link for that person, and then loading the parent INDI record to determine the SEX (if not restricted) so we can indicate MOTH/FATH in the child record.
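The lookup chain described here can be sketched as follows. The `families` and `sexes` dictionaries are hypothetical in-memory indexes, and mapping SEX to FATH/MOTH is exactly the assumption under dispute in this thread; the sketch falls back to the PARENT role when SEX is unavailable.

```python
def famc_to_asso(fam_xref, which, families, sexes):
    """Build (individual xref, ROLE) pairs to replace an event-level FAMC.

    families: hypothetical index, fam xref -> {"HUSB": xref, "WIFE": xref}
    sexes:    hypothetical index, indi xref -> "M" or "F" (missing if unknown
              or restricted)
    which:    "HUSB", "WIFE" or "BOTH" (the ADOP.FAMC.ADOP payload)
    """
    roles = {"M": "FATH", "F": "MOTH"}
    slots = ("HUSB", "WIFE") if which == "BOTH" else (which,)
    result = []
    for slot in slots:
        parent = families[fam_xref].get(slot)  # step 1: FAM -> partner pointer
        if parent is None:
            continue  # the FAM has no partner in this slot
        # Step 2: partner INDI -> SEX; use the neutral role when unknown.
        result.append((parent, roles.get(sexes.get(parent), "PARENT")))
    return result


families = {"@F3@": {"HUSB": "@I2@", "WIFE": "@I3@"}}
sexes = {"@I2@": "M"}  # SEX of @I3@ is unknown or restricted
asso = famc_to_asso("@F3@", "BOTH", families, sexes)
```

Two record lookups are needed per parent, which is the conversion cost being pointed out; choosing PARENT for every converted link would avoid the second lookup.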

Norwegian-Sardines commented 11 months ago

I could be convinced of the following for an adoption:

1 ADOP
2 DATE ...
2 PLAC ...
2 FAMC @F1@
3 ADOP BOTH
3 RESN ...
1 FAMC @F1@
2 PEDI ADOPTED
3 PARENT BOTH
3 RESN ...

Where the ADOP.FAMC.ADOP and FAMC.PEDI.PARENT can be [WIFE | HUSB | BOTH]. I'm not sure why "BOTH" is not liked by some individuals!

Birth information would be the same; the only changes would be the BIRT tag and the PEDI value.

These would be consistent with the current model, require minimal changes during conversion, and satisfy the need for the FAMC to indicate some form of relationship to the FAM record, with the RESN to restrict the information when wanting to hide the birth/biological or adoptive parent.

I've always thought of the Birth parents as the biological ones. If birth indicates to some "the people present or given the child at the birth" (mostly needed to hide the world from truths they are not ready for), a new and different Event or Fact should be indicated/created so a specific meaning for the BIRT tag can be established!

tychonievich commented 11 months ago

Discussed in steering committee

Based on 4 voices with 4 opinions, we are rethinking this and maybe not adding it.

Adoption: use INDI.ADOP.FAMC.ADOP to indicate who adopted by. Thus the INDI.FAMC with PEDI ADOPTED does not indicate whether they were adopted by one or both partners; that information is in the adoption event instead.

Foster: make a FAM with the partners who fostered and point to it with the FAMC that has a PEDI FOSTER. Thus the INDI.FAMC with PEDI FOSTER does indicate that each of the partners in the FAM were involved in the fostering.

Birth: have a BIRT.FAMC pointing to a FAM with the parents who were recognized as such at birth. Thus the INDI.FAMC with PEDI BIRTH does indicate that each of the partners in the FAM were involved in the birth, and should not be used for step-families where one partner is a birth parent and the other is not.

Those are the only three PEDI in the current spec, so the PEDI.PARENT may not be needed at this time.

This begs two questions.

  1. Does this invalidate how any current tool is interpreting the current spec?
  2. How do we communicate these uses – presumably briefly in the spec and in more detail in the technical FAQ?

We welcome additional comments, input, and ideas.

chronoplexsoftware commented 11 months ago

Does this invalidate how any current tool is interpreting the current spec?

Inclusion of an associated adoption event or use of ADOP.FAMC.ADOP was always optional. Thus, a GEDCOM writer is free to use PEDI ADOPTED in the same way that you suggest for birth and foster. It does not seem appropriate to restrict that use.

The only time that PEDI ADOPTED does not apply to both partners is if ADOP.FAMC.ADOP indicates otherwise. This special case allows for one adoptive and one birth parent in a family, potentially avoiding the need for two FAM records.
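The rule stated here can be written down as a small default-with-override lookup. This is a sketch of that reading only: `adop_famc` is a hypothetical index from family xref to the ADOP.FAMC.ADOP payload found in the individual's adoption event, if any.

```python
def adopted_by(fam_xref, adop_famc):
    """Return which partners a PEDI ADOPTED link to fam_xref applies to.

    Defaults to both partners unless an ADOP.FAMC.ADOP payload for the same
    family says otherwise.
    """
    which = adop_famc.get(fam_xref, "BOTH")
    partners = {
        "HUSB": frozenset({"HUSB"}),
        "WIFE": frozenset({"WIFE"}),
        "BOTH": frozenset({"HUSB", "WIFE"}),
    }
    return partners[which]


default_case = adopted_by("@F1@", {})                 # no adoption event
override_case = adopted_by("@F3@", {"@F3@": "HUSB"})  # 3 ADOP HUSB
```

Under this reading a writer needs a second FAM record only when neither the default nor the override describes the family.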