FamilySearch / GEDCOM

Apache License 2.0

Versioning does not specify if changing cardinality is minor or major #474

Open tychonievich opened 1 month ago

tychonievich commented 1 month ago

The specification's Guide to Version Numbers is silent on the topic of cardinality changes. Going from plural :M} to singular :1} or from optional {0: to required {1: would be a major, not minor, change because it would make previously-valid files become invalid; but what about the changes in the other direction?

This has come up several times in steering committee conversations, most recently when discussing https://github.com/FamilySearch/GEDCOM-registries/pull/51, and should be decided and documented. Whether that documentation should be put into the specification itself or on https://gedcom.io is not clear to me.

tychonievich commented 1 month ago

I think singular-to-plural and required-to-optional can be minor changes, but that sometimes they require changes to the definition of structures and that those changes have to be done carefully to ensure they don't change structure meanings.
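As a rough illustration (not part of the specification), the file-validity half of this rule can be sketched in Python. The names `Cardinality`, `parse_cardinality`, and `classify_change` are hypothetical, and the sketch assumes cardinalities are written as `{0:1}`, `{1:1}`, `{0:M}`, or `{1:M}`. It only checks whether previously valid files stay valid; it says nothing about whether a structure's definition also has to change.

```python
from typing import NamedTuple

M = float("inf")  # stands in for the "M" (many) upper bound


class Cardinality(NamedTuple):
    low: int     # 0 = optional, 1 = required
    high: float  # 1 = singular, M = plural


def parse_cardinality(text: str) -> Cardinality:
    """Parse a string such as "{0:1}" or "{1:M}"."""
    low, high = text.strip("{}").split(":")
    return Cardinality(int(low), M if high == "M" else int(high))


def classify_change(old: str, new: str) -> str:
    """Return "none", "minor", or "major" for a cardinality change."""
    a, b = parse_cardinality(old), parse_cardinality(new)
    if a == b:
        return "none"
    # Minor only if every count allowed before is still allowed after,
    # i.e. the new range contains the old range.
    if b.low <= a.low and b.high >= a.high:
        return "minor"
    # Otherwise some previously valid file becomes invalid.
    return "major"
```

Under this sketch, `classify_change("{0:1}", "{0:M}")` and `classify_change("{1:1}", "{0:1}")` are both minor, while the reverse directions are major.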

The rest of this comment consists of my notes from considering dozens of examples to arrive at this conclusion.


  1. Can required structures (cardinality {1:1} or {1:M}) become optional (cardinality {0:1} or {0:M})?

    1. Some required structures are the only standard substructure of their superstructure.

      • g7:HUSB and g7:WIFE only contain the g7:AGE substructure.
      • g7:ord-STAT contains only a g7:DATE-exact substructure.

      Making these optional in a minor version makes sense if additional substructures are added.

    2. Some structures' definitions include a reference to their required substructures.

      • g7:HEAD-PLAC is defined as a placeholder for g7:HEAD-PLAC-FORM
      • CHANGE_DATE is defined as "The date of ...", meaning its g7:DATE-exact substructure.
      • g7:INDI-EVEN is defined in part with "Each EVEN must be classified by a subordinate use of the TYPE tag."

      I don't see a way to make these optional, but perhaps there could be some kind of clever re-wording of the definition that allows that?

    3. Some required structures provide information applications may depend on.

      • g7:FILE's g7:FORM provides a media type, which applications might require to handle the file correctly.
      • g7:SLGC requires a g7:FAMC because it represents an ordinance relating a child to parents.

      I think that this case suggests we should be cautious about making a change, but such changes are still minor.

    4. Some required structures appear to be optional in functionality.

      • g7:REPO's g7:NAME is one of many substructures describing the repository; the other fields are useful without the name.
      • g7:TRLR is {1:1} and has no meaning besides "end of file", which is redundant with other file transmission information.

      Changing these to optional is clearly a minor change.

  2. Can singular structures (cardinality {0:1} or {1:1}) become plural (cardinality {0:M} or {1:M})?

    1. Some singular structures are defined in ways that are inconsistent with plurality; thus they can't be made plural without changing their definition.

      • g7:HEAD-PLAC-FORM is defined as "Any PLAC with no FORM shall be treated as if it has this FORM." It is functionally inconsistent to have two conflicting rules of this type.
      • g7:FORM is the media type of a file; while some file types do have multiple media types (e.g. text/javascript is also known as application/javascript, text/x-ecmascript, and several others), these are either synonyms or their combination is invalid.
      • g7:AGE is defined as "The age of the individual at the time an event occurred, or the age listed in the document." It is historically strange to say a person had two different ages at the time of a single event, but reasonable to say that a source document lists conflicting ages.

      I think these can be changed as a minor update, but only if the definition is updated in such a way that a single value has the same meaning. For example, AGE could be updated to start "The age, or one of several possible ages, of the individual ...".

    2. Some singular structures are embedded in fundamental architectures of applications.

      • g7:FAM-HUSB and g7:FAM-WIFE are singular, and many current visualization tools will break if that changes.
      • Relational databases may use join tables for plural structures, requiring architectural changes for any plurality update.

      I think that this case suggests we should be cautious about making a change, but such changes are still minor.

    3. Some singular structures have different ways of handling plurality.

      • Changing g7:FAM-HUSB to {0:M} would conflict with the standard's statement "Family structures with more than 2 partners may either use several FAM records or use ASSOCIATION_STRUCTUREs to indicate additional partners."
      • Changing g7:PEDI to {0:M} would conflict with the standard's permission of multiple g7:FAMCs, each with a g7:PEDI, to represent the same data. Note the steering committee proposed this cardinality change in https://github.com/FamilySearch/GEDCOM/issues/339 and https://github.com/FamilySearch/GEDCOM/pull/274.
      • g7:PHRASE is defined as "Textual information that cannot be expressed in the superstructure due to the limitations of its data type," which is fully compatible with plurality. However, a plural g7:PHRASE offers no obvious gains, as the text of the several PHRASEs could be combined into one PHRASE; presumably if it were plural then some guidance on choosing the number of PHRASEs should be added.

      Such a change introduces multiple ways to represent the same data, but is still backwards compatible and hence minor.

    4. Some singular structures could become plural with no change in their definition.

      • g7:RELI is defined as "A religious denomination associated with the event or attribute described by the superstructure." Having this plural would require no change to the definition, and would add useful expressiveness for interfaith events.

      Changing these to plural is clearly a minor change.
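
To make the join-table point in 2.2 concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The table and column names (`fam_singular`, `fam`, `fam_husb`, `husb_id`) and the helper functions are hypothetical, not taken from any real application; the sketch only shows why a singular-to-plural change can force a schema change even though old files remain valid.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Singular layout: at most one husband per family, stored as a column.
    CREATE TABLE fam_singular (
        fam_id  TEXT PRIMARY KEY,
        husb_id TEXT            -- NULL when no HUSB is present
    );

    -- Plural layout: any number of husbands per family, stored as rows
    -- in a join table.
    CREATE TABLE fam (fam_id TEXT PRIMARY KEY);
    CREATE TABLE fam_husb (
        fam_id  TEXT NOT NULL REFERENCES fam(fam_id),
        husb_id TEXT NOT NULL,
        PRIMARY KEY (fam_id, husb_id)
    );
""")

conn.execute("INSERT INTO fam_singular VALUES ('@F1@', '@I1@')")
conn.execute("INSERT INTO fam VALUES ('@F1@')")
conn.executemany(
    "INSERT INTO fam_husb VALUES (?, ?)",
    [("@F1@", "@I1@"), ("@F1@", "@I2@")],
)


def husbands_singular(fam_id: str) -> list[str]:
    """Read the single husband column; the result has zero or one entry."""
    row = conn.execute(
        "SELECT husb_id FROM fam_singular WHERE fam_id = ?", (fam_id,)
    ).fetchone()
    return [] if row is None or row[0] is None else [row[0]]


def husbands_plural(fam_id: str) -> list[str]:
    """Read all husbands from the join table, in a stable order."""
    rows = conn.execute(
        "SELECT husb_id FROM fam_husb WHERE fam_id = ? ORDER BY husb_id",
        (fam_id,),
    )
    return [r[0] for r in rows]
```

The singular layout has no place to store a second husband, so an application built on it would need a migration to something like the plural layout before it could load files that use the wider cardinality.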