FamilySearch / GEDCOM


Adding mtDNA and yDNA data support to GEDCOM 7.x? #119

Open atom888888 opened 2 years ago

atom888888 commented 2 years ago

Discussed in https://github.com/FamilySearch/GEDCOM/discussions/118

Originally posted by **atom888888** March 4, 2022

I propose that GEDCOM 7.x add the ability to store mtDNA and yDNA data in the file format, much like the AncestralQuest program does: https://ancquest.com/index.htm

atDNA (autosomal DNA) results would have 700,000+ rows per individual and per test kit, which would make the file VERY large, so I can understand NOT wanting that in the GEDCOM format. mtDNA and yDNA data, however, are very small datasets. They have things like:

- mtDNA
- yDNA
- testing company
- test date
- haplogroup
- testID
- STR yDNA markers
- for mtDNA, the HVR1/HVR2 mutation list

Here is a screenshot of their UI to give you an idea:

yDNA

![image](https://user-images.githubusercontent.com/80556323/156853526-36a49180-d79a-4bed-92ce-9bf774d49b05.png)

mtDNA

![image](https://user-images.githubusercontent.com/80556323/156853546-2082d525-d0be-42c0-b4b8-ca16662c2770.png)

dthaler commented 2 years ago

GEDCOM Steering Committee discussion 3/15/2022: one option would be to add a MEDI enum value to the table at https://gedcom.io/specifications/FamilySearchGEDCOMv7.html#enum-MEDI for DNA to label an external file reference as being a DNA report of some type. Are there other options that should be considered for 7.1? Or 8? We want to get input from multiple implementers before making a decision.
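
To make the MEDI option concrete, here is a minimal sketch that prints two multimedia records: one that should already be expressible in 7.0 using `MEDI OTHER` with a `PHRASE`, and one using a new `DNA` enum value. The `DNA` value is hypothetical (it is not in the 7.0 table), and the xref, file path, and media type are invented for illustration.

```python
# Sketch only: "DNA" is NOT a MEDI value in GEDCOM 7.0, and the file path is invented.

def render(lines):
    """Serialize (level, tag, payload) tuples as GEDCOM lines."""
    return "\n".join(f"{level} {tag} {payload}" for level, tag, payload in lines)


def dna_report_obje(xref, path, media_type, medi, phrase=None):
    """Build an OBJE record whose FILE is labelled as a DNA report."""
    lines = [
        (0, f"@{xref}@", "OBJE"),
        (1, "FILE", path),
        (2, "FORM", media_type),
        (3, "MEDI", medi),
    ]
    if phrase is not None:
        # PHRASE lets MEDI OTHER carry a free-text description.
        lines.append((4, "PHRASE", phrase))
    return render(lines)


# Expressible today: MEDI OTHER plus a PHRASE.
today = dna_report_obje("O1", "dna/ydna-report.csv", "text/csv", "OTHER", "DNA report")

# What the option above might look like with a hypothetical new enum value.
proposed = dna_report_obje("O1", "dna/ydna-report.csv", "text/csv", "DNA")

print(today)
print()
print(proposed)
```

The only difference between the two records is the `MEDI` line, which is what makes this option a small change for implementers: the DNA data itself stays in the external file.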

jl5000 commented 1 year ago

One of the things I am currently representing as a general fact is the number of shared centimorgans between two people. Perhaps this could be incorporated into the ASSO structure? I also like the idea of using MEDI to link to DNA files.
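
As a rough illustration of the ASSO idea, the sketch below prints an individual with an association to a DNA match. The `_CM` tag is an invented extension tag, not part of any spec, and the xrefs and the value 1250 are made-up sample data; `ROLE OTHER` with a `PHRASE` is used because 7.0 has no DNA-specific role.

```python
# Sketch only: _CM is an invented extension tag; xrefs and the cM value are sample data.

def render(lines):
    """Serialize (level, tag, payload) tuples as GEDCOM lines."""
    return "\n".join(f"{level} {tag} {payload}" for level, tag, payload in lines)


def dna_match_asso(other_xref, shared_cm):
    """Build an ASSO substructure recording shared centimorgans with another INDI."""
    return render([
        (1, "ASSO", f"@{other_xref}@"),
        (2, "ROLE", "OTHER"),
        (3, "PHRASE", "DNA match"),
        (2, "_CM", shared_cm),  # hypothetical extension tag for shared centimorgans
    ])


record = "0 @I1@ INDI\n" + dna_match_asso("I2", 1250)
print(record)
```

An official version would presumably replace `_CM` with a standard tag; until then, an extension tag like this would need to be documented by whoever writes the file.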

funwithbots commented 1 year ago

I've been exploring this topic for a few days. Pending an official addition to the spec, I'm considering the following structures to implement DNA features in my own software and GEDCOM files.

  1. Create an INDI.EVEN with TYPE "DNA Test" to capture the details of a specific test. Individuals can take, and have taken, multiple tests. It would be useful to keep the metadata for a test, and a FILE link to the artifacts provided by the testing company, with the event.
  2. Create an INDI.FACT to capture haplogroup information. The DATE attribute would function as a timestamp of sorts, since haplogroup determinations can change over time as new haplogroups are discovered and the haplogroup tree becomes more specific. They can also change as a result of a more detailed test (e.g. the Y-37 test from FTDNA may give a different estimate than Y-111 or Y-700). The DNA itself has not changed, but the analysis and categorization are not static.
  3. Create either an SNOTE or an INDI.EVEN with TYPE "DNA Match" for potential matches. I'm not sure yet how structured I'll keep this data, so I may start with an SNOTE and see how that goes.
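
The first two structures above could look roughly like the output of the sketch below. Everything specific in it is an assumption for illustration: the xrefs, the dates, the company name, the haplogroup value, the TYPE text "Y-DNA haplogroup", and the use of AGNC for the testing company and an OBJE link for the FILE artifacts.

```python
# Sketch only: all values are sample data; AGNC and the OBJE link are assumed choices.

def render(lines):
    """Serialize (level, tag, payload) tuples; an empty payload is omitted."""
    return "\n".join(
        f"{level} {tag} {payload}".rstrip() for level, tag, payload in lines
    )


record = render([
    (0, "@I1@", "INDI"),
    # 1. One EVEN per test taken, with its metadata and a link to the company's files.
    (1, "EVEN", ""),
    (2, "TYPE", "DNA Test"),
    (2, "DATE", "4 MAR 2022"),
    (2, "AGNC", "Example Testing Co."),
    (2, "OBJE", "@O1@"),
    # 2. Haplogroup as a FACT; DATE records when this estimate was made.
    (1, "FACT", "R-M269"),
    (2, "TYPE", "Y-DNA haplogroup"),
    (2, "DATE", "4 MAR 2022"),
])

print(record)
```

Because each test is its own EVEN and each haplogroup estimate its own dated FACT, a later, more detailed test can simply add another pair without overwriting the earlier one.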

I would love to see official support for this in a future release.