TheGeneGenieProject / GeneGenie.Gedcom

A .Net library for loading, saving, working with and analysing family trees stored in the GEDCOM format.
GNU Affero General Public License v3.0

Original Gedcom Record ID Missing #75

Open sinistrius opened 4 years ago

sinistrius commented 4 years ago

GEDCOM uses record identifiers enclosed in @ signs, e.g. @I123@ for an individual or @R321@ for a repository. They usually don't change and are therefore well suited as permanent identifiers. However, the GedcomDatabase tree objects don't seem to contain these IDs. I find XRefID properties instead, but can't link them to anything in the original file. So, where are the original record IDs?
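To make the question concrete: in the file, these identifiers sit on level-0 lines such as `0 @I123@ INDI`. The sketch below reads them straight from the raw lines, independently of the library. The class name `OriginalIds` and the method `Extract` are made up for this illustration and are not part of GeneGenie.Gedcom.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of GeneGenie.Gedcom.
public static class OriginalIds
{
    // Matches level-0 record lines that carry an identifier, e.g. "0 @I123@ INDI".
    // Lines without an identifier (such as "0 HEAD" or "0 TRLR") do not match.
    private static readonly Regex RecordLine =
        new Regex(@"^\s*0\s+@([^@\s]+)@\s+(\w+)", RegexOptions.Compiled);

    // Returns (id, tag) pairs in file order, e.g. ("I123", "INDI").
    public static List<(string Id, string Tag)> Extract(IEnumerable<string> lines)
    {
        var result = new List<(string Id, string Tag)>();
        foreach (var line in lines)
        {
            var match = RecordLine.Match(line);
            if (match.Success)
            {
                result.Add((match.Groups[1].Value, match.Groups[2].Value));
            }
        }
        return result;
    }
}
```

It could be fed with something like `OriginalIds.Extract(System.IO.File.ReadLines(path))`, where `path` points at the .ged file.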

fire-eggs commented 4 years ago

One of the flaws of David Knight's Gedcom.NET (on which GeneGenie is based) is that the original ids are all "translated". There is no copy of the original record IDs, only the XRefID values.

To add to the problem, the ids are not translated in any sort of human-logical order. They are reassigned in the order the parser encounters ids. For example, testing with a GEDCOM file I have, the first individual is assigned XREF6, the first family XREF1, the first note XREF3.
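As a rough illustration of why encounter-order numbering loses any link to the original ids, here is a minimal sketch. It is not the Gedcom.NET code, and the exact order that library uses depends on its parser internals. The class `XRefRenumbering` and the method `BuildMap` are hypothetical names; the sketch assumes an id is "encountered" the first time it appears either as a record identifier or as a pointer value.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical illustration, not the Gedcom.NET implementation.
public static class XRefRenumbering
{
    // Matches an id directly after the level number (a record definition,
    // "0 @I1@ INDI") or directly after a tag (a pointer, "1 FAMS @F1@").
    // Escaped at signs ("@@") inside ordinary text values do not match.
    private static readonly Regex XRef =
        new Regex(@"^\s*\d+\s+(?:\w+\s+)?@([^@\s]+)@", RegexOptions.Compiled);

    // Maps each original id to a replacement, numbered by first encounter.
    public static Dictionary<string, string> BuildMap(IEnumerable<string> lines)
    {
        var map = new Dictionary<string, string>();
        foreach (var line in lines)
        {
            var match = XRef.Match(line);
            if (!match.Success)
            {
                continue;
            }

            var original = match.Groups[1].Value;
            if (!map.ContainsKey(original))
            {
                map[original] = "XREF" + (map.Count + 1);
            }
        }
        return map;
    }
}
```

Under these assumptions, a file whose header points at a submitter before the first individual is defined would give that submitter XREF1 and push the first individual to XREF2, so the new numbers say nothing about record type or original id. Keeping a map like this around is one way to translate back to the original ids.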

The only way to change this behavior is to modify and rebuild the source. In GedcomRecordReader.cs, method ResetParse(), there is this chunk of code:

// always replace xrefs
_xrefCollection.ReplaceXRefs = true;

Change true to false and you might (I haven't tested!) preserve the original ids.

RyanONeill1970 commented 4 years ago

Thanks for answering this @fire-eggs, I was intending to reply but you seem to know more than I do! I do recall that the references are not kept.

I would love to recreate a pure .Net version from the ground up, with tests first and things like this fixed, but it would require too much effort for a part-time endeavour. If Ancestry.com were willing to back it...

@Sinistrius does that answer the question for you? I appreciate it is not the answer you want.

fire-eggs commented 4 years ago

I've poked at a few Gedcom projects, including Gedcom.NET / GeneGenie. I've learned a lot and borrowed a bunch of code for my own solution. Of course, there are bugs and limitations, which I've not tackled in my "copious free time".