PeWu / topola

Topola – online genealogy visualization
Apache License 2.0
95 stars 28 forks source link

Get parents from FAMS in case they're not married #63

Open lezhumain opened 1 year ago

lezhumain commented 1 year ago

Fixes issue https://github.com/PeWu/topola/issues/62

PeWu commented 1 year ago

Thanks for the PR. I still need to read the code.

Nonetheless, here are my first thoughts. I have always assumed that both FAMS and HUSB/WIFE links are required (also FAMC and CHIL respectively) and if WIFE and HUSB tags are missing, it's a bug in the software that produced the file. Now, looking through the Internet, I see that this may not always be true although most software write bidirectional links.

Here is an interesting discussion: https://genealogy.stackexchange.com/questions/12437/gedcom-indi-famc-vs-fam-chil

The GEDCOM 5.5.5 specification does not mention both links being required.

What's interesting, GEDCOM 7 specs say

If a FAM record uses HUSB or WIFE to point to an INDI record, the INDI record
must use FAMS (p.64) to point to the FAM record. If a FAM record uses CHIL to
point to an INDI record, the INDI record must use a FAMC to point to the
FAM record.

so the opposite is required, i.e. If there is a WIFE link, there has to be a FAMS link but not necessarily the other way around.

Now, let me look at the code in the PR :-)

PeWu commented 1 year ago

For the record, missing WIFE/HUSB links do not mean that the individuals are not married. To express the type of relationship, one should use the MARR.TYPE field with a value like "not married". See GEDCOM specification (page 59): https://webtrees.net/downloads/gedcom-555.pdf

lezhumain commented 1 year ago

Hey mate thanks for the answer, I was really happy to find your repo for my family tree.

For the record, missing WIFE/HUSB links do not mean that the individuals are not married

Yes indeed, I was thinking the other way around, if parents are not married then I believe WIFE/HUSB links will be missing (i think it was the case for me, I'll double check).

I must admit I don't exactly know GEDCOM specs for now, I'm mainly reacting to bugs I saw when trying this beauty. I mainly thought it was fair to get the parents this way if the entry was missing, but now that I look at it again I think this should probably be done at an earlier stage, when parsing the .ged file, to make sure all known parents are set properly.

Let me know what you think, in the mean time I'll read more about GEDCOM out of curiosity.

PeWu commented 1 year ago

I gave a bit of thought to this issue. I'm afraid the change in this form could cause performance problems for large GEDCOM files. Notice that each time getParents() is called, the whole GEDCOM file is scanned searching for individuals with a certain FAMS entry. And getParents() would be called many times during the processing of a single ancestors chart, including on all transitions. Currently, topola works pretty well with thousands of individuals and I would like this to stay true.

Here is what I would do instead. I'm assuming your intention is to fix topola-viewer to work with your GEDCOM files. I would leave the topola library unchanged but fix the input that it receives, as you noticed that it could be done in an earlier stage. In particular, there is a function in topola-viewer named normalizeGedcom() in gedcom_util.ts that preprocesses the GEDCOM file before the data is handed over to the topola library for rendering. Currently, it sorts spouses and children. I think it is a good place for one more normalization – find all the FAMS links that do not have corresponding WIFE/HUSB links and add the missing links. Again, I would like to make sure this process does not make loading data significantly slower.