joeyaurel / python-gedcom

Python module for parsing, analyzing, and manipulating GEDCOM files
https://gedcom.joeyaurel.dev
GNU General Public License v2.0

`get_birth_data` and `get_death_data` returning last `DATE` and `PLAC` found #21

Open damonbrodie opened 5 years ago

damonbrodie commented 5 years ago

Right now, get_birth_data iterates through all BIRT records and retrieves the DATE, PLAC and sources for each BIRT.

This is an example record - the individual has two BIRT records with different values.

    1 BIRT
    2 DATE 25 Jan 1780
    2 PLAC Liverpool, Queens, Nova Scotia, Canada
    1 BIRT
    2 DATE 1781
    2 PLAC Nova Scotia, Canada

Currently the logic in python-gedcom is to iterate through each BIRT and assign any DATE and PLAC it finds to the date and place variables, so each later BIRT overwrites the values from the earlier ones. In the above example it would return "1781", "Nova Scotia, Canada".
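A minimal sketch of that last-one-wins behaviour, assuming a toy (level, tag, value) parser instead of python-gedcom's real element classes. The names `parse_lines` and `last_wins_birth_data` are hypothetical and only meant to illustrate the logic described above:

```python
RECORD = """\
1 BIRT
2 DATE 25 Jan 1780
2 PLAC Liverpool, Queens, Nova Scotia, Canada
1 BIRT
2 DATE 1781
2 PLAC Nova Scotia, Canada
"""


def parse_lines(text):
    """Split GEDCOM lines into (level, tag, value) tuples (toy parser)."""
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue
        parts = line.strip().split(" ", 2)
        value = parts[2] if len(parts) > 2 else ""
        rows.append((int(parts[0]), parts[1], value))
    return rows


def last_wins_birth_data(rows):
    """Mimic the described logic: every BIRT overwrites date and place."""
    date, place = "", ""
    in_birt = False
    for level, tag, value in rows:
        if level == 1:
            # A new level-1 line starts (or ends) a BIRT record.
            in_birt = tag == "BIRT"
        elif in_birt and level == 2:
            if tag == "DATE":
                date = value
            elif tag == "PLAC":
                place = value
    return date, place


print(last_wins_birth_data(parse_lines(RECORD)))
# ('1781', 'Nova Scotia, Canada')
```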

Typically the first entry in the GEDCOM file is the "preferred" entry. At least, this is how it is done with Ancestry.com.

I wonder if we should return the FIRST BIRT record we find instead of the LAST one? I think returning all the Sources for all BIRT elements is fine as is.
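Roughly what I have in mind, as a sketch only. It works on raw GEDCOM lines rather than the library's element objects, the helper names `group_events` and `first_birth_data` are made up for illustration, and the `2 SOUR` lines in the sample are an assumption added to show that sources from every BIRT are still collected:

```python
def group_events(lines, tag):
    """Group level-2 sub-lines under each level-1 `tag` record, in file order."""
    events, current = [], None
    for line in lines:
        parts = line.split(" ", 2)
        level, name = int(parts[0]), parts[1]
        value = parts[2] if len(parts) > 2 else ""
        if level <= 1:
            # Any level-0 or level-1 line closes the previous record.
            current = {"SOUR": []} if (level == 1 and name == tag) else None
            if current is not None:
                events.append(current)
        elif level == 2 and current is not None:
            if name == "SOUR":
                current["SOUR"].append(value)
            else:
                current[name] = value
    return events


def first_birth_data(lines):
    """Date and place from the FIRST BIRT only; sources from all BIRTs."""
    events = group_events(lines, "BIRT")
    if not events:
        return "", "", []
    first = events[0]
    sources = [src for event in events for src in event["SOUR"]]
    return first.get("DATE", ""), first.get("PLAC", ""), sources


lines = [
    "1 BIRT",
    "2 DATE 25 Jan 1780",
    "2 PLAC Liverpool, Queens, Nova Scotia, Canada",
    "2 SOUR @S1@",  # hypothetical source pointer
    "1 BIRT",
    "2 DATE 1781",
    "2 PLAC Nova Scotia, Canada",
    "2 SOUR @S2@",  # hypothetical source pointer
]

print(first_birth_data(lines))
```

The same grouping would apply to DEAT by passing "DEAT" as the tag.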

BTW, all of the above applies for DEAT too. If we agree to make this change then I will submit a PR.

KeithPetro commented 5 years ago

While I agree that it's typical for the first BIRT to be the preferred one, it's also not universal. It would perhaps be best if get_birth_data/get_death_data returned all birth/death records. However, I am unsure how get_birth_year and get_death_year should behave in that case.

damonbrodie commented 5 years ago

Changing get_birth_data to return all records would be an API change (it returns only one "set" right now).

The worst thing about the current behaviour is that it can pull the date from one BIRT record and the place from another, for example when a later BIRT has a DATE but no PLAC. I feel this is not a good idea.
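To make the mixing problem concrete, here is a small sketch. The list-of-dicts form and the function name `merged_last_wins` are assumptions for illustration, not how python-gedcom stores records:

```python
# Hypothetical in-memory form: one dict per BIRT record, in file order.
births = [
    {"DATE": "25 Jan 1780", "PLAC": "Liverpool, Queens, Nova Scotia, Canada"},
    {"DATE": "1781"},  # the second BIRT has a DATE but no PLAC
]


def merged_last_wins(events):
    """Overwrite date and place independently, as described above."""
    date, place = "", ""
    for event in events:
        date = event.get("DATE", date)    # kept from an earlier BIRT if absent
        place = event.get("PLAC", place)  # kept from an earlier BIRT if absent
    return date, place


print(merged_last_wins(births))
# ('1781', 'Liverpool, Queens, Nova Scotia, Canada')
```

The result pairs the date of the second BIRT with the place of the first, a combination that appears in neither record.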

joeyaurel commented 5 years ago

I think it would be best if the methods get_birth_data and get_death_data returned a list containing all available data retrieved from the GEDCOM file.

get_birth_year and get_death_year would then return the year of the first data tuple returned by get_birth_data or get_death_data, which keeps the convention that the first BIRT and DEAT entries are the preferred ones.

Alternatively, get_birth_year and get_death_year could return a list containing all years. The developer could then decide whether to use the first or last year.
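Both options could look something like the sketch below. This is not the current API: the (date, place) tuple shape, the function names, and returning None for a missing year are all assumptions for the sake of discussion:

```python
import re

# Hypothetical return value of a list-based get_birth_data:
# one (date, place) tuple per BIRT record, in file order.
births = [
    ("25 Jan 1780", "Liverpool, Queens, Nova Scotia, Canada"),
    ("1781", "Nova Scotia, Canada"),
]


def year_of(date_text):
    """Extract the first four-digit year from a date string, or None."""
    match = re.search(r"\b(\d{4})\b", date_text)
    return int(match.group(1)) if match else None


def first_birth_year(events):
    """Option 1: the year of the first (preferred) record only."""
    return year_of(events[0][0]) if events else None


def all_birth_years(events):
    """Option 2: one year per record; the caller picks first or last."""
    return [year_of(date) for date, _place in events]


print(first_birth_year(births))  # 1780
print(all_birth_years(births))   # [1780, 1781]
```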

How does that sound?

prism44 commented 5 years ago

New to this forum:

Nick, I agree! To be consistent with other calls (get_marriages, for example), if any number of entries exist, then returning the full set of BIRTs and DEATs would make sense.

Some GED files have a custom TAG that indicates the preferred MARR, BIRT, DEAT, etc. Trying to catch custom tags is a slippery slope.