francoisforster / gedcom-cleanup

A simple Kotlin library that compares GEDCOM files, cleans them and performs limited validation

Compare fails on dates in Ancestry exports #3

Closed ennoborg closed 2 years ago

ennoborg commented 2 years ago

GEDCOM files exported from Ancestry have non-standard dates: all month names are written in full, so JAN becomes January, and so forth. As a result, all date comparisons fail, even for MAY, which is exported as May.

In the attached file, you can also find other non-standard constructs, like UID for _UID and _FSID for _FSFTID.

Attachment: ancestry.txt (GEDCOM uploaded as TXT to please GitHub. :-)

francoisforster commented 2 years ago

Should be fixed with https://github.com/francoisforster/gedcom-cleanup/commit/5aa1b43d542558c0564795f8e68eb9805ae472ea

ennoborg commented 2 years ago

Not here. LocalDate expects a date in the user's local language, which in my case is Dutch. Ancestry uses the long US (English) form.
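To illustrate the locale point, here is a minimal sketch of parsing a full English month name independently of the machine's default locale. It is not the code from either commit; `ancestryDateFormat`, `parseAncestryDate`, and the `d MMMM yyyy` layout are assumptions made for the example.

```kotlin
import java.time.LocalDate
import java.time.format.DateTimeFormatter
import java.time.format.DateTimeFormatterBuilder
import java.util.Locale

// Hypothetical formatter: the "d MMMM yyyy" layout is an assumption about
// how Ancestry writes full dates (for example "12 January 1836").
val ancestryDateFormat: DateTimeFormatter = DateTimeFormatterBuilder()
    .parseCaseInsensitive()          // accept "January", "january", "JANUARY"
    .appendPattern("d MMMM yyyy")    // MMMM = full month name
    .toFormatter(Locale.ENGLISH)     // pin the language instead of using the default locale

// Hypothetical helper: parses one date string with the pinned English formatter.
fun parseAncestryDate(text: String): LocalDate =
    LocalDate.parse(text.trim(), ancestryDateFormat)
```

Because the locale is passed explicitly, the month names are matched against English whether the JVM runs with a Dutch, English, or any other default locale.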

francoisforster commented 2 years ago

Let me know if https://github.com/francoisforster/gedcom-cleanup/commit/6f403184fc7286c71ffc496f0ffded2208483ea2 fixed it

ennoborg commented 2 years ago

It did, but Ancestry wouldn't be Ancestry if they didn't trigger yet another one:

Birth different: Event(date=ABT 1836, place=Zwolle) vs Event(date=about 1836, place=Zwolle)

This gives me the impression that their date export is completely non-standard, and they spell out everything.

ennoborg commented 2 years ago

... and that's confirmed by other lines, where Ancestry exported 'before' where BEF is the rule.

francoisforster commented 2 years ago

Since this is non-standard GEDCOM, I'm going to suggest you pre-process either the Ancestry-exported file or the other file so that both use the same form (for example, replacing ABT with about, BEF with before, etc.). Closing the issue as the month comparison is fixed.
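The suggested pre-processing could be sketched roughly as follows, here rewriting the Ancestry spellings back to the standard GEDCOM keywords. This is not part of gedcom-cleanup: `normalizeDateValue` and `normalizeLine` are hypothetical helpers, and the `after` entry is an assumption, since only `about` and `before` are reported in this thread.

```kotlin
// Hypothetical lookup from spelled-out words (lowercase) to GEDCOM keywords.
private val replacements: Map<String, String> = mapOf(
    "about" to "ABT", "before" to "BEF",
    "after" to "AFT", // assumption: not reported in this thread
    "january" to "JAN", "february" to "FEB", "march" to "MAR",
    "april" to "APR", "may" to "MAY", "june" to "JUN",
    "july" to "JUL", "august" to "AUG", "september" to "SEP",
    "october" to "OCT", "november" to "NOV", "december" to "DEC"
)

private val wordRegex = Regex("[A-Za-z]+")
private val dateLineRegex = Regex("""^(\s*\d+ DATE )(.*)$""")

// Replaces every known spelled-out word in a date value; other words are kept as-is.
fun normalizeDateValue(value: String): String =
    wordRegex.replace(value) { m -> replacements[m.value.lowercase()] ?: m.value }

// Rewrites only lines of the form "<level> DATE <value>"; all other lines pass through.
fun normalizeLine(line: String): String {
    val match = dateLineRegex.find(line) ?: return line
    return match.groupValues[1] + normalizeDateValue(match.groupValues[2])
}
```

Applied to each line of the Ancestry export before the comparison, `normalizeLine("2 DATE about 1836")` yields `2 DATE ABT 1836`, while non-DATE lines such as `2 PLAC Zwolle` are left untouched.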