amandasaurus / gedcompy

Python library to parse and work with GEDCOM (genealogy/family tree) files
GNU General Public License v3.0
39 stars, 18 forks

NotImplementedError #20

Open damonbrodie opened 7 years ago

damonbrodie commented 7 years ago

I just ran the sample code against my gedcom file (exported from Ancestry) and I get the following:

Traceback (most recent call last):
  File "./ingest.py", line 4, in <module>
    gedcomfile = gedcom.parse("myfamilytree.ged")
  File "build/bdist.macosx-10.12-intel/egg/gedcom/__init__.py", line 760, in parse
  File "build/bdist.macosx-10.12-intel/egg/gedcom/__init__.py", line 725, in parse_filename
  File "build/bdist.macosx-10.12-intel/egg/gedcom/__init__.py", line 778, in __parse
NotImplementedError: Born in Milton, he was a son of Vera (Wolfe) Coombs and the late Joseph Coombs.

If anyone is willing to work on this issue, I am happy to share my GEDCOM file privately via email.

damonbrodie commented 7 years ago

It seems that part of the GEDCOM 5.5 spec has not been implemented. According to that spec:

http://homepages.rootsweb.ancestry.com/~pmcbride/gedcom/55gcch1.htm

Leading white space (tabs, spaces, and extra line terminators) preceding a GEDCOM line should be ignored by the reading system.
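To make the quoted rule concrete, here is a minimal sketch of a pre-processing step that drops leading whitespace and blank lines before any parsing happens. The function name `strip_leading_whitespace` is made up for illustration and is not part of gedcompy.

```python
def strip_leading_whitespace(raw_lines):
    """Yield lines with leading whitespace removed and blank lines dropped.

    Hypothetical helper, not part of gedcompy: it only illustrates the
    "leading white space should be ignored" rule quoted from the spec.
    """
    for raw in raw_lines:
        # Remove leading spaces, tabs and stray terminators, then the trailing newline.
        line = raw.lstrip(" \t\r\n").rstrip("\r\n")
        if line:  # an "extra line terminator" leaves an empty string, so skip it
            yield line


sample = ["0 HEAD\n", "\n", "   1 SOUR Ancestry\n", "\t1 CHAR UTF-8\r\n"]
print(list(strip_leading_whitespace(sample)))
# -> ['0 HEAD', '1 SOUR Ancestry', '1 CHAR UTF-8']
```

This only covers whitespace in front of well-formed lines; it does not decide what to do with text that has no level number at all.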

In my gedcom file I have the following excerpt:

3 DATA
4 TEXT t is with great sadness we announce the passing of XXXXX on Monday, June 1st, 2015, at home.

Born in Milton, he was a son of XXXXX and the late XXXXX.

Following high scho
5 CONC ol, he enrolled in a two year program at XXXXX Vocational School. He then acquired employment with the Best Yeast Plant in XXXXX, and on the closure of the plant, he began a new career with Bow

According to the spec, everything from "4 TEXT" up to the line before "5 CONC" should be treated as a single line, since the intervening line terminators should be ignored.
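As a possible workaround until the parser handles this, the file could be cleaned up before parsing. The sketch below folds any line that does not look like a GEDCOM line (level number, optional xref, tag) into the previous line. The names `GEDCOM_LINE_RE` and `merge_stray_lines` are hypothetical, the regex is a heuristic rather than the full grammar, and joining with a single space is my assumption; pass `joiner=""` to drop the terminators with nothing in their place.

```python
import re

# Heuristic (an assumption, not the full GEDCOM grammar): a 1-2 digit level,
# a space, an optional @XREF@, then a tag made of word characters.
GEDCOM_LINE_RE = re.compile(r"^\s*\d{1,2} (?:@[^@\s]+@ )?\w+(?: |$)")


def merge_stray_lines(raw_lines, joiner=" "):
    """Fold lines without a level number into the preceding GEDCOM line.

    Hypothetical helper, not part of gedcompy.
    """
    merged = []
    for raw in raw_lines:
        line = raw.strip("\r\n")
        if not line.strip():
            continue  # blank line: just an extra line terminator
        if GEDCOM_LINE_RE.match(line) or not merged:
            merged.append(line.lstrip())
        else:
            # Stray text: treat it as a continuation of the previous line.
            merged[-1] = merged[-1] + joiner + line.strip()
    return merged


# Abbreviated version of the excerpt above.
excerpt = [
    "3 DATA\n",
    "4 TEXT t is with great sadness we announce the passing of XXXXX.\n",
    "\n",
    "Born in Milton, he was a son of XXXXX and the late XXXXX.\n",
    "\n",
    "Following high scho\n",
    "5 CONC ol, he enrolled in a two year program.\n",
]
for line in merge_stray_lines(excerpt):
    print(line)
```

With this input the three text fragments end up on the "4 TEXT" line and "5 CONC" stays separate. The cleaned lines could then be written to a new file and passed to `gedcom.parse` as in the traceback. One caveat: stray text that happens to start with a small number followed by a word (for example "2 days later ...") would be mistaken for a GEDCOM line by this heuristic.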