frizbog / gedcom4j

Java library for reading/writing genealogy files in GEDCOM format
http://gedcom4j.org

GEDCOM file with incorrect tree level numbering #87

Closed feloy closed 8 years ago

feloy commented 8 years ago

Some generated GEDCOM files have incorrect level numbering. Below is an extract of a file generated by GensDataPro: near the end, a line beginning with 3 directly follows a line beginning with 1. The library does not handle this correctly and raises an error.

0 HEAD
1 GEDC
2 VERS 5.5
1 SOUR GensDataPro_b
2 VERS 2.9.9.1
1 DATE 9-10-2015
1 CHAR ANSI
1 FILE bug2.ged
0 @S1@ SUBM
1 NAME First Last
1 ADDR An addr
1 CONT 123 City
1 CONT email@me
0 @I643@ INDI
1 NAME First/Last/
2 GIVN First
2 SURN Last
1 CHAN
2 DATE 01 JAN 2014
3 TIME 15:13:09
1 SEX M
1 BIRT
2 DATE 3 JUL 1945
2 PLAC City,AN,BEL
1 BURI
2 DATE 10 JUN 1803
2 PLAC City,NB,NLD
1 FAMS @F1031@
0 @I2627@ INDI
1 NAME First/Last/
2 GIVN First
2 SURN Last
1 CHAN
2 DATE 12 FEB 2004
3 TIME 20:06:58
1 SEX F
1 BIRT
2 DATE 
3 NOTE One note
1 BURI
2 DATE 8 SEP 1869
3 NOTE A note
2 PLAC A place,AN,BEL
1 FAMS @F1031@
0 @F1031@ FAM
1 HUSB @I643@
1 WIFE @I2627@
1 MARR
3 NOTE A note
0 TRLR
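
To make the problem in the extract easier to spot, here is a minimal sketch of a standalone check that flags any line whose level is more than one deeper than the line before it. It is not part of gedcom4j; the class name `LevelGapCheck` and the method `findLevelGaps` are hypothetical, and the sketch assumes every non-empty line starts with a numeric level.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of gedcom4j.
final class LevelGapCheck {

    /**
     * Returns the 1-based positions of lines whose level exceeds the
     * previous line's level by more than one.
     * Assumes each non-empty line starts with a numeric level;
     * anything else makes Integer.parseInt throw NumberFormatException.
     */
    static List<Integer> findLevelGaps(List<String> lines) {
        List<Integer> gaps = new ArrayList<>();
        int previous = -1; // so that the first line is expected to be level 0
        for (int i = 0; i < lines.size(); i++) {
            String line = lines.get(i).trim();
            if (line.isEmpty()) {
                continue;
            }
            int space = line.indexOf(' ');
            String levelText = space < 0 ? line : line.substring(0, space);
            int level = Integer.parseInt(levelText);
            if (level > previous + 1) {
                gaps.add(i + 1);
            }
            previous = level;
        }
        return gaps;
    }

    public static void main(String[] args) {
        // The FAM record from the extract above.
        List<String> fam = Arrays.asList(
                "0 @F1031@ FAM",
                "1 HUSB @I643@",
                "1 WIFE @I2627@",
                "1 MARR",
                "3 NOTE A note",
                "0 TRLR");
        System.out.println(findLevelGaps(fam)); // prints [5]
    }
}
```

Only the `3 NOTE A note` line under `1 MARR` is reported. The `3 NOTE` lines that follow a `2 DATE` line elsewhere in the extract are one level deeper than their predecessor, so they pass this check.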

The following modification lets the library continue parsing without raising an error:

--- a/org/gedcom4j/parser/GedcomParserHelper.java
+++ b/org/gedcom4j/parser/GedcomParserHelper.java
@@ -110,6 +112,8 @@ final class GedcomParserHelper {
         if (tree.level == level) {
             return tree;
         }
+        if (tree.children.size() == 0)
+           return null;
         StringTree lastChild = tree.children.get(tree.children.size() - 1);
         if (lastChild.level == level) {
             return lastChild;
@@ -142,8 +146,10 @@ final class GedcomParserHelper {
                 st.tag = lp.tag;
                 st.value = lp.remainder;
                 StringTree addTo = findLast(result, lp.level - 1);
-                addTo.children.add(st);
-                st.parent = addTo;
+                if (addTo != null) {
+                    addTo.children.add(st);
+                    st.parent = addTo;
+                }
             }
         } finally {
             if (bytes != null) {
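
The idea behind the patch can be tried in isolation with the sketch below. It is an illustration under assumptions, not the library's code: `Node` is a hypothetical stand-in for `StringTree` with only the fields the diff touches, the root is assumed to sit at level -1, and `attach` is an invented helper that wraps the two patched steps (look up the parent, then skip the line if none exists).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, self-contained illustration of the patched logic.
final class TolerantFindLast {

    /** Stand-in for StringTree, reduced to the fields used in the diff. */
    static final class Node {
        final int level;
        final String tag;
        Node parent;
        final List<Node> children = new ArrayList<>();

        Node(int level, String tag) {
            this.level = level;
            this.tag = tag;
        }
    }

    /**
     * Follows the chain of last children starting at tree and returns the
     * node at the requested level, or null when the chain ends first.
     */
    static Node findLast(Node tree, int level) {
        Node current = tree;
        while (current.level != level) {
            if (current.children.isEmpty()) {
                return null; // no node at that level: the input skipped a level
            }
            current = current.children.get(current.children.size() - 1);
        }
        return current;
    }

    /** Attaches child one level below its parent; returns false if no parent exists. */
    static boolean attach(Node root, Node child) {
        Node parent = findLast(root, child.level - 1);
        if (parent == null) {
            return false; // malformed line is dropped instead of throwing
        }
        parent.children.add(child);
        child.parent = parent;
        return true;
    }

    public static void main(String[] args) {
        Node root = new Node(-1, "ROOT"); // assumed root level
        System.out.println(attach(root, new Node(0, "FAM")));  // prints true
        System.out.println(attach(root, new Node(1, "MARR"))); // prints true
        System.out.println(attach(root, new Node(3, "NOTE"))); // prints false
        System.out.println(attach(root, new Node(0, "TRLR"))); // prints true
    }
}
```

As in the patch, the line with the skipped level is silently discarded, so its content does not appear in the resulting tree. A caller that needs to know about the loss would have to be told some other way, for example by using the `false` return value in this sketch.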

frizbog commented 8 years ago

I do like the idea of having the parser be more robust, and a bit more tolerant of non-standard/malformed files. I also am fairly sure your solution is sound, although I think I might want to add a warning so the calling code can know that something was wrong but handled. Let me work with your example and see what I come up with, and thanks for the great suggestion.

frizbog commented 8 years ago

Fixed in latest 2.2.3-SNAPSHOT. Includes an error message in the parser errors collection, but allows parsing to continue.