frizbog / gedcom4j

Java library for reading/writing genealogy files in GEDCOM format
http://gedcom4j.org

Event tags sometimes have text instead of Y | Null, parser drops data #62

Closed frizbog closed 10 years ago

frizbog commented 10 years ago

Some tools write out certain event tags such as DEAT, BURI, and CREM with additional comment/textual data on the line after the tag. Strictly speaking, this is not in conformance with the GEDCOM spec, but since Family Tree Maker does this, failing to support this is really cutting out a huge use case.

So support needs to be added to grab this additional data on the line, when it exists, and put it somewhere, such as in notes. Support for CONT and CONC tags after this text is implied here as well.
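To make the CONT/CONC part concrete, here is a minimal sketch of how text on the event line could be joined with its continuation lines. In GEDCOM, CONT starts a new line of text and CONC appends to the current line with no break. The class and method names (`ContinuationSketch`, `Line`, `assemble`) are hypothetical and are not part of gedcom4j.

```java
import java.util.List;

/** Hypothetical sketch; none of these names come from gedcom4j. */
final class ContinuationSketch {

    /** One child line under the event tag: its tag and its value. */
    static final class Line {
        final String tag;
        final String value;

        Line(String tag, String value) {
            this.tag = tag;
            this.value = value;
        }
    }

    /**
     * Joins the text found after the event tag with any CONT/CONC children.
     * CONT inserts a line break before its value; CONC appends its value directly.
     * Children with any other tag are ignored in this sketch.
     */
    static String assemble(String firstLineValue, List<Line> children) {
        StringBuilder text = new StringBuilder(firstLineValue == null ? "" : firstLineValue);
        for (Line child : children) {
            String value = child.value == null ? "" : child.value;
            if ("CONT".equals(child.tag)) {
                text.append('\n').append(value);
            } else if ("CONC".equals(child.tag)) {
                text.append(value);
            }
        }
        return text.toString();
    }
}
```

With a first-line value of `Died at home after a long ` followed by `CONC illness` and `CONT Buried next day`, `assemble` would return two lines of text: `Died at home after a long illness` and `Buried next day`.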

frizbog commented 10 years ago

I've just checked in some code (it's building in Jenkins now...). Evidently there was already a field for this text in the base Event class, which is the superclass of both FamilyEvent and IndividualEvent objects (but not LDSOrdinances). The parser already implemented part of this, but the part it omitted was needed, and the part it did implement was wrong :-)

So I have modified the parser (but not the data model) so that the string value after one of these event tags in the input is treated as description text unless the value is either 1) Blank/Null, or 2) the single uppercase character "Y", in which case the value is set in the yNull field.
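The rule above can be sketched roughly as follows. This is not the actual parser code; `EventValueSketch`, `ParsedEvent`, and `classify` are hypothetical names, and treating a whitespace-only value as blank is an assumption of this sketch.

```java
/** Hypothetical sketch of the rule described above; not the gedcom4j parser. */
final class EventValueSketch {

    /** Stand-in for an event object with the two fields discussed in this issue. */
    static final class ParsedEvent {
        String yNull;       // "Y" or null
        String description; // free text found after the event tag, if any
    }

    /**
     * Decides where the value following an event tag (DEAT, BURI, CREM, ...) goes.
     * Blank or null: nothing is stored. Exactly "Y": stored in yNull.
     * Anything else: stored as description text.
     */
    static ParsedEvent classify(String lineValue) {
        ParsedEvent event = new ParsedEvent();
        if (lineValue == null || lineValue.trim().isEmpty()) {
            // Assumption: whitespace-only counts as blank; both fields stay null.
            return event;
        }
        if ("Y".equals(lineValue)) {
            event.yNull = "Y";
        } else {
            // Note that a lowercase "y" lands here too, since only uppercase "Y" is special.
            event.description = lineValue;
        }
        return event;
    }
}
```

Under these assumptions, `classify("Y")` sets only `yNull`, while `classify("Died in his sleep")` sets only `description`.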

I still need to work out what to do, though, when writing this data out. I really don't want to emit GEDCOM files that don't conform to the spec. Once I have a solution for this, I will strike a release and put a JAR file out there for download, but for now, someone who really wants this now will need to build from source.

frizbog commented 10 years ago

I have decided not to write non-standard data. Instead, I have enhanced the validators to flag any event with a non-null, non-empty description field as a validation finding. This way, users can still parse non-standard files (such as those FTM writes) and work with them with no trouble. If they want to write the files out, though, they need to either move that data elsewhere in the object model as they see fit (such as to a Note), or suppress the validator on write and accept that the description will not be written to the output file.
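As an illustration only, the check and the "move it to a Note" workaround might look something like the sketch below. The names here (`EventDescriptionCheckSketch`, `Event`, `validate`, `moveDescriptionToNote`) are hypothetical stand-ins, not the real gedcom4j validator or model classes.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch; these are not the gedcom4j validator or model classes. */
final class EventDescriptionCheckSketch {

    /** Stand-in for an event with a tag, a description, and a list of notes. */
    static final class Event {
        String tag;
        String description;
        final List<String> notes = new ArrayList<>();
    }

    /** Returns one finding if the event carries a non-null, non-empty description. */
    static List<String> validate(Event event) {
        List<String> findings = new ArrayList<>();
        if (event.description != null && !event.description.isEmpty()) {
            findings.add(event.tag + " event has a description, which is non-standard "
                    + "and would not be written out");
        }
        return findings;
    }

    /** One way to keep the text: move it into the event's notes and clear the field. */
    static void moveDescriptionToNote(Event event) {
        if (event.description != null && !event.description.isEmpty()) {
            event.notes.add(event.description);
            event.description = null;
        }
    }
}
```

In this sketch, calling `moveDescriptionToNote` before `validate` leaves no finding for that event, and the text survives as a note.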

I'm considering this closed, and this evening I will be releasing a new build that includes this fix.