FamilySearch / gedcom5-conversion

Utilities for GEDCOM 5.5 to GEDCOM X Conversion

Duplicate SOUR references aren't recognized #6

Open jralls opened 12 years ago

jralls commented 12 years ago

The GeditCom Torture Test cites the source for most events similar to this:

    2 SOUR @SOURCE1@
    3 PAGE 42
    3 DATA
    4 DATE 31 DEC 1900
    4 TEXT Some number of children source text.
    3 QUAY 3
    3 NOTE Am number of children source note.

Gedcom5-conversion discards the TEXT and NOTE tags, and the remaining data are the same for all of the citations, but gedcom5-conversion creates a separate instance of 
```xml
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<rdf:Description xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:foaf="http://xmlns.com/foaf/0.1/" xmlns:contact="http://www.w3.org/2000/10/swap/pim/contact#" xmlns:ns4="http://purl.org/dc/terms/" xmlns:gx="http://gedcomx.org/" xmlns:gxc="http://gedcomx.org/conclusion/v1/" rdf:ID="SOURCE1-1">
    <ns4:partOf>
        <rdf:value>descriptions/SOURCE1</rdf:value>
    </ns4:partOf>
    <ns4:description>
        <rdf:value>42</rdf:value>
    </ns4:description>
</rdf:Description>
```

for each one. This is a colossal waste of space. While the torture test is an extreme case, it is not at all uncommon with real data to need to reuse a citation for several events.
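
To illustrate the kind of deduplication being asked for, here is a minimal sketch in Python. It is not the gedcom5-conversion code or its API: `Citation` and `CitationRegistry` are hypothetical names, and the choice of key fields (source ID, PAGE, DATA/DATE, QUAY) is an assumption based on the fields that are not discarded above. The `SOURCE1-1` style ID mirrors the `rdf:ID` in the output shown, on the assumption that the suffix is a per-source counter.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class Citation:
    """Hypothetical record of the citation fields assumed to survive conversion."""
    source_id: str                  # e.g. "SOURCE1" from "2 SOUR @SOURCE1@"
    page: Optional[str] = None      # 3 PAGE
    date: Optional[str] = None      # 4 DATE under 3 DATA
    quality: Optional[str] = None   # 3 QUAY


class CitationRegistry:
    """Hypothetical helper: one description ID per distinct citation."""

    def __init__(self) -> None:
        self._ids: Dict[Citation, str] = {}
        self._counters: Dict[str, int] = {}

    def id_for(self, citation: Citation) -> Tuple[str, bool]:
        """Return (description_id, is_new); is_new is False for a repeat."""
        existing = self._ids.get(citation)
        if existing is not None:
            return existing, False
        # First time this exact citation is seen: allocate the next
        # per-source suffix (assumed ID scheme, e.g. "SOURCE1-1").
        n = self._counters.get(citation.source_id, 0) + 1
        self._counters[citation.source_id] = n
        new_id = f"{citation.source_id}-{n}"
        self._ids[citation] = new_id
        return new_id, True


registry = CitationRegistry()
first = registry.id_for(Citation("SOURCE1", page="42", date="31 DEC 1900", quality="3"))
repeat = registry.id_for(Citation("SOURCE1", page="42", date="31 DEC 1900", quality="3"))
other = registry.id_for(Citation("SOURCE1", page="43"))
```

With something like this, a converter would only write a new description when `is_new` is true and would otherwise point the event at the existing ID. A frozen dataclass is used so that equal citations hash to the same dictionary key.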