dthaler / gedcom-citations

GEDCOM extensions for citations

Observations regarding citations in the current GEDCOM specification and the potential inclusion of templates. #14

Open Norwegian-Sardines opened 9 months ago

Norwegian-Sardines commented 9 months ago

I wanted to add to this group some initial thoughts that I had previously discussed in another thread not found in this group.

The basic issue is that the GEDCOM Specification does not provide the data tags the genealogy software industry needs to build well-written citations in the Evidence Explained ("EE") style. "EE" has become the de facto standard for citing genealogical documents, although many individuals have told me in non-scientific surveys that it is too cumbersome to apply to their own documents, so they do not use it.

I'm not an expert on "EE", and since I've retired and moved away I no longer have regular access to the "EE" book at my library, so I can't be too specific. At this stage that does not matter: I want the system to allow any data to be stored that is needed to build a well-written citation, and I'm only looking at the basic outline!

Three possible solutions to the "citation" concern that some applications may have:

1) The GEDCOM Specification has, for the most part, always included the basic elements needed to cite the source of a fact! These elements are located in 3 different record types within the GEDCOM specification. Continue with this design and add any fields/tags needed to be more inclusive of all basic citation designs. This would require a review of citation designs to determine the missing tags.

2) Implement some form of the template solution currently found in software today. This would include the following GEDCOM changes:

a) Create an Official Template Record (Including a new template substitution language)
b) Augment the Source_Record to accommodate template data.
c) Augment the Source_Citation structure to accommodate template data.
d) Augment the Repository_Record structure to accommodate template data.

3) Implement a more robust template structure using two new record types:

a) Create an Official Template Record (Including a new template substitution language)
b) Create a Citation_Record to house citation-specific data, rather than using any current citation-based record types and structures.
c) Leave the Source_Citation Structure as is
d) Leave the Source_Record as is
e) Leave the Repository_Record as is

I will expand on these concepts in a later entry.

Norwegian-Sardines commented 6 months ago

(Option 1) above is the simplest and most straightforward upgrade to the current GEDCOM citation model. We know that several programs try to use Elizabeth Shown Mills's "Evidence Explained" citation standard for genealogy; however, most amateur genealogists and family historians find the design confusing. They therefore either do not use the "EE" model at all or find some way to simplify the structure. I've noted in my research that multiple "Simple Citation" constructs have been introduced by universities and genealogy clubs in the years since "EE" was published.

It is important to understand, however, that individuals who publish their work do need to understand the methods Mills has outlined and put them into practice. This means that some form of citation data storage and transfer is required in GEDCOM. The real question is how to implement that storage and transfer.

Because of the wide breadth of citation models in the "EE" documentation, and because the number of specific models cannot be counted, supporting the "EE" concepts means accommodating an effectively unbounded set of possibilities and requires a great deal of study. For example, the Mills book currently offers no citation model for sources beyond the basic Western countries, and it may not cover some more obscure source types. I personally needed to contact "EE" scholars to determine the correct model for some of my sources in Norway. Using "EE" therefore requires learning and understanding the basic 1-2-3 layer pattern and what information goes into each layer.

To handle the tedious part of developing the layout patterns, software companies have implemented "Templates". These templates present data entry fields based on the needs of each citation model and assemble the individual parts into an "EE"-based citation sentence for the model selected. This makes it easy for users to build a citation for a source type, IF that source type has a provided template. If no template is available for a source type, the user must either use one that closely fits or build their own, potentially diminishing the value of the "EE" structure.
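As a rough illustration of how such a template assembles entered parts into a citation sentence, here is a minimal Python sketch. The template layout, the field names and the sample values are all made up for this example; they are not taken from "EE" or from any particular program.

```python
from string import Template

# Hypothetical template for one source type. The layout and the field
# names are illustrative only, not an actual "EE" model.
CENSUS_TEMPLATE = Template("$jurisdiction, $title, $year, p. $page; $repository.")


def build_citation(template: Template, fields: dict) -> str:
    """Substitute the entered fields into the template.

    Template.substitute raises KeyError when a required field is missing,
    which mirrors a data entry form with an unfilled mandatory part.
    """
    return template.substitute(fields)


# Made-up sample values, as a user might enter them in the template's form.
fields = {
    "jurisdiction": "Norway",
    "title": "Population census",
    "year": "1910",
    "page": "12",
    "repository": "Example Archive",
}

citation = build_citation(CENSUS_TEMPLATE, fields)
print(citation)
```

A source type without a matching template has no such layout to fall back on, which is the situation described above where the user must pick the closest fit or write their own.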

(Option 2), as described above, is what many programs use to implement the "EE" structure. The design works within each independent program, which can build the appropriate citation text internally. However, every program implements the design differently, with varying levels of output. Some programs capture all the necessary data within the GEDCOM file but do not provide a finished citation of that fact, while others capture the data and provide a finished citation.

THIS IS THE KEY: I believe that when transferring data via GEDCOM, the most valuable information about the source of a fact is a properly built, finished/complete citation that the receiving program and human reader can use directly, either to include in published works or to deconstruct (thanks to its common design) in order to find and review the document from which the fact was drawn. No other information is really needed by the receiving program or user; it is only an added bonus that the individual bits of information could be included as separate fields in the GEDCOM.

It is this Key Point that tells me that the first addition to a future GEDCOM design should be a place to transfer the complete/full citation. This can be done with a "Citation_Record" containing a complete citation built from the individual parts found elsewhere in the GEDCOM, or with a less invasive addition: a single new tag in the "Citation_Structure". Because no templates are passed in the GEDCOM, and because with a complete citation the individual data points are optional rather than required, a new template substitution language is not needed, and the change could ship in a minor release!
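To make the "single new tag" variant concrete, here is a minimal Python sketch that emits a source citation carrying the finished citation text. The tag name `_FCIT` is purely hypothetical (written as an underscore-prefixed extension tag), and the function name and sample values are invented for this example. The sketch also ignores line-length limits and continuation lines.

```python
from typing import List, Optional


def source_citation_lines(
    level: int,
    source_xref: str,
    full_citation: str,
    page: Optional[str] = None,
) -> List[str]:
    """Return GEDCOM-style lines for one source citation.

    `_FCIT` is a hypothetical extension tag holding the complete citation;
    it is not part of any published GEDCOM specification.
    """
    lines = [f"{level} SOUR {source_xref}"]
    if page is not None:
        # The individual data point stays optional alongside the full text.
        lines.append(f"{level + 1} PAGE {page}")
    lines.append(f"{level + 1} _FCIT {full_citation}")
    return lines


lines = source_citation_lines(2, "@S1@", "Example full citation text.", page="12")
print("\n".join(lines))
```

A receiving program that does not know the tag could skip it, while one that does could print the text as-is, with no template language involved.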

(Option 3) goes beyond the Key Point above and implements a more robust Citation_Record and Template construct, but needs a lot more design time and thought.

albertemmerich commented 1 month ago

I disagree with your key point that GEDCOM should transfer the complete/full citation. In my view GEDCOM should have separate structures: on one side the basic data, taken from the source and held in a citation record, and on the other side rules for assembling that basic data into text using a template. The basic data are taken from the source and are therefore independent of how they are presented when publishing. The template itself is language dependent: if you cite a page or record entry number in your source, the full citation text will use "volume", "chapter", "page", and "record number" in English, or "Band", "Kapitel", "Seite", and "Eintrag Nummer" in German, and so on for many other languages. So on the one side I see language-independent citation data (translations may be included as an add-on), and these should be transferred by GEDCOM. On the other side I see language-dependent templates, which take the citation data and build the citation text used when publishing.

If I only had the text created by the template module, I would have great difficulty separating the language-dependent template part from the original citation part when translating to another language.

So the better way would be to have separate citation records, plus template records giving the rules for how to put together the pieces of the citations.
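A minimal Python sketch of this separation, assuming a very simple "label plus value" template: the citation data are stored once, and only the labels change per language. The dictionary layout, the `render` function and the sample values are assumptions made for this example, not a proposed GEDCOM structure.

```python
# Language-independent citation data (made-up sample values).
CITATION_DATA = {"volume": "3", "page": "12", "record": "45"}

# Language-dependent template parts: only the labels differ.
LABELS = {
    "en": {"volume": "volume", "page": "page", "record": "record number"},
    "de": {"volume": "Band", "page": "Seite", "record": "Eintrag Nummer"},
}

# Order in which the pieces appear in the rendered text.
ORDER = ("volume", "page", "record")


def render(data: dict, lang: str) -> str:
    """Build the citation text for one language from the same data."""
    labels = LABELS[lang]
    return ", ".join(f"{labels[key]} {data[key]}" for key in ORDER if key in data)


english = render(CITATION_DATA, "en")
german = render(CITATION_DATA, "de")
print(english)
print(german)
```

Going the other way, from a rendered text back to the data, would mean parsing out the labels for each language, which is the difficulty described above.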