agiorguk / gemini

Resources relating to the UK Gemini metadata profile

DD3 R13. Add a GEMINI sub-element equivalent to LI_Lineage source #47

Open PeterParslow opened 3 years ago

PeterParslow commented 3 years ago

Add a GEMINI sub-element equivalent to LI_Lineage.source, to give a dataset that is derived from other datasets a better way to acknowledge its sources. This may require redefining the existing GEMINI Lineage element as “Lineage statement”.

Note: the ISO 19115 model for this is quite complex. The most likely path is for LI_Lineage.source.LI_Source.sourceCitation.CI_Citation.identifier.MD_Identifier.code to match the Resource identifier of the source, possibly by both carrying Anchors to the same URL.

Note: ISO 19115 says “Either the “description” or “sourceExtent” element of LI_Source must be documented”, so it would be necessary to provide one of them in addition to the ‘identifier.code’.

Another possibility is for the metadata of the source (i.e. the dataset that expects to be reused!) to provide a DOI or a full LI_Source object which can be referenced by gmd:source xlink:href.

At present, there is a hint about DOI use hidden in the encoding guidance for Alternative title. Consider whether Alternative title or Resource identifier is the more appropriate place to put a DOI that re-users should use.
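
For illustration, here is a minimal sketch of how the two options described above might look in ISO 19139 XML. It is not agreed guidance. The example.org URLs are hypothetical placeholders, the title simply reuses the source name from this issue, and gco:nilReason is assumed as the way to leave the mandatory citation date empty.

```xml
<!-- Illustrative sketch only; the example.org URLs are hypothetical. -->
<gmd:LI_Lineage xmlns:gmd="http://www.isotc211.org/2005/gmd"
                xmlns:gco="http://www.isotc211.org/2005/gco"
                xmlns:gmx="http://www.isotc211.org/2005/gmx"
                xmlns:xlink="http://www.w3.org/1999/xlink">
  <gmd:statement>
    <gco:CharacterString>derived from OS imagery</gco:CharacterString>
  </gmd:statement>
  <!-- Option 1: cite the source by an identifier code carried as an Anchor.
       The source dataset's own Resource identifier would carry an Anchor
       with the same xlink:href. -->
  <gmd:source>
    <gmd:LI_Source>
      <!-- description satisfies the "description or sourceExtent" rule -->
      <gmd:description>
        <gco:CharacterString>OS MasterMap Imagery Layer</gco:CharacterString>
      </gmd:description>
      <gmd:sourceCitation>
        <gmd:CI_Citation>
          <gmd:title>
            <gco:CharacterString>OS MasterMap Imagery Layer</gco:CharacterString>
          </gmd:title>
          <!-- assumption: the date is not known to the re-user -->
          <gmd:date gco:nilReason="unknown"/>
          <gmd:identifier>
            <gmd:MD_Identifier>
              <gmd:code>
                <gmx:Anchor xlink:href="https://example.org/id/source-dataset">OS MasterMap Imagery Layer</gmx:Anchor>
              </gmd:code>
            </gmd:MD_Identifier>
          </gmd:identifier>
        </gmd:CI_Citation>
      </gmd:sourceCitation>
    </gmd:LI_Source>
  </gmd:source>
  <!-- Option 2: point at an LI_Source object (or DOI) published by the source
       (hypothetical URL). -->
  <gmd:source xlink:href="https://example.org/id/source-dataset#lineage-source"/>
</gmd:LI_Lineage>
```

Option 1 keeps everything inline, so a record remains usable even if the URL stops resolving. Option 2 is shorter but relies on the source publisher keeping the referenced object available.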

| Source document | |
| --- | --- |
| Definition | information about the source data used in creating the resource |
| Obligation | Conditional (within the context of Lineage): mandatory if statement not provided |
| Occurrence | Multiple |
| Data type | CharacterString |
| Domain | Lineage source itself has two sub-elements both of which need to be populated: see below |
| Other Comments | Should match the Resource identifier of the cited source dataset. It is intended so that people assessing the dataset can easily find the source datasets that were used to produce it. |

(table in the table)

| | Description | Code |
| --- | --- | --- |
| Definition | Description of the source data | Reference of the source data |
| Obligation | Mandatory within a Lineage source | Mandatory within a Lineage source |
| Occurrence | Single | Single |
| Data type | CharacterString | CharacterString |
| Domain | | |
| Comment | | Matching the Resource identifier of the source |

Corresponding element in other standards

| Standard | Name | Comparison |
| --- | --- | --- |
| ISO 19115:2003 | LI_Lineage.source.LI_Source.sourceCitation.CI_Citation.identifier.MD_Identifier.code | Identical |
| ISO 19139:2007 | gmd:LI_Lineage/gmd:source/gmd:LI_Source/gmd:sourceCitation/gmd:CI_Citation/gmd:identifier/gmd:MD_Identifier/gmd:code | Identical |
| Schema.org | none | |

Encoding guidelines: To be added to the encoding guidelines of the Lineage element.

      <gmd:lineage>
        <gmd:LI_Lineage>
          <gmd:statement>
            <gco:CharacterString>derived from OS imagery</gco:CharacterString>
          </gmd:statement>
          <gmd:source>
            <gmd:LI_Source>
              <gmd:description>
                <gco:CharacterString>OS MasterMap Imagery Layer</gco:CharacterString>
              </gmd:description>
              <gmd:sourceCitation>
                <gmd:CI_Citation>
                  <gmd:title gco:nilReason="unknown"/>
                  <gmd:date gco:nilReason="unknown"/>
                  <gmd:identifier>
                    <gmd:MD_Identifier>
                      <gmd:code>
                        <gco:CharacterString>OS MasterMap Imagery Layer</gco:CharacterString>
                      </gmd:code>
                    </gmd:MD_Identifier>
                  </gmd:identifier>
                </gmd:CI_Citation>
              </gmd:sourceCitation>
            </gmd:LI_Source>
          </gmd:source>
        </gmd:LI_Lineage>
      </gmd:lineage>
Sgaff commented 3 years ago

Do you mean Resource identifier, Peter, or Resource locator? It seems we should be talking about Resource locator here, if that's possible. The code from Resource identifier can be meaningless if it is not accompanied by the codespace...

PeterParslow commented 3 years ago

The ISO 19115 model for LI_Source uses "citation", which I am sure means citing the source by identifier (or by title etc.). I do agree that the identifier code needs to be accompanied by the identifier codespace.
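
As a sketch of what that could mean in the encoding: in ISO 19115:2003, codeSpace belongs to RS_Identifier rather than MD_Identifier, so carrying a codespace would presumably mean using gmd:RS_Identifier inside gmd:identifier. The code and codespace values below are hypothetical placeholders, not taken from any real record.

```xml
<!-- Illustrative sketch; the code and codeSpace values are hypothetical. -->
<gmd:identifier xmlns:gmd="http://www.isotc211.org/2005/gmd"
                xmlns:gco="http://www.isotc211.org/2005/gco">
  <gmd:RS_Identifier>
    <gmd:code>
      <gco:CharacterString>example-dataset-0001</gco:CharacterString>
    </gmd:code>
    <gmd:codeSpace>
      <gco:CharacterString>https://example.org/datasets</gco:CharacterString>
    </gmd:codeSpace>
  </gmd:RS_Identifier>
</gmd:identifier>
```

A re-user would then match on the code and codespace pair, not on the code alone.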

Even 19115-1:2014 describes the source role and the LI_Source class as "information about the source used". In 19115-1, the LI_Source information block includes sourceMetadata as well as sourceCitation. We could encourage inclusion of the source's Resource locator, but that would presumably sit within sourceMetadata, which isn't available in 19115:2003?