SEMICeu / iso-19139-to-dcat-ap

Reference XSLT-based implementation of GeoDCAT-AP
European Union Public License 1.2

Missing transformation when Anchor element is used for the title of the specification or the thesaurus #4

Closed AntoRot closed 3 years ago

AntoRot commented 3 years ago

I performed some tests of the XSLT aligned with the draft of GeoDCAT-AP 2.0.0.

Regarding conformity: the title of the specification is empty in the RDF output of the transformation when, in the INSPIRE XML file, that title is encoded using the gmd:title/gmx:Anchor element, as recommended in the INSPIRE TG (see TG Recs C.11 and 1.10).

For example, this metadata in ISO 19139

         <gmd:report>
            <gmd:DQ_DomainConsistency>
               <gmd:result>
                  <gmd:DQ_ConformanceResult>
                     <gmd:specification>
                        <gmd:CI_Citation>
                           <gmd:title>
                              <gmx:Anchor xlink:href="http://data.europa.eu/eli/reg/2010/1089">REGOLAMENTO (UE) N. 1089/2010 DELLA COMMISSIONE del 23 novembre 2010 recante attuazione della direttiva 2007/2/CE del Parlamento europeo e del Consiglio per quanto riguarda l'interoperabilità dei set di dati territoriali e dei servizi di dati territoriali</gmx:Anchor>
                           </gmd:title>
                           <gmd:date>
                              <gmd:CI_Date>
                                 <gmd:date>
                                    <gco:Date>2010-12-08</gco:Date>
                                 </gmd:date>
                                 <gmd:dateType>
                                    <gmd:CI_DateTypeCode codeList="http://standards.iso.org/iso/19139/resources/gmxCodelists.xml#CI_DateTypeCode" codeListValue="publication">publication</gmd:CI_DateTypeCode>
                                 </gmd:dateType>
                              </gmd:CI_Date>
                           </gmd:date>
                        </gmd:CI_Citation>
                     </gmd:specification>
                     <gmd:explanation>
                        <gco:CharacterString>Fare riferimento alle specifiche indicate</gco:CharacterString>
                     </gmd:explanation>
                     <gmd:pass>
                        <gco:Boolean>false</gco:Boolean>
                     </gmd:pass>
                  </gmd:DQ_ConformanceResult>
               </gmd:result>
            </gmd:DQ_DomainConsistency>
         </gmd:report>

is transformed into

  <prov:Activity>
    <prov:used rdf:resource="https://geodati.gov.it/resource/id/r_basili:52F1BC6F-7597-B988-8702-FAEB036ACBA7"/>
    <prov:qualifiedAssociation rdf:parseType="Resource">
      <prov:hadPlan rdf:parseType="Resource">
        <prov:wasDerivedFrom rdf:parseType="Resource">
          <dct:title xml:lang="it"/>
          <dct:issued rdf:datatype="http://www.w3.org/2001/XMLSchema#date">2010-12-08</dct:issued>
        </prov:wasDerivedFrom>
      </prov:hadPlan>
    </prov:qualifiedAssociation>
    <prov:generated rdf:parseType="Resource">
      <dct:type rdf:resource="http://inspire.ec.europa.eu/metadata-codelist/DegreeOfConformity/notConformant"/>
      <dct:description xml:lang="it">Fare riferimento alle specifiche indicate</dct:description>
    </prov:generated>
  </prov:Activity>

where the title value is empty and the related URI is not used either.
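The lookup that is missing here can be sketched in Python as follows. This is only an illustration of the fallback logic, not the XSLT used in this repository: `citation_title` is a hypothetical helper, the namespace URIs are assumed to be the standard ISO 19139 ones, and the sample title is shortened for readability.

```python
import xml.etree.ElementTree as ET

# Standard ISO 19139 namespace URIs (assumed to match the input record).
NS = {
    "gmd": "http://www.isotc211.org/2005/gmd",
    "gco": "http://www.isotc211.org/2005/gco",
    "gmx": "http://www.isotc211.org/2005/gmx",
}
XLINK_HREF = "{http://www.w3.org/1999/xlink}href"


def citation_title(citation):
    """Return (text, uri) for the gmd:title of a gmd:CI_Citation element.

    Hypothetical helper: tries gco:CharacterString first, then falls back
    to gmx:Anchor. uri is None unless the title is an Anchor with xlink:href.
    """
    plain = citation.find("gmd:title/gco:CharacterString", NS)
    if plain is not None and (plain.text or "").strip():
        return plain.text.strip(), None
    anchor = citation.find("gmd:title/gmx:Anchor", NS)
    if anchor is not None:
        return (anchor.text or "").strip(), anchor.get(XLINK_HREF)
    return "", None


# Sample citation modelled on the record above (title shortened).
SAMPLE = """
<gmd:CI_Citation xmlns:gmd="http://www.isotc211.org/2005/gmd"
                 xmlns:gmx="http://www.isotc211.org/2005/gmx"
                 xmlns:xlink="http://www.w3.org/1999/xlink">
  <gmd:title>
    <gmx:Anchor xlink:href="http://data.europa.eu/eli/reg/2010/1089">REGOLAMENTO (UE) N. 1089/2010</gmx:Anchor>
  </gmd:title>
</gmd:CI_Citation>
"""

title, uri = citation_title(ET.fromstring(SAMPLE))
print(title)
print(uri)
```

With both values available, a transformation could emit the text as the title and, if desired, reuse the URI to identify the specification.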

The same occurs with the title of the thesaurus when that title is encoded using the Anchor element but the keyword values are encoded using the gco:CharacterString element (I know this sounds bizarre, but it does happen).

For example, this metadata in ISO 19139

         <gmd:descriptiveKeywords>
            <gmd:MD_Keywords>
               <gmd:keyword>
                  <gco:CharacterString>Gestione dell'acqua</gco:CharacterString>
               </gmd:keyword>
               <gmd:thesaurusName>
                  <gmd:CI_Citation>
                     <gmd:title>
                        <gmx:Anchor xlink:href="http://www.eionet.europa.eu/gemet/">GEMET - Concepts, version 2.4</gmx:Anchor>
                     </gmd:title>
                     <gmd:date>
                        <gmd:CI_Date>
                           <gmd:date>
                              <gco:Date>2010-01-13</gco:Date>
                           </gmd:date>
                           <gmd:dateType>
                              <gmd:CI_DateTypeCode codeList="http://standards.iso.org/ittf/PubliclyAvailableStandards/ISO_19139_Schemas/resources/codelist/gmxCodelists.xml#CI_DateTypeCode" codeListValue="publication">Pubblicazione</gmd:CI_DateTypeCode>
                           </gmd:dateType>
                        </gmd:CI_Date>
                     </gmd:date>
                  </gmd:CI_Citation>
               </gmd:thesaurusName>
            </gmd:MD_Keywords>
         </gmd:descriptiveKeywords>

is transformed into

    <dcat:theme rdf:parseType="Resource">
      <skos:prefLabel xml:lang="it">Gestione dell'acqua</skos:prefLabel>
      <skos:inScheme>
        <skos:ConceptScheme>
          <dct:title xml:lang="it"/>
          <dct:issued rdf:datatype="http://www.w3.org/2001/XMLSchema#date">2010-01-13</dct:issued>
        </skos:ConceptScheme>
      </skos:inScheme>
    </dcat:theme>

where again the title value of the thesaurus is empty.
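Empty titles like the two above can be spotted in the RDF/XML output with a small check. The following is a rough Python sketch, assuming the output is serialised as RDF/XML with the usual Dublin Core Terms namespace; `empty_titles` is a hypothetical helper name, and the sample is a trimmed copy of the output shown above wrapped in an rdf:RDF root.

```python
import xml.etree.ElementTree as ET

# Clark-notation tag for dct:title (Dublin Core Terms namespace).
DCT_TITLE = "{http://purl.org/dc/terms/}title"


def empty_titles(rdf_xml):
    """Return every dct:title element whose text is missing or blank."""
    root = ET.fromstring(rdf_xml)
    return [el for el in root.iter(DCT_TITLE) if not (el.text or "").strip()]


# Trimmed copy of the transformation output, wrapped in rdf:RDF.
SAMPLE = """
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dct="http://purl.org/dc/terms/"
         xmlns:skos="http://www.w3.org/2004/02/skos/core#">
  <skos:ConceptScheme>
    <dct:title xml:lang="it"/>
    <dct:issued rdf:datatype="http://www.w3.org/2001/XMLSchema#date">2010-01-13</dct:issued>
  </skos:ConceptScheme>
</rdf:RDF>
"""

print(len(empty_titles(SAMPLE)))
```

Running the same check on the output after a fix should report no empty titles for records that carry an Anchor-encoded title.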

andrea-perego commented 3 years ago

Thanks for spotting this bug, @AntoRot.

Can you please provide the URLs of the records in your examples (or of similar ones), so that the fixes can be tested?

AntoRot commented 3 years ago

Thanks @andrea-perego.

An XML record where the title of the specification is encoded using the Anchor element is available at the URL https://geodati.gov.it/RNDT/rest/document?id=r_basili:399969B4-7CC7-6387-398C-549DD1CC3EA8 (or, as a CSW GetRecords request, https://geodati.gov.it/RNDT/csw?request=GetRecords&service=CSW&version=2.0.2&resultType=results&outputSchema=http://www.isotc211.org/2005/gmd&outputFormat=application/xml&typeNames=csw:Record&elementSetName=full&constraintLanguage=Filter&constraint_language_version=1.1.0&startPosition=1&maxRecords=10&Constraint=%3CFilter%3E%3CPropertyIsEqualTo%3E%3CPropertyName%3Eidentifier%3C/PropertyName%3E%3CLiteral%3Er_basili:399969B4-7CC7-6387-398C-549DD1CC3EA8%3C/Literal%3E%3C/PropertyIsEqualTo%3E%3C/Filter%3E ).

An example of an XML record where the title of the thesaurus is encoded using the Anchor element but the keyword values are encoded using the gco:CharacterString element is available at the URL https://github.com/AgID/rndt-guidance/blob/master/metadata/examples/esempio-dataset-rndt-2.0.xml

andrea-perego commented 3 years ago

Thanks, @AntoRot.

I updated the XSLT via PR https://github.com/SEMICeu/iso-19139-to-dcat-ap/pull/5

The results can be checked here:

andrea-perego commented 3 years ago

Unless there are any objections, I propose that we consider this bug fixed and close this issue.

AntoRot commented 3 years ago

No objection from my side. Thanks.

andrea-perego commented 3 years ago

Thanks, @AntoRot.