SEMICeu / iso-19139-to-dcat-ap

Reference XSLT-based implementation of GeoDCAT-AP
European Union Public License 1.2

dct:identifier left blank when gmd:MD_Identifier is a gmx:Anchor? #61

Open cnlarsen opened 5 days ago

cnlarsen commented 5 days ago

There seems to be a difference in how identifiers are mapped depending on whether they are written as a gco:CharacterString or a gmx:Anchor element. Compare the results in the files I've attached.

Anchor DCAT.txt Anchor ISO.txt CharacterString DCAT.txt CharacterString ISO.txt

The metadata for the Anchor example can also be found here: https://geodata-info.dk/srv/api/records/4af59a21-5c6c-4a5d-8a8f-452c9a0f093b/formatters/iso19139?output=xml

And for the Character example here: https://geodata-info.dk/srv/api/records/1a4dd29a-3512-4b47-9cb8-790124668f1e/formatters/iso19139?output=xml

A dct:identifier property is created for both, but in the Anchor example it is left empty, whereas in the CharacterString example it contains the gmd:MD_Identifier value.

However, in the Anchor example, the gmd:MD_Identifier value is used as the URI of the dataset at the top level (rdf:Description rdf:about="https://geo.data.gov.dk/dataset/4af59a21-5c6c-4a5d-8a8f-452c9a0f093b") and in the CatalogRecord, linking the CatalogRecord to the dataset. This is not the case for the CharacterString example.

Is this working as intended? I'm not an expert on ISO metadata or the finer details of how the mapping works, but I would assume that, ideally, a combination of the features I've listed above (mapping to dct:identifier, to the dcat:Dataset/rdf:Description level, and to the CatalogRecord) would be the result regardless of whether an Anchor or a CharacterString element is used.

NielsHoffmann commented 4 hours ago

I had a look through the XSLT (out of curiosity :-) ) and I think this happens because the value for dct:identifier is only matched from within the gmd:identificationInfo element (in the section starting at xsl:template name="UniqueResourceIdentifier" in the XSLT). There, the lookup specifically targets gco:CharacterString, so it does not pick up the value from the gmx:Anchor element.

Correction: it is actually picked up as the identifier of the CatalogRecord.

There is another part of the logic in the XSLT that picks up the anchor and uses it for the rdf:about attribute of the rdf:Description.

I think it would be nice if the XSLT were more robust and picked this up in both cases, although technically speaking dct:identifier is not mandatory in the GeoDCAT-AP profile.
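
To illustrate what "more robust" could mean, here is a minimal Python sketch (not the project's XSLT, and not a proposed patch) that reads the resource identifier from gmd:code whether it is written as a gco:CharacterString or as a gmx:Anchor. The function name `resource_identifiers` and the `SAMPLE` document are made up for illustration; the XPath is an assumption about where the identifier sits in a typical ISO 19139 record.

```python
import xml.etree.ElementTree as ET

# Standard ISO 19139 namespace URIs.
NS = {
    "gmd": "http://www.isotc211.org/2005/gmd",
    "gco": "http://www.isotc211.org/2005/gco",
    "gmx": "http://www.isotc211.org/2005/gmx",
}

# Hypothetical, heavily trimmed record; {code} is filled in by the caller.
SAMPLE = """<gmd:MD_Metadata xmlns:gmd="http://www.isotc211.org/2005/gmd"
    xmlns:gco="http://www.isotc211.org/2005/gco"
    xmlns:gmx="http://www.isotc211.org/2005/gmx"
    xmlns:xlink="http://www.w3.org/1999/xlink">
  <gmd:identificationInfo><gmd:MD_DataIdentification>
    <gmd:citation><gmd:CI_Citation>
      <gmd:identifier><gmd:MD_Identifier>
        <gmd:code>{code}</gmd:code>
      </gmd:MD_Identifier></gmd:identifier>
    </gmd:CI_Citation></gmd:citation>
  </gmd:MD_DataIdentification></gmd:identificationInfo>
</gmd:MD_Metadata>"""


def resource_identifiers(xml_text):
    """Return identifier strings found under gmd:identificationInfo.

    Accepts either encoding of gmd:code: gco:CharacterString or gmx:Anchor.
    """
    root = ET.fromstring(xml_text)
    # Assumed location: any gmd:identifier/<identifier type>/gmd:code below identificationInfo.
    codes = root.findall(".//gmd:identificationInfo//gmd:identifier/*/gmd:code", NS)
    found = []
    for code in codes:
        node = code.find("gco:CharacterString", NS)
        if node is None:
            # Fall back to the Anchor form instead of giving up.
            node = code.find("gmx:Anchor", NS)
        if node is not None and node.text and node.text.strip():
            found.append(node.text.strip())
    return found


if __name__ == "__main__":
    as_string = SAMPLE.format(code="<gco:CharacterString>abc-123</gco:CharacterString>")
    as_anchor = SAMPLE.format(
        code='<gmx:Anchor xlink:href="https://example.org/id/abc-123">abc-123</gmx:Anchor>'
    )
    print(resource_identifiers(as_string))
    print(resource_identifiers(as_anchor))
```

In XSLT terms the equivalent idea would be to select on both child elements rather than on gco:CharacterString alone; whether the Anchor's text or its xlink:href should end up in dct:identifier is a separate decision that this sketch does not settle.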