cessda / cessda.cdc.versions

Issue tracker and wiki for the CESSDA Data Catalogue
https://datacatalogue.cessda.eu/
Apache License 2.0

Update SND endpoint due to restructuring of the SND catalogue #564

Closed matthew-morris-cessda closed 1 year ago

matthew-morris-cessda commented 1 year ago

The MO IT team received an email from SND about changes to their OAI-PMH endpoint.

Hi! We just launched a major restructuring of the SND catalogue and this will have effects on our OAI-PMH provider. The datasets were previously grouped under a study and now they are represented as standalone datasets with versions. The identifiers for the sets have also changed to use the standard identifier with a type prefix, e.g. https://snd.gu.se/oai-pmh?verb=ListRecords&metadataPrefix=ddi25&set=subject:ssif:5 (all datasets categorized under Social Science in the Swedish version of Field of Research and Development). So you might have to change the configuration for the harvester.

We are also currently rebuilding the metadata exports so we will get more fields into DDI 2.5 and do a complete rewrite of the DDI-L export.

Best regards, Olof
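
For the harvester configuration, the new set identifiers go straight into the set parameter of a ListRecords request. A minimal sketch of building such a URL follows. The endpoint, metadataPrefix and set value are taken from the email above; the function name and its defaults are hypothetical and are not part of the CESSDA harvester.

```python
from urllib.parse import urlencode

# Endpoint quoted in the email from SND.
BASE_URL = "https://snd.gu.se/oai-pmh"


def list_records_url(set_spec, metadata_prefix="ddi25", base_url=BASE_URL):
    """Build a ListRecords URL for one set (hypothetical helper)."""
    params = {
        "verb": "ListRecords",
        "metadataPrefix": metadata_prefix,
        "set": set_spec,
    }
    # safe=":" leaves the colons in the setSpec unescaped, matching the
    # form of the URL shown in the email.
    return f"{base_url}?{urlencode(params, safe=':')}"


if __name__ == "__main__":
    print(list_records_url("subject:ssif:5"))
```

With the values from the email, this should reproduce the example URL that SND sent.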

This will require the following changes

john-shepherdson commented 1 year ago

BASE validation results (http://oval.base-search.net/)

Screenshot 2023-06-15 at 10 34 54

john-shepherdson commented 1 year ago

OAI-PMH Validator results

Screenshot 2023-06-15 at 10 44 20

borsna commented 1 year ago

A bit confused by the validation error for the use of : in setSpec. In the examples it is used the same way we use it, to structure sets for subjects, principals etc., but in the same place : is listed as not valid for setSpec: http://www.openarchives.org/OAI/openarchivesprotocol.html#Set

matthew-morris-cessda commented 1 year ago

On further investigation, this is reproducible in the OAI-PMH endpoint with a ListRecords request, for example for record 2023-101-1:

...
    <record>
      <header>
        <identifier>2023-101-1</identifier>
        <datestamp>2023-05-24T06:51:22Z</datestamp>
        <setSpec>principal:</setSpec>
        <setSpec>subject:cessda:Politics.Elections</setSpec>
        <setSpec>subject:cessda:Politics.PoliticalBehaviourAndAttitudes</setSpec>
        <setSpec>subject:ssif:5</setSpec>
        <setSpec>subject:ssif:506</setSpec>
        <setSpec>subject:cessda:Politics</setSpec>
      </header>
...

Note that the first setSpec is principal: with nothing after the colon, which is what the validators reject.
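
The validator finding can be reproduced locally with a short check. This is a sketch, not the validators' actual implementation: the regular expression is intended to mirror the setSpec restriction in the OAI-PMH 2.0 XML schema (colon-separated, non-empty elements of URI unreserved characters) and should be checked against the schema before being relied on. The header is the one quoted above, without the OAI-PMH namespace that a full response would carry, and the function name is hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# Assumed to mirror the setSpec pattern in the OAI-PMH 2.0 schema:
# one or more non-empty elements separated by single colons.
SET_SPEC_PATTERN = re.compile(r"[A-Za-z0-9\-_.!~*'()]+(:[A-Za-z0-9\-_.!~*'()]+)*")

# Header as quoted in this thread (namespace omitted for brevity).
HEADER = """<header>
  <identifier>2023-101-1</identifier>
  <datestamp>2023-05-24T06:51:22Z</datestamp>
  <setSpec>principal:</setSpec>
  <setSpec>subject:cessda:Politics.Elections</setSpec>
  <setSpec>subject:cessda:Politics.PoliticalBehaviourAndAttitudes</setSpec>
  <setSpec>subject:ssif:5</setSpec>
  <setSpec>subject:ssif:506</setSpec>
  <setSpec>subject:cessda:Politics</setSpec>
</header>"""


def invalid_set_specs(header_xml):
    """Return the setSpec values in a header that fail the pattern."""
    root = ET.fromstring(header_xml)
    specs = [(el.text or "") for el in root.iter("setSpec")]
    return [s for s in specs if not SET_SPEC_PATTERN.fullmatch(s)]


if __name__ == "__main__":
    print(invalid_set_specs(HEADER))
```

Under these assumptions only principal: is flagged, because the colon must be followed by at least one character. The colons in subject:ssif:5 and the other subject sets are accepted as hierarchy separators.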

borsna commented 1 year ago

Ah, got it. I was looking at the set listing. This is an error where the domain is missing for this principal. I will do a fix and push it so principal sets are generated in the correct way.

matthew-morris-cessda commented 1 year ago

See https://snd.gu.se/oai-pmh?verb=GetRecord&metadataPrefix=oai_dc&identifier=2023-69-1

borsna commented 1 year ago

@matthew-morris-cessda fixed the principal: sets now. A principal setSpec should no longer be generated if the domain for the principal is not set for the dataset.

matthew-morris-cessda commented 1 year ago

Confirmed fixed

john-shepherdson commented 1 year ago

See https://github.com/cessda/cessda.metadata.harvester/pull/22

john-shepherdson commented 1 year ago

1,316 records harvested. 0 XML schema violations, 0 constraint violations.