Closed cessda-bitbucket-importer closed 2 years ago
Original comment by Matthew Morris (GitHub: matthew-morris-cessda).
@TainaFSD Do you have an example of a record where this is the case?
Original comment by Taina Jääskeläinen.
I think that ADP uses this format.
Original comment by Matthew Morris (GitHub: matthew-morris-cessda).
Given that ADP's OAI-PMH endpoint is not responding, and I can't find another example to work with, I'm putting this on hold for now.
2020-11-30 12:25:51.691 ERROR (LocalHarvesterConsumerService.java:68) - [ADP] ListRecordHeaders failed: eu.cessda.pasc.oci.exception.XMLParseException: Parsing https://www.adp.fdv.uni-lj.si/v0/oai?verb=ListIdentifiers&metadataPrefix=oai_ddi25 failed: eu.cessda.pasc.oci.exception.HTTPException: Server returned 503
Original comment by Taina Jääskeläinen.
Check what ADP OAI-PMH currently has for language.
Original comment by John Shepherdson (GitHub: john-shepherdson).
@matthew-morris-cessda ADP's endpoint is responding. See https://www.adp.fdv.uni-lj.si/v0/oai?verb=ListIdentifiers&metadataPrefix=oai_ddi25
Original comment by Matthew Morris (GitHub: matthew-morris-cessda).
I’ve decided to strip language codes like en-GB by removing the dash and all characters after it. This results in en-GB being transformed to en.
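For illustration, the stripping described above could look roughly like the following. This is a minimal sketch, not the code from the actual fix: the class name `LanguageCodes` and the method `stripRegion` are hypothetical, and it assumes the xml:lang value uses a dash as the separator (values such as en_GB with an underscore are left untouched).

```java
/** Hypothetical helper for illustration only; not taken from the CDC code base. */
public final class LanguageCodes {

    private LanguageCodes() {
        // Utility class, not meant to be instantiated.
    }

    /**
     * Returns the part of an xml:lang value before the first dash,
     * so "en-GB" becomes "en". Values without a dash are returned
     * trimmed but otherwise unchanged; null is passed through.
     */
    public static String stripRegion(String xmlLang) {
        if (xmlLang == null) {
            return null;
        }
        String trimmed = xmlLang.trim();
        int dash = trimmed.indexOf('-');
        // Keep only the primary language subtag when a dash is present.
        return dash >= 0 ? trimmed.substring(0, dash) : trimmed;
    }

    public static void main(String[] args) {
        for (String code : new String[] {"en-GB", "en-AU", "sl-SI", "fi"}) {
            System.out.println(code + " -> " + stripRegion(code));
        }
    }
}
```

Note that after this transformation en-GB and en-AU both collapse to en, so any distinction between country variants is lost at this stage.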
Original comment by Taina Jääskeläinen.
Issue #334 was marked as a duplicate of this issue.
Original comment by Matthew Morris (GitHub: matthew-morris-cessda).
Fixed in <link to pull request removed>.
Original report on BitBucket by Taina Jääskeläinen.
As some countries use language and country code combinations, can CDC read only the first two characters in the xml:lang attributes and ignore the rest? The first two-letter part is the ISO 639-1 language code.
This is easier than making SPs change their legacy metadata. EQB also has two elements where this combination is needed, to distinguish between questions asked in the UK and in Australia, for instance.
See also #230.