Closed by lnielsen 8 months ago
at TUW, we've had a quick shot at the export formats required for OpenAIRE, maybe this helps a bit:
https://gitlab.tuwien.ac.at/fairdata/invenio-config-tuw/-/blob/v2021.2.2/invenio_config_tuw/config.py#L223-239
https://gitlab.tuwien.ac.at/fairdata/invenio-config-tuw/-/blob/v2021.2.2/invenio_config_tuw/oai.py
i think it's not complete (e.g. the datacentre symbol is always set to the configured value, which might not be correct every time?), but probably OK for us for now.
ad datacentre symbol: this seems to be related to the client username, as per @slint's message on discord.
relevant links: information on datacite clients, and client lookup per prefix
here's some notes that we've taken so far:
According to the OpenAIRE guidelines, we need to support the oai_datacite metadataPrefix with our OAI-PMH endpoint.
This is one of the two DataCite prefixes (datacite and oai_datacite), according to the DataCite support page.
The "easier" variant seems to be datacite, which is described as being the same as the DataCite XML without alterations.
Example: https://zenodo.org/oai2d?verb=ListRecords&metadataPrefix=datacite&set=openaire_data
In contrast, oai_datacite seems to be enriched in the sense that it has additional requirements on the listed fields: https://guidelines.readthedocs.io/en/latest/data/application_profile.html#d-applicationprofile
Also, it seems to have a slightly different structure (metadata wrapped in payload.resource, plus extra fields).
Example: https://zenodo.org/oai2d?verb=ListRecords&metadataPrefix=oai_datacite&set=openaire_data
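To make the structural difference a bit more concrete, here's a minimal sketch of wrapping a plain DataCite resource element in an oai_datacite envelope, using only the Python standard library. The envelope element names (schemaVersion, datacentreSymbol, payload) are taken from what the Zenodo output appears to contain and should be checked against the schema. The function name wrap_oai_datacite, the schema version "4.3", the symbol "EXAMPLE.REPO" and the DOI are placeholders for illustration, not values from our setup.

```python
import xml.etree.ElementTree as ET

# Namespaces as they appear in DataCite's OAI output (worth double-checking).
OAI_DATACITE_NS = "http://schema.datacite.org/oai/oai-1.1/"
DATACITE_NS = "http://datacite.org/schema/kernel-4"


def wrap_oai_datacite(resource, datacentre_symbol, schema_version="4.3"):
    """Wrap a DataCite <resource> element in an oai_datacite envelope.

    Hypothetical helper: the schema_version default is an assumption.
    """
    root = ET.Element(f"{{{OAI_DATACITE_NS}}}oai_datacite")
    ET.SubElement(root, f"{{{OAI_DATACITE_NS}}}schemaVersion").text = schema_version
    ET.SubElement(root, f"{{{OAI_DATACITE_NS}}}datacentreSymbol").text = datacentre_symbol
    # The unmodified DataCite XML goes inside <payload>.
    payload = ET.SubElement(root, f"{{{OAI_DATACITE_NS}}}payload")
    payload.append(resource)
    return root


# Placeholder record with a made-up DOI.
resource = ET.Element(f"{{{DATACITE_NS}}}resource")
identifier = ET.SubElement(
    resource, f"{{{DATACITE_NS}}}identifier", identifierType="DOI"
)
identifier.text = "10.1234/example"

wrapped = wrap_oai_datacite(resource, "EXAMPLE.REPO")
print(ET.tostring(wrapped, encoding="unicode"))
```

The datacentre symbol is passed in per call rather than hard-coded, since (as noted above) a single configured value might not be correct for every record.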
Note: We can likely reuse the values listed by DataCite's OAI-PMH service (metadataNamespace, schema) for the configuration of our own OAI Server: https://oai.datacite.org/oai/?verb=ListMetadataFormats
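A rough sketch of how those values could be pulled out of a ListMetadataFormats response, so they don't have to be copied by hand. The element names (metadataFormat, metadataPrefix, schema, metadataNamespace) come from the OAI-PMH protocol. The function name parse_metadata_formats is made up, and SAMPLE is a hand-written, abridged response for illustration only, so the actual values should be verified against the live DataCite endpoint.

```python
import xml.etree.ElementTree as ET

OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}

# Hand-written, abridged sample (assumption: verify against the live response).
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListMetadataFormats>
    <metadataFormat>
      <metadataPrefix>oai_datacite</metadataPrefix>
      <schema>http://schema.datacite.org/oai/oai-1.1/oai.xsd</schema>
      <metadataNamespace>http://schema.datacite.org/oai/oai-1.1/</metadataNamespace>
    </metadataFormat>
  </ListMetadataFormats>
</OAI-PMH>"""


def parse_metadata_formats(xml_text):
    """Map each metadataPrefix to its schema and namespace URLs."""
    root = ET.fromstring(xml_text)
    formats = {}
    for fmt in root.iterfind(".//oai:metadataFormat", OAI_NS):
        prefix = fmt.findtext("oai:metadataPrefix", namespaces=OAI_NS)
        formats[prefix] = {
            "schema": fmt.findtext("oai:schema", namespaces=OAI_NS),
            "namespace": fmt.findtext("oai:metadataNamespace", namespaces=OAI_NS),
        }
    return formats


formats = parse_metadata_formats(SAMPLE)
print(formats["oai_datacite"])
```

The resulting dict uses "schema" and "namespace" as keys so that it can be merged into the per-format entries of the OAI server configuration, with the serializer added separately.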
Note: The OpenAIRE guidelines also mention that we should have a set with setName "OpenAIRE" and setSpec "openaire_data".
Since set support in Invenio-OAIServer is still WIP, this will have to wait.
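For reference, once sets are available, a harvester would presumably request the set in the same way as the Zenodo examples above. A small sketch of building such a request URL with the standard library (list_records_url is a hypothetical helper name; the base URL is the Zenodo endpoint from the examples, not ours):

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix, set_spec=None):
    """Build an OAI-PMH ListRecords URL, optionally restricted to a set."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"


url = list_records_url("https://zenodo.org/oai2d", "oai_datacite", "openaire_data")
print(url)
```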
@max-moser we're starting work on adding DataCite formats to OAI-PMH at https://github.com/inveniosoftware/invenio-rdm-records/issues/880.
The implementation you have in the TU Wien repo looks correct and would be the obvious starting point. Is it ok with you if we blatantly copy-paste the code (and of course keep the TU Wien copyright, authorship, etc.) and change accordingly to make things configurable?
why yes of course!
done