icatproject / icat.oaipmh

OAI-PMH implementation for ICAT
Apache License 2.0

Update DataCite metadata to expose PaNET terms according to ETN-1 #25

Open paulmillar opened 1 year ago

paulmillar commented 1 year ago

Through PaNET, we now have a common approach for identifying experimental techniques. Therefore, it is now possible to identify through which technique data (in some dataset) was obtained.

The document Working with PaNET terms in ICAT describes how to include PaNET terms within ICAT datasets. It is a "how to", helping research institutes that have deployed ICAT to adopt PaNET.

The document Embedding PaNET in DataCite metadata describes how to embed PaNET terms within a DataCite record describing a dataset. The DataCite metadata record is an XML infoset that is (for example) available through OAI-PMH, where it is consumed by various metadata harvesting services (such as B2FIND and OpenAIRE).
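
To make the idea concrete, here is a minimal sketch of what embedding a PaNET term as a DataCite `subject` element could look like. The attributes `subjectScheme`, `schemeURI` and `valueURI` are standard DataCite attributes, but the exact values that ETN-1 requires are not reproduced in this thread, so the scheme name, the base IRI and the term identifier below are assumptions for illustration only. The helper names `panet_subject` and `subjects_block` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace of the DataCite kernel-4 metadata schema.
DATACITE_NS = "http://datacite.org/schema/kernel-4"

# Assumption: base IRI under which PaNET terms are published.
# Check ETN-1 for the value it actually mandates.
PANET_SCHEME_URI = "http://purl.org/pan-science/PaNET/"


def panet_subject(term_id, label):
    """Build one DataCite <subject> element for a single PaNET term."""
    subject = ET.Element(
        f"{{{DATACITE_NS}}}subject",
        {
            "subjectScheme": "PaNET",  # assumed scheme name
            "schemeURI": PANET_SCHEME_URI,
            "valueURI": PANET_SCHEME_URI + term_id,
        },
    )
    subject.text = label
    return subject


def subjects_block(techniques):
    """Wrap (term_id, label) pairs in a DataCite <subjects> element."""
    subjects = ET.Element(f"{{{DATACITE_NS}}}subjects")
    for term_id, label in techniques:
        subjects.append(panet_subject(term_id, label))
    return subjects


# Serialise with DataCite as the default namespace (no prefix).
ET.register_namespace("", DATACITE_NS)

# "PaNET00000" and its label are placeholders, not real PaNET terms.
example = subjects_block([("PaNET00000", "example technique")])
print(ET.tostring(example, encoding="unicode"))
```

In an actual deployment this mapping would live in the site's XSLT stylesheet rather than in Python; the sketch only shows the shape of the output a harvester would see.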

Currently, the ICAT OAI-PMH interface supports the client requesting a DataCite record about a dataset, but does not appear to adhere to ETN-1.

When looking at src/main/config/oai_datacite_transformer.xsl.example, it appears that there are the following problems with PaNET subject elements currently:

The ICAT OAI-PMH interface should be updated so that it can provide information about which experimental technique(s) were used to generate the dataset, as described by ETN-1 and the "Working with PaNET terms in ICAT" documents. Please feel free to comment in those documents if anything is unclear.

EmilJunker commented 1 year ago

The src/main/config/oai_datacite_transformer.xsl.example file is only meant to be an example that illustrates the way the XML transformations work. Sites that want to deploy ICAT-OAIPMH are expected to create custom XSLT stylesheets tailored to their needs. The content of these files will depend on the exact metadata format that should be supported (e.g. DataCite with embedded PaNET), and also on the way the information is stored in the ICAT schema (each facility uses and interprets the ICAT schema differently).

RKrahl commented 1 year ago

Emil's answer is of course correct. Nevertheless, we should update the example XSLT files to accommodate the schema changes from icat.server 5.0. In particular, adopting PaNET through the new Technique class is straightforward and by no means site-specific. So it should be in the provided examples.

Thanks @paulmillar for bringing this up. I already had this on my undocumented internal TODO list and indeed, I plan to do this. I just can't make promises on when I'll get around to doing it.

paulmillar commented 1 year ago

@RKrahl, I agree. PaNET is not site-specific; that's really the whole point of standardisation.

From scanning through the existing XSLT, I would imagine the changes needed are pretty trivial. I have some experience with XSLT, so may be able to help.

However, I do not have access to an ICAT server so would need some sample query XML data (from ICAT) and help testing any proposed changes.

(I, too, do not have a lot of spare time.)

RKrahl commented 1 year ago

In the meantime, I figured out that it is not as straightforward as I thought. I described the technical difficulties, along with a vague idea of how to overcome them, in a separate issue, see #27.