IACR / latex

LaTeX classes for IACR publications. We will start with the new journal proposal.

extended schema for XMP for PDF #118

Open kmccurley opened 1 year ago

kmccurley commented 1 year ago

The standard XMP schemas are pretty minimal and omit things like ORCID IDs, affiliations, funding agencies, and citations. It would be desirable to extend these schemas so that we can embed more metadata in the PDF, but that requires us to define a schema for the extension. According to this document, extension schemas are required to be included inline in the PDF. The schema could become as large as the metadata itself, but standards are standards I guess. Notably, Springer Nature appears to violate this requirement by declaring extended namespaces as follows:

xmlns:sn="http://springernature.com/ns/xmpExtensions/2.0/"
xmlns:author="http://springernature.com/ns/xmpExtensions/2.0/authorinfo/"

I am unable to locate any schema for these, and the URLs don't resolve to anything (they are not required to, though pointing them at an XSD is encouraged). How they use them in their PDFs is pretty clear:

<sn:authorInfo>
    <rdf:Bag>
        <rdf:li rdf:parseType="Resource">
            <author:name>Ngoc Khanh Nguyen</author:name>
            <author:orcid>http://orcid.org/0000-0001-8240-6167</author:orcid>
        </rdf:li>
    </rdf:Bag>
</sn:authorInfo>
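
To illustrate how a consumer could read this structure, here is a minimal Python sketch using only the standard library. The `rdf:Description` wrapper, the `FRAGMENT` constant, and the `extract_authors` helper are assumptions added so the snippet parses on its own; they are not taken from a Springer Nature PDF or from any tool.

```python
import xml.etree.ElementTree as ET

# rdf is the standard RDF namespace; sn and author are the URIs quoted above.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "sn": "http://springernature.com/ns/xmpExtensions/2.0/",
    "author": "http://springernature.com/ns/xmpExtensions/2.0/authorinfo/",
}

# The fragment from the issue, wrapped in an rdf:Description that declares
# the prefixes (the wrapper is an assumption so the text is well-formed XML).
FRAGMENT = f"""<rdf:Description xmlns:rdf="{NS['rdf']}"
    xmlns:sn="{NS['sn']}"
    xmlns:author="{NS['author']}">
  <sn:authorInfo>
    <rdf:Bag>
      <rdf:li rdf:parseType="Resource">
        <author:name>Ngoc Khanh Nguyen</author:name>
        <author:orcid>http://orcid.org/0000-0001-8240-6167</author:orcid>
      </rdf:li>
    </rdf:Bag>
  </sn:authorInfo>
</rdf:Description>"""


def extract_authors(xml_text):
    """Return one dict per rdf:li entry inside sn:authorInfo."""
    root = ET.fromstring(xml_text)
    authors = []
    for item in root.iterfind(".//sn:authorInfo/rdf:Bag/rdf:li", NS):
        authors.append({
            "name": item.findtext("author:name", namespaces=NS),
            "orcid": item.findtext("author:orcid", namespaces=NS),
        })
    return authors


authors = extract_authors(FRAGMENT)
```

Each entry in `authors` carries the name and ORCID URL exactly as stored in the XMP.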

I propose that we define our own schema starting from a subset of JATS to promote interoperability, create an XSD for it, and store it at http://iacr.org/ns/xmpExtensions/1.0/. Alternatively, we could just use a JATS schema itself and embed something like jats:article-meta, referencing their schema. That would at least cover ORCID IDs, affiliations, and funding, since contrib-group may contain aff and orcid, and funding-group may contain funding sources. I prefer doing this rather than reinventing our own.
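
As a rough idea of what the JATS-based option could carry, the following Python sketch builds an `article-meta` fragment with a `contrib-group` and a `funding-group`. The element and attribute names follow JATS as I understand it, but the output has not been validated against a JATS DTD or XSD. The `build_article_meta` helper and every value in the sample data (names, ORCID, funder, award number) are made-up placeholders.

```python
import xml.etree.ElementTree as ET


def build_article_meta(authors, funders):
    """Build a JATS-style article-meta element (unvalidated sketch)."""
    meta = ET.Element("article-meta")

    group = ET.SubElement(meta, "contrib-group")
    for author in authors:
        contrib = ET.SubElement(group, "contrib", {"contrib-type": "author"})
        # The ORCID goes in contrib-id with contrib-id-type="orcid".
        orcid = ET.SubElement(contrib, "contrib-id", {"contrib-id-type": "orcid"})
        orcid.text = author["orcid"]
        name = ET.SubElement(contrib, "name")
        ET.SubElement(name, "surname").text = author["surname"]
        ET.SubElement(name, "given-names").text = author["given"]
        ET.SubElement(contrib, "aff").text = author["affiliation"]

    funding = ET.SubElement(meta, "funding-group")
    for funder in funders:
        award = ET.SubElement(funding, "award-group")
        ET.SubElement(award, "funding-source").text = funder["source"]
        ET.SubElement(award, "award-id").text = funder["award"]

    return meta


# Placeholder data, not real people or grants.
meta = build_article_meta(
    authors=[{
        "orcid": "https://orcid.org/0000-0000-0000-0000",
        "surname": "Example",
        "given": "Alice",
        "affiliation": "Example University",
    }],
    funders=[{"source": "Example Funding Agency", "award": "EX-0000"}],
)
xml_text = ET.tostring(meta, encoding="unicode")
```

The elements here are left without a namespace; which namespace URI and prefix to use when embedding them in XMP is one of the open questions in this issue.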

It remains to be seen how we would include this in the XMP itself, since it's not clear if we have much control over hyperxmp or xmpincl.

kmccurley commented 1 year ago

It appears that bibliographic citations are also relatively easy, and we could use <ref-list> which is normally part of <back> in JATS. We can use <element-citation> which contains <pub-id> that can have the DOI of a citation and other structured elements like authors, title, journal, etc.
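
A sketch of what one such citation could look like, again in Python with the standard library only. The `build_ref_list` helper and the sample reference (author, title, journal, year, DOI) are placeholders for illustration, and the result is not validated against JATS.

```python
import xml.etree.ElementTree as ET


def build_ref_list(references):
    """Build a JATS-style ref-list of element-citation entries (unvalidated sketch)."""
    ref_list = ET.Element("ref-list")
    for index, entry in enumerate(references, start=1):
        ref = ET.SubElement(ref_list, "ref", {"id": f"ref{index}"})
        citation = ET.SubElement(
            ref, "element-citation", {"publication-type": "journal"}
        )
        people = ET.SubElement(
            citation, "person-group", {"person-group-type": "author"}
        )
        for surname, given in entry["authors"]:
            name = ET.SubElement(people, "name")
            ET.SubElement(name, "surname").text = surname
            ET.SubElement(name, "given-names").text = given
        ET.SubElement(citation, "article-title").text = entry["title"]
        ET.SubElement(citation, "source").text = entry["journal"]
        ET.SubElement(citation, "year").text = entry["year"]
        # The DOI is carried by pub-id with pub-id-type="doi".
        doi = ET.SubElement(citation, "pub-id", {"pub-id-type": "doi"})
        doi.text = entry["doi"]
    return ref_list


# Placeholder reference, not a real publication.
ref_list = build_ref_list([{
    "authors": [("Example", "Alice")],
    "title": "A placeholder title",
    "journal": "Journal of Examples",
    "year": "2000",
    "doi": "10.0000/example",
}])
ref_xml = ET.tostring(ref_list, encoding="unicode")
```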

kmccurley commented 1 year ago

I retract what I said about Springer Nature's schema: they do include extension schemas inline in their XMP, under <pdfaExtension:schemas>.
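
For reference, here is a rough Python sketch of how an inline extension schema description under `pdfaExtension:schemas` could be generated for the namespace proposed earlier. The three `aiim.org` namespace URIs and the `schema` / `namespaceURI` / `prefix` / `property` layout are what I recall from the PDF/A extension schema convention and should be checked against the spec. The `extension_schema` helper, the `iacr` prefix, and the `orcid` property are hypothetical.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
EXT = "http://www.aiim.org/pdfa/ns/extension/"
SCHEMA = "http://www.aiim.org/pdfa/ns/schema#"
PROP = "http://www.aiim.org/pdfa/ns/property#"

# Register prefixes so the serialized XML uses readable names.
for _prefix, _uri in [("rdf", RDF), ("pdfaExtension", EXT),
                      ("pdfaSchema", SCHEMA), ("pdfaProperty", PROP)]:
    ET.register_namespace(_prefix, _uri)


def q(namespace, tag):
    """Return the {namespace}tag form that ElementTree expects."""
    return f"{{{namespace}}}{tag}"


def extension_schema(title, namespace_uri, prefix, properties):
    """Describe one extension schema; properties is a list of
    (name, value_type, description) tuples."""
    schemas = ET.Element(q(EXT, "schemas"))
    bag = ET.SubElement(schemas, q(RDF, "Bag"))
    entry = ET.SubElement(bag, q(RDF, "li"), {q(RDF, "parseType"): "Resource"})
    ET.SubElement(entry, q(SCHEMA, "schema")).text = title
    ET.SubElement(entry, q(SCHEMA, "namespaceURI")).text = namespace_uri
    ET.SubElement(entry, q(SCHEMA, "prefix")).text = prefix
    prop_holder = ET.SubElement(entry, q(SCHEMA, "property"))
    seq = ET.SubElement(prop_holder, q(RDF, "Seq"))
    for name, value_type, description in properties:
        item = ET.SubElement(seq, q(RDF, "li"), {q(RDF, "parseType"): "Resource"})
        ET.SubElement(item, q(PROP, "name")).text = name
        ET.SubElement(item, q(PROP, "valueType")).text = value_type
        ET.SubElement(item, q(PROP, "category")).text = "external"
        ET.SubElement(item, q(PROP, "description")).text = description
    return schemas


# Hypothetical schema: prefix "iacr" and a single "orcid" text property.
schemas = extension_schema(
    title="IACR extension schema (hypothetical)",
    namespace_uri="http://iacr.org/ns/xmpExtensions/1.0/",
    prefix="iacr",
    properties=[("orcid", "Text", "ORCID iD of an author")],
)
schema_xml = ET.tostring(schemas, encoding="unicode")
```

Structured values such as a bag of author records would additionally need value type definitions in the schema description, which this sketch leaves out.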