IACR / latex

Latex classes for IACR publications. We will start with the new journal proposal.

Extend affiliations macro? #44

Closed by jwbos 2 years ago

jwbos commented 2 years ago

What are the exact requirements on the affiliations for metadata? Currently we only provide a name and, optionally, a ROR. I seem to recall that city and country are mandatory: should we ask for these explicitly in the macro and force the user to supply them?

kmccurley commented 2 years ago

Let's remember that there are multiple indexing agents, and each requires its own format. Crossref and Google Scholar are open about what they require. Clarivate Web of Science and Scopus are not: you have to go through an approval process, and then they assign you an account in their developer portal. I've been able to find snippets of information about what they require, but nothing published openly. One site said that Scopus likes JATS. JATS is a (mostly) open standard, defined as an XML schema in various formats and versions. The publishing article tag set is the most used, and it is documented in several places, including NLM, where it started.

Note that if we have a ROR ID, then we can look up the address; see nxp data. A ROR ID is golden.
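A minimal sketch of recovering an address from a ROR record, to make the lookup idea concrete. It assumes the record follows the v2-style ROR layout, where each entry in `locations` carries a `geonames_details` object with `name`, `country_name` and `country_code`; that layout should be checked against the ROR documentation before relying on it. The function name `address_from_ror_record` and the sample record (including its placeholder ID) are hypothetical and hand-written for illustration, not taken from a real response.

```python
from typing import Optional


def address_from_ror_record(record: dict) -> Optional[dict]:
    """Return city and country from a ROR-style record, or None if absent.

    Assumption: locations[].geonames_details holds name, country_name
    and country_code, as in the v2-style ROR record layout.
    """
    for location in record.get("locations", []):
        details = location.get("geonames_details") or {}
        if details.get("name") and details.get("country_name"):
            return {
                "city": details["name"],
                "country": details["country_name"],
                "country_code": details.get("country_code"),
            }
    return None


# Hand-written sample for illustration only; the ID is a placeholder.
sample_record = {
    "id": "https://ror.org/00example0",
    "locations": [
        {
            "geonames_details": {
                "name": "Leuven",
                "country_name": "Belgium",
                "country_code": "BE",
            }
        }
    ],
}

print(address_from_ror_record(sample_record))
```

In practice the record would come from a ROR lookup by ID rather than from a literal dict; the parsing step is kept separate here so it can be checked without network access.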

The only real requirement from crossref is either name or ROR. They accept both. Other fields that crossref accepts are optional, but include:

I don't know where the requirement for country really comes from. Scopus asks for country of all editors and authors, presumably so they can evaluate the geographic diversity. This is only for their review, but I don't think it's relevant for reporting metadata.

Editorial Affiliation Details
Names and institutional affiliations – including country – of all members of the
editorial team are required (such as Editor-in-Chief, Editorial Board Members,
Associate Editors, Regional Editors etc.).
Author Affiliation Details
Names and institutional affiliations – including country – and addresses of all
contributing authors are required.

I have been unable to find a specific schema that Clarivate Web of Science uses. I believe they only tell you that if you pass their initial evaluation, but I found this, which says what you can get out of Web of Science:

Corresponding Address (Previously Reprint Address)
The address of the corresponding author. It appears after the corresponding author name. A corresponding address may include:

Corresponding author
Organization
Suborganization
Street
City
State or Province
Zip or Postal Code
If an address has a preferred organization name, then an Expand icon appears before the address.

Addresses
The addresses of all authors as supplied by the source journal.

The number before the address is associated with the author name that appears in the [Author(s)](https://images.webofknowledge.com/images/help/WOS/hs_author.html) field with the same (superscript) number.

Most records of articles published in 2008 or later link an address to an author name via a number in superscript next to the author name.

E-mail Addresses
The e-mail address of the author(s).

kmccurley commented 2 years ago

This is becoming kind of a catch-all for how we need to plan metadata availability for indexing services. I intend to write most of this into the paper now. It's hard to figure out exactly what schema Clarivate and Scopus expect. Crossref and Google Scholar have open documents that say what they expect, but Clarivate only engages with a journal once it has been reviewed and accepted. I can only find hints of what Clarivate and Scopus expect. I have found several sources that recommend using JATS for content delivery to Elsevier. Elsevier prefers delivery of full content via JATS, but it also crawls HTML content on the web.

In my opinion the JATS XML format is by far the best thought out format, and that is what most of the publishing world is moving to. The JATS format has three main sections:

  1. front matter or head, with metadata about the article. It will be fairly easy to produce this from our meta format. Note that JATS allows multiple affiliations, and they are referenced in much the same way that we do with references like {1,2}. It allows more fields, like department and postal address lines; we could obviously add these to iacrcc.cls. I think the evaluation process of a journal involves assessing how international it is, so we may want to include address information for affiliations. If we required a ROR, we could recover the address via APIs. Perhaps we could write the cls to require certain fields, like city and country, if the ROR is not supplied.
  2. body of the document. This includes abstract and sections, and this is where it gets difficult because LaTeX has too much variability in what it can produce for a document, so the translation from LaTeX to JATS is potentially lossy.
  3. back matter, consisting of acknowledgements, bibliography, and maybe appendices. We are already producing the bibliography in JATS format in the xmp file from meta.py. We may want to add acknowledgements to the metadata of iacrcc.cls.
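A rough sketch of the front-matter idea from item 1: authors point at numbered affiliations, and city and country are only demanded when no ROR is given. This is not how meta.py works; the dict layout, the function names `missing_fields` and `build_contrib_group`, and the sample people and institutions are all hypothetical. The element names (`contrib-group`, `contrib`, `xref`, `aff`, `institution-wrap`, `institution-id`, `institution`, `city`, `country`) are from the JATS tag set, but the output has not been validated against a JATS schema.

```python
import xml.etree.ElementTree as ET


def missing_fields(aff: dict) -> list:
    """Fields still needed: nothing if a ROR is given, else city and country."""
    if aff.get("ror"):
        return []
    return [key for key in ("city", "country") if not aff.get(key)]


def build_contrib_group(authors: list, affiliations: list) -> ET.Element:
    """Build a JATS-style <contrib-group> with numbered <aff> targets."""
    group = ET.Element("contrib-group")
    for author in authors:
        contrib = ET.SubElement(group, "contrib", {"contrib-type": "author"})
        name = ET.SubElement(contrib, "name")
        ET.SubElement(name, "surname").text = author["surname"]
        ET.SubElement(name, "given-names").text = author["given"]
        # Affiliation numbers are 1-based, like the {1,2} style references.
        for index in author["affiliations"]:
            ET.SubElement(contrib, "xref", {"ref-type": "aff", "rid": f"aff{index}"})
    for index, aff in enumerate(affiliations, start=1):
        problems = missing_fields(aff)
        if problems:
            raise ValueError(f"affiliation {index} lacks {problems} and has no ROR")
        node = ET.SubElement(group, "aff", {"id": f"aff{index}"})
        wrap = ET.SubElement(node, "institution-wrap")
        if aff.get("ror"):
            ror = ET.SubElement(wrap, "institution-id", {"institution-id-type": "ror"})
            ror.text = aff["ror"]
        ET.SubElement(wrap, "institution").text = aff["name"]
        if aff.get("city"):
            ET.SubElement(node, "city").text = aff["city"]
        if aff.get("country"):
            ET.SubElement(node, "country").text = aff["country"]
    return group


# Hypothetical sample data; the ROR value is a placeholder, not a real ID.
authors = [
    {"surname": "Example", "given": "Ada", "affiliations": [1, 2]},
    {"surname": "Sample", "given": "Bob", "affiliations": [2]},
]
affiliations = [
    {"name": "Example University", "city": "Leuven", "country": "Belgium"},
    {"name": "Example Labs", "ror": "https://ror.org/00example0"},
]

group = build_contrib_group(authors, affiliations)
print(ET.tostring(group, encoding="unicode"))
```

Raising on a missing city or country is only one option; a class file could just as well emit a warning and leave the decision to the editor.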

It's possible to specify floats in a fourth section, or they can be included inline in the body.

Unfortunately the LaTeX community is lagging badly in tools to convert LaTeX to JATS. The best one I have found is pandoc, though there are others that try (LaTeXML and TeX4ht). The overall problem with conversion from LaTeX to JATS is that LaTeX is a full programming language, whereas the intermediate format of pandoc is more like markdown. pandoc handles formulas and displayed equations quite well because it just embeds TeX, but it may have trouble expanding custom macros used in math segments. It doesn't do well with complete tables or figures. I've tried converting several papers to JATS with pandoc, and the results are fairly good but not perfect. This may not matter, because I think JATS is only used for indexing and not for display.
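For reference, a small sketch of how such a pandoc conversion could be driven from Python. The helper names `pandoc_jats_command` and `convert` and the file names are hypothetical; the writer names `jats_archiving`, `jats_publishing` and `jats_articleauthoring` are pandoc output formats in recent releases, though an older install may only offer `jats`. The thread does not say which options were used in the experiments described above, so this is only one plausible invocation.

```python
import shutil
import subprocess

JATS_FLAVORS = {"jats_archiving", "jats_publishing", "jats_articleauthoring"}


def pandoc_jats_command(source: str, target: str, flavor: str = "jats_publishing") -> list:
    """Build the pandoc argument list for a LaTeX to JATS conversion."""
    if flavor not in JATS_FLAVORS:
        raise ValueError(f"unknown JATS flavor: {flavor}")
    return [
        "pandoc",
        "--from", "latex",
        "--to", flavor,
        "--standalone",  # emit a complete article, not just a body fragment
        "--output", target,
        source,
    ]


def convert(source: str, target: str) -> None:
    """Run the conversion; fails loudly if pandoc is not installed."""
    if shutil.which("pandoc") is None:
        raise RuntimeError("pandoc was not found on PATH")
    subprocess.run(pandoc_jats_command(source, target), check=True)


# Only prints the command; call convert("paper.tex", "paper.xml") to run it.
print(" ".join(pandoc_jats_command("paper.tex", "paper.xml")))
```

Custom macros defined in a separate class or style file may still come through unexpanded, so the output is worth inspecting by hand.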

jwbos commented 2 years ago

I think this is indeed a very difficult problem if we want to solve it all at once. As a start, I will try to extend the affiliations macro with some additional fields, all of them optional.

jwbos commented 2 years ago

Fixed.