Open jfy133 opened 4 years ago
Thanks for bringing this up @jfy133. This used to be an issue with all bioRxiv citations, but we thought it was resolved in https://github.com/manubot/manubot/issues/16. It may be influenced by the preprint date, with older preprints correctly having bioRxiv as the container-title. It does not seem to depend on whether the bioRxiv DOI is new-style or old-style, because this new-style DOI from 2019 works:
$ manubot cite --render doi:10.1101/2019.12.20.884551
1. MGMM: an R package for fitting Gaussian Mixture Models on Incomplete Genomics Data
Zachary R. McCaw, Hanna Julienne, Hugues Aschard
bioRxiv (2019-12-23) https://doi.org/ghf6tr
DOI: 10.1101/2019.12.20.884551
Until we resolve this, one workaround is to cite the affected preprints by URL instead of DOI:
$ manubot cite --render https://doi.org/10.1101/2020.10.01.322206
1. DamageProfiler: Fast damage pattern calculation for ancient DNA
Judith Neukamm, Alexander Peltzer, Kay Nieselt
bioRxiv (2020-10-01) https://www.biorxiv.org/content/10.1101/2020.10.01.322206v1
DOI: 10.1101/2020.10.01.322206
Or you can use manual references to correct this if only a small number of citations are affected.
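A minimal sketch of what such a manual reference could look like, using the DamageProfiler preprint from above. It builds a CSL JSON item with container-title set explicitly. The helper name `biorxiv_manual_reference` is hypothetical, and the choices of `"article"` as the CSL type and `doi:...` as the `id` are assumptions; check the Manubot usage docs for where the file goes (typically `content/manual-references.json`) and how ids are matched to citations.

```python
import json


def biorxiv_manual_reference(doi, title, authors, date_parts):
    """Build a CSL JSON item that sets container-title to bioRxiv explicitly.

    Hypothetical helper for illustration; not part of Manubot.
    """
    return {
        "id": f"doi:{doi}",  # assumption: id mirrors the citation key
        "type": "article",  # assumption: CSL type used for preprints
        "title": title,
        "author": [{"given": given, "family": family} for given, family in authors],
        "container-title": "bioRxiv",  # the field missing from the Crossref-derived metadata
        "publisher": "Cold Spring Harbor Laboratory",
        "issued": {"date-parts": [date_parts]},
        "DOI": doi,
        "URL": f"https://doi.org/{doi}",
    }


item = biorxiv_manual_reference(
    doi="10.1101/2020.10.01.322206",
    title="DamageProfiler: Fast damage pattern calculation for ancient DNA",
    authors=[("Judith", "Neukamm"), ("Alexander", "Peltzer"), ("Kay", "Nieselt")],
    date_parts=[2020, 10, 1],
)

# Manual reference files hold a JSON list of CSL items.
manual_references_json = json.dumps([item], indent=2)
print(manual_references_json)
```

The printed JSON can be pasted into the manual references file; only the handful of affected preprints need an entry.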
Thanks for the tip. The preprint I refer to was released 2 weeks ago, so I wonder if there has been another change. But I will use the workaround you suggest if needed!
Okay, followed up with bioRxiv via tweet. This issue has never been fixed at the source, in that bioRxiv doesn't set container-title when depositing Crossref metadata. Therefore, our citation style falls back to showing the publisher, which is "Cold Spring Harbor Laboratory".
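The fallback can be illustrated with a simplified sketch. This is not the actual CSL style logic, only an approximation of the rule "show container-title if present, otherwise the publisher"; the function name `displayed_source` is hypothetical.

```python
def displayed_source(csl_item):
    """Return the venue text a fallback rule would show for a CSL JSON item.

    Simplified illustration: prefer container-title, then publisher.
    """
    return csl_item.get("container-title") or csl_item.get("publisher") or ""


with_container = {
    "container-title": "bioRxiv",
    "publisher": "Cold Spring Harbor Laboratory",
}
without_container = {"publisher": "Cold Spring Harbor Laboratory"}

print(displayed_source(with_container))     # bioRxiv
print(displayed_source(without_container))  # Cold Spring Harbor Laboratory
```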
@cgreene heard back from one of the bioRxiv developers. The relevant parts of their response:
Concerning the use of the container-title field in the metadata, our guidance from Crossref is that this field is not appropriate for the preprint server name. Note, however, that the server name is captured in the institution field, for instance: <institution> <institution_name>bioRxiv</institution_name> </institution>
There isn't a CSL JSON variable for institution, so we can't access that field.
Also, the temporary DataCite fix discussed in https://github.com/manubot/manubot/issues/16#issuecomment-643271145 is no longer active.
If bioRxiv is following guidance from Crossref, following up with Crossref may be the next step.
From the Crossref schema docs
institution: Wrapper element for information about an organization that sponsored or hosted an item but is not the publisher of the item. The institution element should be used to deposit metadata about an organization that sponsored or hosted the research or development of the published material but was not actually the publisher of the information. The institution is distinctly different from the publisher because it may not be a publishing organization. It is typically an organization such as a university, corporation, government agency, NGO or consortia. If the content was published by an organization other than the sponsor, the use of both the publisher and institution elements is encouraged because authors may cite either one in a reference, and the availability of both may allow for more precise matching in queries.
I don't see a container_title field in the Crossref schema, so I think this field only gets created upon CSL JSON conversion.
Preprints are considered posted content by Crossref. The schema page has this image for posted_content:
[Image: Crossref schema diagram for posted_content]
Expanding the group_title description:
group_title: Prepublication content items may be organized into groupings within a given publisher. This element provides for naming the group. It is expected that publishers will have a small number of groups each of which reflect a topic or subject area.
So I think we have the following upstream questions:
To Crossref: does the posted_content data model make it difficult to set the CSL JSON container-title field for preprints? Should container-title always be the institution in the case of posted_content?
To DataCite (whose Crossref to CSL JSON conversion we use): can the conversion be fixed, without any changes by Crossref, to set institution as container-title for posted_content?
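To make the second question concrete, here is a rough sketch of the mapping being asked for: copy institution_name into container-title when the record is posted_content and no container-title exists. It is not DataCite's actual conversion code. The function name `add_container_title` is hypothetical, and the XML is a simplified, namespace-free fragment based on the snippet quoted above; real Crossref deposits carry namespaces and many more elements.

```python
import xml.etree.ElementTree as ET


def add_container_title(crossref_xml, csl_item):
    """Return a copy of csl_item with container-title filled from institution_name.

    Only applies to posted_content records that lack a container-title.
    """
    root = ET.fromstring(crossref_xml)
    patched = dict(csl_item)
    if root.tag == "posted_content" and not patched.get("container-title"):
        name = root.findtext("institution/institution_name")
        if name:
            patched["container-title"] = name.strip()
    return patched


# Simplified stand-in for a Crossref posted_content record.
xml_fragment = """
<posted_content>
  <institution>
    <institution_name>bioRxiv</institution_name>
  </institution>
</posted_content>
"""

csl = {"publisher": "Cold Spring Harbor Laboratory", "DOI": "10.1101/2020.10.01.322206"}
patched_csl = add_container_title(xml_fragment, csl)
print(patched_csl["container-title"])  # bioRxiv
```

An existing container-title is left untouched, so records that already report bioRxiv correctly would not change.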
We have some updates on the upstream queries to Crossref.
Via Twitter:
Yes, the schema never distinguished preprint server "title" from "institution" but it could do - we will add this to the list. We'll be reviving the preprints working group soon to reassess the entire workflow - I will let you know when. (-GH)
In response to my support ticket:
Thanks again for this feedback - we are definitely planning to revisit the metadata we collect and distribute for preprints soon, including how we manage the preprint server name, later this year, but are also looking into ways we can address the preprint server name in our outputs in the interim. I'll keep you posted on our progress.
e.g. reference 63 of https://apeltzer.github.io/eager2-paper/
This happens for both the reference and in-text citation tooltip