Melissa37 closed this 4 years ago
Note to self:
The following XQuery returns a list of funder names:
let $fundref := fetch:xml('https://gitlab.com/crossref/open_funder_registry/raw/master/registry.rdf')
for $x in $fundref//*:Concept
return $x//*:literalForm/data()
The list of funders is so massive that I don't think Schematron is the correct way for this to be checked. Validation would take so long that it would become unusable.
I'm going to explore using BaseX instead, and integrating this with the BaseX validation module that we are testing.
Related to this, but not addressing the specific problem: I suggest that we need an error test to identify a scenario in which there are two (or more) funding entries with the same funder, but only one has a fundref id.
Anecdotally, I've seen this in production myself a couple of times, so the test should be useful and go some (small) way to mitigating the problem here.
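As a rough illustration of the duplicate-funder scenario described above, here is a minimal XQuery sketch. The sample funding-group is made up for the example and only loosely follows the JATS funding markup (institution-wrap, institution, institution-id). The element names and the idea of comparing on the normalised funder name are assumptions, not the test that was eventually written.

```xquery
(: Hypothetical sample data: two entries for the same funder, only one with an id. :)
let $funding-group :=
  <funding-group>
    <award-group id="fund1">
      <funding-source>
        <institution-wrap>
          <institution-id institution-id-type="FundRef">http://dx.doi.org/10.13039/100000001</institution-id>
          <institution>National Science Foundation</institution>
        </institution-wrap>
      </funding-source>
    </award-group>
    <award-group id="fund2">
      <funding-source>
        <institution-wrap>
          <institution>National Science Foundation</institution>
        </institution-wrap>
      </funding-source>
    </award-group>
  </funding-group>

(: For each distinct funder name, collect every entry that uses that name. :)
for $name in distinct-values($funding-group//institution-wrap/institution/normalize-space())
let $wraps := $funding-group//institution-wrap[normalize-space(institution) = $name]
(: Report the name when it occurs more than once and the entries disagree on having an id. :)
where count($wraps) > 1
  and exists($wraps[institution-id])
  and exists($wraps[not(institution-id)])
return $name
```

With the sample data above this should return the one funder name that appears both with and without an id. In Schematron the same condition could be expressed as an assertion on each funding entry, but that wiring is left out here.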
Added the test. It checks for the presence of the preferred label of each funder in Fundref and fires if that funder's DOI is not in the funding-group.
Using regex in this context is problematic: so much memory is used that the whole Schematron becomes unusable. The function contains() is therefore used instead. This means the check is quick, but it carries limitations, which I have listed below.
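A minimal sketch of how a contains()-based check of this kind might look in XQuery. The registry fragment and the article are hypothetical, cut-down samples. The assumptions here are that preferred labels sit under a prefLabel element, that the funder DOI is carried in the about attribute of each Concept, and that the comparison is made against the string value of the funding-group. None of this is confirmed as the logic of the actual test.

```xquery
(: Hypothetical, cut-down registry entry with one preferred and one alternative label. :)
let $registry :=
  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
           xmlns:skos="http://www.w3.org/2004/02/skos/core#"
           xmlns:skosxl="http://www.w3.org/2008/05/skos-xl#">
    <skos:Concept rdf:about="http://dx.doi.org/10.13039/100000001">
      <skosxl:prefLabel>
        <skosxl:Label>
          <skosxl:literalForm>National Science Foundation</skosxl:literalForm>
        </skosxl:Label>
      </skosxl:prefLabel>
      <skosxl:altLabel>
        <skosxl:Label>
          <skosxl:literalForm>NSF</skosxl:literalForm>
        </skosxl:Label>
      </skosxl:altLabel>
    </skos:Concept>
  </rdf:RDF>

(: Hypothetical article: funder named in the acknowledgements, empty funding-group. :)
let $article :=
  <article>
    <front><funding-group/></front>
    <back><ack id="ack"><p>We thank the National Science Foundation.</p></ack></back>
  </article>

let $ack := string($article//ack)
let $funding := string($article//funding-group)

for $concept in $registry//*:Concept
(: Only the preferred label is used; alternative labels are deliberately ignored. :)
let $label := string(($concept/*:prefLabel//*:literalForm)[1])
let $doi := substring-after($concept/@*:about, 'dx.doi.org/')
(: Plain substring tests, no regex: label present in ack, DOI absent from funding. :)
where contains($ack, $label) and not(contains($funding, $doi))
return $label
```

Because contains() is a plain, case-sensitive substring test, this stays cheap even with a very long label list, which is the trade-off described above.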
I'm going to close this ticket for now. We can re-open if there's a need to refine this test.
Funders have numerous different names. These are defined in that file as preferred or alternative labels. This checks only for the presence of preferred labels, e.g. National Science Foundation, not NSF. We could add alternative labels in manually if needed, but allowing all variants would lead to this being flagged far too often to be useful (some alternative labels are 'as', 'why' etc.).
It's reliant on the implementation of our house style in the acknowledgements (removal of full stops and unnecessary spaces from funder names). For example, 'W. E. B. Du Bois Institute for African and African American Research, Harvard University' wouldn't flag, but 'WEB Du Bois Institute for African and African American Research, Harvard University' would.
Casing has to be the same as it is in Fundref, i.e. National Science Foundation instead of National science foundation. Again, this is because if casing were ignored, the rule would fire so often that its usefulness would deteriorate.
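The three limitations above can be seen in a small XQuery sketch. The label list and the acknowledgement strings are hypothetical, chosen only so that the behaviour matches the examples given above. They are not taken from the registry.

```xquery
(: Hypothetical label list, written in the form the check is assumed to compare against. :)
let $pref-labels := (
  'National Science Foundation',
  'WEB Du Bois Institute for African and African American Research, Harvard University'
)

(: Hypothetical acknowledgement strings. :)
let $acks := (
  'We thank the National Science Foundation for support.',
  'We thank the National science foundation and the NSF for support.',
  'Supported by the W. E. B. Du Bois Institute for African and African American Research, Harvard University.'
)

for $ack at $i in $acks
return <ack n="{$i}">{
  (: contains() is an exact, case-sensitive substring match. :)
  for $label in $pref-labels
  where contains($ack, $label)
  return <match>{$label}</match>
}</ack>
```

Only the first string should produce a match. The second fails on casing and on the use of an alternative label, and the third fails because the full stops and spaces in the funder name have not been removed.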
Background
Authors have traditionally added their funding to their acknowledgements, and it is a relatively new thing to have a separate funding section (which not all publishers have yet anyway). This means that sometimes our authors do not add their funding details to the section provided in EJP, or they fill in the funding section and retain funding information in the acknowledgements.
Describe what you would like tested
The Open Funder Registry is found on GitLab: https://github.com/Crossref/open-funder-registry. It is a downloadable file. It is updated on an ad hoc basis, so to ensure a local file is updated this repo will need to be followed.
elements affected
<ack id="ack">
Suggested schematron message
A funder in the open funder registry is mentioned in the acknowledgments but not listed in the funding section. Please check
Suggested role (warning or error)
warning (because it might be appropriate that they are not listed as a direct funder). eLife will have to provide examples in the Wiki of when not to add the funder to the funder list.
Stage
pre-edit
Example
None as yet