Open rgieseke opened 3 years ago
I think the ideal way to handle a `<mixed-citation>` would be to try to "decode" it (i.e. parse it) into a `CreativeWork`. If we do that, then in-text `Cite` nodes will work as expected (i.e. show authors and years if needed).
With the fix that you made, the entirety of the `<mixed-citation>` ends up in the `title`:
    type: Article
    id: pone-0091296-Choat1
    authors: []
    title: >-
      Choat JH (2012) Spawning aggregations in reef fishes; ecological and
      evolutionary processes. In: Sadovy de Mitcheson Y, Colin PL, editors. Reef
      Fish Spawning Aggregations: Biology, Research and Management. Heidelberg:
      Springer. pp. 85–116.
whereas what we want is the bibliographic info to be parsed out of the `<mixed-citation>` into:
    type: CreativeWork
    authors:
      - type: Person
        familyNames:
          - Choat
        givenNames:
          - John Howard
    datePublished:
      type: Date
      value: '2011-09-20'
    identifiers:
      - type: PropertyValue
        name: doi
        propertyID: https://registry.identifiers.org/registry/doi
        value: 10.1007/978-94-007-1980-4_4
    isPartOf:
      type: Periodical
      name: 'Reef Fish Spawning Aggregations: Biology, Research and Management'
    publisher:
      type: Organization
      name: Springer Netherlands
    title: Spawning Aggregations in Reef Fishes; Ecological and Evolutionary Processes
    url: http://dx.doi.org/10.1007/978-94-007-1980-4_4
In Encoda, rather than trying to parse references into a `CreativeWork`, we take the approach suggested here and query CrossRef for bibliographic info. I didn't write the above YAML out by hand but rather used the `crossref` codec:
    ./encoda convert "Choat JH (2012) Spawning aggregations in reef fishes; evolutionary processes." --from crossref - --to yaml
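
The internals of the `crossref` codec are not shown in this thread. As a rough illustration of the kind of lookup involved, here is a minimal sketch that builds a free-text query against CrossRef's public REST API using its `query.bibliographic` and `rows` parameters. The helper name `crossrefQueryUrl` is hypothetical, and this is not necessarily how Encoda constructs its request.

```typescript
// Base endpoint of the public CrossRef REST API for works.
const CROSSREF_WORKS = "https://api.crossref.org/works";

/**
 * Hypothetical helper (not part of Encoda): build a CrossRef query URL
 * for a free-text citation string such as the content of a <mixed-citation>.
 *
 * @param citation Raw citation text
 * @param rows     Maximum number of candidate matches to request
 */
function crossrefQueryUrl(citation: string, rows: number = 1): string {
  // `query.bibliographic` matches against titles, authors, years etc.
  const query = encodeURIComponent(citation.trim());
  return `${CROSSREF_WORKS}?query.bibliographic=${query}&rows=${rows}`;
}
```

The returned URL could then be fetched with any HTTP client; the best match, if there is one, would be the first item in the response.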
I suggest that we use this approach for JATS `<mixed-citation>` (as we do in the `reshape` function). However, I think it would be wise to put the raw citation text in `name` or `alternateNames` or similar (I think `description` should be avoided because that is where the abstract goes, and in some cases we actually have that; and `comment` has a different semantic structure) and then do the CrossRef querying as a separate enrichment step that won't cause a failure if, for instance, there is no network connection.
:tada: This issue has been resolved in version 0.111.0 :tada:
The release is available on:
Your semantic-release bot :package::rocket:
> I suggest that we use this approach for JATS `<mixed-citation>` (as we do in the `reshape` function). However, I think it would be wise to perhaps put it in `name` or `alternateNames` or similar (I think `description` should be avoided because that is where the abstract goes and in some cases we actually have that; and `comment` has a different semantic structure) and then do the CrossRef querying as a separate enrichment step that won't cause a failure, if for instance there is no network connection.
Yes, I was mistakenly thinking that `description` belonged to the citation and not to the entire `CreativeWork`.
The CrossRef querying approach sounds great. How could that work?
Should it be an extra conversion? JATS to CrossRef-enhanced JATS? Or should it be tried in the JATS codec?
> Should it be an extra conversion? JATS to CrossRef enhanced JATS?
Yes, that is what I was advocating for above. It shouldn't be part of the `decode` method of the `JatsCodec` but rather part of a generic function which can be applied to the `references` of any `Article`, no matter which format it originated from. That is exactly what currently happens here in the `reshape` function, but it is currently "converting" paragraphs into `CreativeWork`s using CrossRef.
I think this code should be factored out into a separate `enrich` function and applied to `Paragraph`s in the references section, but also to `string` items in the `references` property of any `CreativeWork` (I had forgotten that `string` is a valid item in `references`).
In summary, what needs to happen if we take this direction is:
- In the `jats` codec, return a `string` for `<mixed-citation>` instead of setting the title: https://github.com/stencila/encoda/blob/8dc58e335a345219a389df30114eaab397df19ce/src/codecs/jats/index.ts#L1165-L1167
- In the `reshape` function, turn `Paragraph`s that are in the "References" or "Bibliography" section into plain `string` items in `Article.references`
- Move the CrossRef querying code out of the `reshape` function and put it into a new `enrich` function, where it is applied to any item in `Article.references` that is a `string`
- Run `enrich` after `decode` by default, but allow the user to disable it (like we do for `coerce` and `reshape`): https://github.com/stencila/encoda/blob/39813e0e660743964dee17cd19fff55069113c7a/src/codecs/types.ts#L271-L273
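
To make the proposed `enrich` step more concrete, here is a minimal sketch under several assumptions. The `CreativeWork` interface is a simplified stand-in for the real schema type, and the `Lookup` signature, the `shouldEnrich` option name, and the function body are all hypothetical rather than existing Encoda code. The point it illustrates is that only `string` references are touched, and that a failed lookup (e.g. no network connection) leaves the original string in place instead of failing the conversion.

```typescript
// Simplified stand-in for the schema type; the real one has many more fields.
interface CreativeWork {
  type: "CreativeWork" | "Article";
  title?: string;
  references?: Array<string | CreativeWork>;
}

// Hypothetical lookup: resolves a citation string to a work (e.g. via
// CrossRef), returns undefined if nothing matches, or rejects on errors.
type Lookup = (citation: string) => Promise<CreativeWork | undefined>;

interface EnrichOptions {
  // Hypothetical flag, analogous to being able to disable coerce / reshape.
  shouldEnrich?: boolean;
}

/**
 * Replace string items in `work.references` with the result of `lookup`.
 * Items that are already works are left alone. Any lookup failure falls
 * back to the original string so enrichment never breaks decoding.
 */
async function enrich(
  work: CreativeWork,
  lookup: Lookup,
  options: EnrichOptions = {}
): Promise<CreativeWork> {
  const { shouldEnrich = true } = options;
  if (!shouldEnrich || work.references === undefined) return work;

  const references = await Promise.all(
    work.references.map(async (ref) => {
      if (typeof ref !== "string") return ref;
      try {
        // Keep the raw string when the lookup finds no match.
        return (await lookup(ref)) ?? ref;
      } catch {
        // Network or service errors: keep the raw string.
        return ref;
      }
    })
  );
  return { ...work, references };
}
```

Passing the lookup in as a parameter keeps the sketch testable offline; a real implementation would presumably default to the `crossref` codec.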
Mixed citations (e.g. originally from bibitems in LaTeX) are not parsed when reading from JATS-XML.
I think the `mixed-content` should actually be `mixed-citation`: https://github.com/stencila/encoda/blob/master/src/codecs/jats/index.ts#L1165
Even nicer would probably be to have all elements of the `mixed-citation` as inline elements to keep e.g. parts in italics.
Maybe this could be filed as `description` or `comment`? https://schema.stenci.la/creativework