Open de-code opened 4 years ago
"Collaboration" authors have been supported in bibliographical references for a few years (the effort was driven by HEP!). It works well, if I remember correctly, but beyond HEP collaborations there are almost no training examples with "consortium" at the moment, so the coverage can't be extended yet.
For the header, they are annotated in the new header training data as <note type="group">:
<byline>
<docAuthor>Zuo-Teng Wang 1 , Shi-Dong Chen 2 , Wei Xu 1 , Ke-Liang Chen 2 , Hui-Fu Wang 3 , Chen-Chen Tan 3 , Mei<lb/> Cui 2 , Qiang Dong 2 , Lan Tan 1,3 , Jin-Tai Yu 2 , </docAuthor>
</byline>
<note type="group">Alzheimer's Disease Neuroimaging Initiative *</note>
and labelled group by the sequence labelling... but they are not present in the output because there's not enough training data yet.
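In the meantime, if someone wants to check how many group authors are already annotated in the training data, something like the following might help. It is only a rough sketch, not part of GROBID: the function name `extract_group_authors` is made up, the `<front>` wrapper is my assumption so that the fragment parses as a single XML document, and the author list is shortened.

```python
import xml.etree.ElementTree as ET

# Fragment modelled on the annotation quoted above. The <front> root is an
# assumption, added only so the snippet is one well-formed XML document.
SAMPLE = """<front>
<byline>
<docAuthor>Zuo-Teng Wang 1 , Shi-Dong Chen 2 , </docAuthor>
</byline>
<note type="group">Alzheimer's Disease Neuroimaging Initiative *</note>
</front>"""


def extract_group_authors(xml_text):
    """Return the cleaned text of every <note type="group"> element."""
    root = ET.fromstring(xml_text)
    groups = []
    for note in root.iter("note"):
        if note.get("type") == "group":
            # Collect all text nodes (child elements such as <lb/> carry none)
            # and collapse runs of whitespace into single spaces.
            text = " ".join("".join(note.itertext()).split())
            # Drop a trailing footnote marker such as "*".
            groups.append(text.rstrip("* "))
    return groups


print(extract_group_authors(SAMPLE))
```

Running this over the header training files would give a quick count of how many "group" examples exist so far.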
There currently doesn't seem to be support for group authors.
Example:
048991v1 (10.1101/048991), with more author groups. PDF:
GROBID 0.6.1 extracted them as affiliations:
Another example with group authors, where GROBID seems to have failed to extract the authors:
269571v1 (10.1101/269571). PDF:
bioRxiv XML:
I couldn't find support for it in the code.