gnames / gnfinder

GNfinder finds scientific names in UTF8 texts, PDF files, MS Word/Excel documents, URLs etc.
MIT License

Detect compound nomen annotations like "nov. gen., nov. spec." #145

Open kbseah opened 1 year ago

kbseah commented 1 year ago

Example, from the title of this journal article: "Pentahymena corticicola nov. gen., nov. spec., a New Colpodid Ciliate (Protozoa, Ciliophora) from Bark of Acacia Trees in Costa Rica"

Detection of nomen annotations is a great feature for finding first descriptions of taxa. I've been experimenting with using it to link publication records on Wikidata to the taxa they describe by parsing the titles, and it looks promising. However, "compound" annotations like the one above are not detected by gnfinder. Detecting them too would be a useful enhancement; they occur quite often in the ciliate literature.
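To illustrate what a compound annotation could look like to a matcher, here is a rough Python sketch. It is not gnfinder's implementation: the regular expression, the list of rank abbreviations, and the function name `find_compound_annotations` are all assumptions made for this example. It treats a compound annotation as two or more single annotations (such as "nov. gen." or "sp. nov.") joined by a comma, "et", "and", or "&".

```python
import re

# Hypothetical rank abbreviations; an assumption, not gnfinder's actual list.
_RANK = r"(?:gen|sp|spec|subsp|comb|fam)"

# One single annotation, in either word order:
#   "nov. gen." / "n. sp."   (novelty marker first)
#   "gen. nov." / "sp. n."   (rank first)
_SINGLE = rf"\b(?:n(?:ov)?\.\s*{_RANK}\.|{_RANK}\.\s*n(?:ov)?\.)"

# A compound annotation: two or more single annotations joined by
# a comma, "et", "and", or "&".
_COMPOUND = re.compile(
    rf"{_SINGLE}(?:\s*(?:,|et|and|&)\s*{_SINGLE})+",
    re.IGNORECASE,
)


def find_compound_annotations(text: str) -> list[str]:
    """Return every compound nomen annotation found in text."""
    return [m.group(0) for m in _COMPOUND.finditer(text)]


title = (
    "Pentahymena corticicola nov. gen., nov. spec., a New Colpodid Ciliate "
    "(Protozoa, Ciliophora) from Bark of Acacia Trees in Costa Rica"
)
print(find_compound_annotations(title))  # ['nov. gen., nov. spec.']
```

A lone annotation such as "sp. nov." is deliberately not matched here, since single annotations are already handled; the sketch only targets the compound case.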

Thanks!

dimus commented 1 year ago

good point @kbseah, thank you for the suggestion