GlobalNamesArchitecture / gnparser

Split scientific names to meaningful elements with meta information
https://parser.globalnames.org/
MIT License

Support "emend." and "emend" #434

Status: closed by dimus 6 years ago

dimus commented 6 years ago

"emend." is an abbreviation of "emendatum", meaning "rectified by". It is used when one author refines the work of another author. Normally "emend." is written with a trailing dot, but the dot is sometimes omitted. Structurally, its treatment is quite similar to that of "ex" and "in". An example from the biodiversity Ruby gem:

Cleome kermesina Gilg & Gilg-Ben. emend. Kers var. plebeia Kers|{"scientificName":{"id":"291ce8f1-0e40-5ed5-afbf-a7a11b0bba59", "parsed":true, "parser_version":"test_version", "verbatim":"Cleome kermesina Gilg & Gilg-Ben. emend. Kers var. plebeia Kers", "normalized":"Cleome kermesina Gilg & Gilg-Ben. emend. Kers var. plebeia Kers", "canonical":"Cleome kermesina plebeia", "hybrid":false, "details":[{"genus":{"string":"Cleome"}, "species":{"string":"kermesina", "authorship":"Gilg & Gilg-Ben. emend. Kers", "basionymAuthorTeam":{"authorTeam":"Gilg & Gilg-Ben.", "author":["Gilg", "Gilg-Ben."], "emendAuthorTeam":{"authorTeam":"Kers", "author":["Kers"]}}}, "infraspecies":[{"string":"plebeia", "rank":"var.", "authorship":"Kers", "basionymAuthorTeam":{"authorTeam":"Kers", "author":["Kers"]}}]}], "parser_run":1, "positions":{"0":["genus", 6], "7":["species", 16], "17":["author_word", 21], "24":["author_word", 33], "41":["author_word", 45], "46":["infraspecific_type", 50], "51":["infraspecies", 58], "59":["author_word", 63]}}}

Possible variants:

Cleome kermesina Gilg & Gilg-Ben. emend. Kers var. plebeia Kers
Cleome kermesina (Gilg & Gilg-Ben. emend. Kers) Kers 1888
Cleome kermesina (Gilg & Gilg-Ben. emend. Kers) Kers emend Roskov 2000
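
The variants above differ in where the marker appears and whether it carries a dot. As a rough illustration only (this is not gnparser code, and `find_emend_markers` is a hypothetical helper), a regular expression that accepts both spellings as a standalone token could look like this:

```python
import re

# Hypothetical sketch, not part of gnparser: match "emend" with an optional
# trailing dot, but only as a standalone token delimited by whitespace.
EMEND_TOKEN = re.compile(r"(?<=\s)emend\.?(?=\s)")


def find_emend_markers(name: str) -> list[str]:
    """Return every emend marker found in a name string, in order."""
    return EMEND_TOKEN.findall(name)


variants = [
    "Cleome kermesina Gilg & Gilg-Ben. emend. Kers var. plebeia Kers",
    "Cleome kermesina (Gilg & Gilg-Ben. emend. Kers) Kers 1888",
    "Cleome kermesina (Gilg & Gilg-Ben. emend. Kers) Kers emend Roskov 2000",
]

for variant in variants:
    print(find_emend_markers(variant))
```

The whitespace lookarounds keep the pattern from firing inside a longer word, so an author name that merely starts with "emend" would not be treated as a marker.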

alexander-myltsev commented 6 years ago

@dimus do you think it could simply be added to this rule? https://github.com/GlobalNamesArchitecture/gnparser/blob/4dc7388c56c808488003ee8712ac4bf640e95198/parser/src/main/scala/org/globalnames/parser/Parser.scala#L505

alexander-myltsev commented 6 years ago

It appears in the same context as authorEx, but it is semantically different, so it would be better to introduce a separate rule.
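
To make the "separate rule" idea concrete, here is a minimal sketch in Python rather than in the parser's own Scala grammar. It assumes "ex" and "emend" share the same position between author teams but are reported under different keys. The `emendAuthorTeam` key comes from the biodiversity output quoted above; the `exAuthorTeam` key and the helper names `team` and `parse_authorship` are assumptions made for illustration.

```python
import re

# Two separate patterns for the same syntactic slot. Keeping them apart
# lets each marker map to its own output key instead of sharing one rule.
EMEND = re.compile(r"\s+emend\.?\s+")
EX = re.compile(r"\s+ex\.?\s+")


def team(text: str) -> dict:
    """Describe an author team; authors are split on '&' and ','."""
    text = text.strip()
    authors = [a.strip() for a in re.split(r"\s*(?:&|,)\s*", text) if a.strip()]
    return {"authorTeam": text, "author": authors}


def parse_authorship(authorship: str) -> dict:
    """Split off an emend team and an ex team, if present (simplified)."""
    rest = authorship.strip()
    qualifiers = {}
    # Hypothetical output keys; "emend" is peeled off first, then "ex".
    for key, pattern in (("emendAuthorTeam", EMEND), ("exAuthorTeam", EX)):
        parts = pattern.split(rest, maxsplit=1)
        if len(parts) == 2:
            rest, qualifiers[key] = parts[0], team(parts[1])
    result = team(rest)
    result.update(qualifiers)
    return result


print(parse_authorship("Gilg & Gilg-Ben. emend. Kers"))
```

This sketch ignores years, parentheses, and the case where "ex" follows "emend."; a real grammar rule would need to handle those.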