gbif / name-parser

The core GBIF scientific name parser library
Apache License 2.0

How to treat sic comments #66

Closed mdoering closed 3 years ago

mdoering commented 4 years ago

Original spellings are sometimes given in a bracketed comment prefixed with "sic", e.g. Turbo porphyrites [sic, porphyria]

Should we parse these out? Or leave them in the unparsed part, which would mean the comment has to come at the end of the name?

mdoering commented 4 years ago

see also #33

mdoering commented 3 years ago

This example really contains 2 names. If it is parsed into a single name, I would propose to omit the original spelling and treat it just as if it were Turbo porphyrites. Splitting this into 2 related names should be done at a different level than name parsing, which targets a single name only.
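
To illustrate the proposal, here is a minimal standalone sketch in Python. It is not the name-parser library's API or implementation. The function name `strip_sic` and the pattern `SIC_COMMENT` are hypothetical, and the pattern assumes the comment is enclosed in square or round brackets and starts with "sic". It removes the comment before parsing and returns the noted spelling separately, so a later step could relate the two names.

```python
import re
from typing import List, Tuple

# Hypothetical pattern (an assumption, not taken from name-parser):
# a bracketed or parenthesised comment that starts with "sic",
# optionally followed by punctuation and the noted spelling.
SIC_COMMENT = re.compile(r"\s*[\[(]\s*sic\b[.,:!]?\s*([^\])]*)[\])]", re.IGNORECASE)


def strip_sic(name: str) -> Tuple[str, List[str]]:
    """Return the name without sic comments, plus any spellings noted in them."""
    # Collect the text that follows "sic" inside each comment.
    noted = [m.group(1).strip() for m in SIC_COMMENT.finditer(name)]
    # Drop the comments and normalise the remaining whitespace.
    cleaned = re.sub(r"\s+", " ", SIC_COMMENT.sub("", name)).strip()
    # A bare "[sic]" carries no spelling, so empty entries are filtered out.
    return cleaned, [n for n in noted if n]


cleaned, noted = strip_sic("Turbo porphyrites [sic, porphyria]")
print(cleaned)  # Turbo porphyrites
print(noted)    # ['porphyria']
```

With this approach the comment does not need to be at the end of the string, since it is removed wherever it occurs before the remaining name is parsed.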