globalbioticinteractions / nomer

maps identifiers and names to other identifiers and names
GNU General Public License v3.0

Unrecognized Hybrids in Taxonomy #136

Open jtmiller28 opened 1 year ago

jtmiller28 commented 1 year ago

I was hoping to start a discussion on what to do in datasets with hybrids that are not currently recognized within catalogues. It seems there are some well-known groups that hybridize, with a wealth of literature behind them; however, their scientific names are not structured in a way that allows for resolution at large scale. Hybrids are of particular interest in biogeography, as they provide a means of testing questions about interaction intermediates, ecological viability, distribution compared to parent populations, etc., so dropping them when resolving taxonomy would be a rather impactful loss.

The issue first appears in parsing, where the format of the name can be either:
genus + specificEpithet × genus + specificEpithet
OR
genus + specificEpithet × specificEpithet

While gn-parse seems to be capable of handling this type of name, gbif-parse seems to fail. I'll show the first name example, but the other variation yields the same results.

echo -e "\tTragopogon dubius × Tragopogon porrifolius" | nomer replace gn-parse
Tragopogon dubius × Tragopogon porrifolius

echo -e "\tTragopogon dubius × Tragopogon porrifolius" | nomer replace gbif-parse
java.lang.RuntimeException: failed to apply taxon

I've checked this with other tools (the GBIF name parser web page and R), and names of this structure appear to fail to parse there as well. GBIF appears to take a conservative approach to this issue, processing the name to a higher taxonomic rank (genus: Tragopogon). Example here: https://www.gbif.org/occurrence/2573489546

Even when parsing succeeds, a catalogue is unlikely to have a hybrid as a registered name outside of cultivars.

echo -e "\tTragopogon dubius × Tragopogon porrifolius" | nomer replace gn-parse | nomer append wfo
Tragopogon dubius × Tragopogon porrifolius NONE Tragopogon dubius × Tragopogon porrifolius

While not surprising, as species concepts get very muddled here, this means that the wfo catalogue cannot resolve these types of names. There are some hybrids registered in the WFO catalogue; however, at least to me, they don't appear to be standardized in a way that fits how name alignment usually proceeds. See, for example: http://www.worldfloraonline.org/taxon/wfo-4000042576 This is further complicated by the fact that not all groups have registered hybrids in their taxonomy, as is the case for Tragopogon.

I am wondering if a way around this would be to break the name into two pieces, (Genus + specificEpithet) and (Genus + specificEpithet), and resolve each individually. If both are valid scientificNames, we could create a "confident" hybrid? This isn't ideal, but it would allow for finer-grained resolution of names of this nature. Just my thoughts; curious to hear if anyone has encountered this issue and has any ideas on resolving names of this composition.
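
To make the splitting idea concrete, here is a rough Python sketch. It is not part of nomer and does not call nomer: hybrid_parents, resolve_hybrid, and the lookup argument are hypothetical names, and the toy catalogue with its EXAMPLE: identifiers is a made-up stand-in for a real lookup (for instance, a wrapper around a catalogue match). It assumes names carry no authorship and that the hybrid marker is "×" or a lone "x" surrounded by whitespace.

```python
import re

# Assumption: the hybrid marker is "×" (or a lone "x"/"X") with whitespace on both sides.
HYBRID_MARKER = re.compile(r"\s+[×xX]\s+")


def hybrid_parents(name):
    """Split a hybrid formula into its two parent binomials.

    Handles both forms from the issue:
      genus epithet × genus epithet
      genus epithet × epithet   (second genus implied by the first)
    Returns a tuple of two names, or None if the name does not fit either form.
    """
    parts = HYBRID_MARKER.split(name.strip())
    if len(parts) != 2:
        return None
    first = parts[0].split()
    second = parts[1].split()
    if len(first) != 2:
        return None
    if len(second) == 1:
        # Abbreviated form: reuse the genus of the first parent.
        second = [first[0], second[0]]
    if len(second) != 2:
        return None
    return (" ".join(first), " ".join(second))


def resolve_hybrid(name, lookup):
    """Resolve each parent separately with `lookup` (name -> identifier or None).

    The hybrid is flagged "confident" only if both parents resolve.
    Returns None when the name is not a recognizable hybrid formula.
    """
    parents = hybrid_parents(name)
    if parents is None:
        return None
    resolved = [(parent, lookup(parent)) for parent in parents]
    return {
        "parents": resolved,
        "confident": all(identifier is not None for _, identifier in resolved),
    }


# Hypothetical toy catalogue; the identifiers are placeholders, not real WFO ids.
toy_catalogue = {
    "Tragopogon dubius": "EXAMPLE:1",
    "Tragopogon porrifolius": "EXAMPLE:2",
}

result = resolve_hybrid("Tragopogon dubius × Tragopogon porrifolius", toy_catalogue.get)
print(result)
```

Requiring both parents to resolve keeps the "confident" flag conservative: a hybrid with one unknown parent still reports which parent matched, but is not treated as resolved.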