CatalogueOfLife / data

Repository for COL content

Subspecific epithet "ex" not allowed #140

Open aoern opened 4 years ago

aoern commented 4 years ago

@yroskov @gdower

The following taxon has been deleted from the CoL database Jun 2020: Agrodiaetus transcaspica ex Forster, 1956 (GloBIS (GART))

Possibly a parsing error ("ex" not accepted as a subspecific epithet)?

yroskov commented 4 years ago

@mdoering Markus, I cannot find a reason why Agrodiaetus transcaspica ex was excluded from CoL: https://data.catalogue.life/catalogue/3/dataset/1046/workbench?facet=rank&facet=issue&facet=status&facet=nomstatus&facet=type&facet=field&limit=50&offset=0&q=Agrodiaetus%20transcaspica%20ex

yroskov commented 4 years ago

Let me check with @dhobern the validity of the name according to the ICZN:

Donald, could the word "ex" be used as a subspecies/species epithet in zoology? http://www.catalogueoflife.org/annual-checklist/2019/details/species/id/94526a0d02976cea769b4bf608c5f984

I failed to find this taxon in GloBIS (hmm, is the project dead?); LepIndex has it: https://www.nhm.ac.uk/our-science/data/lepindex/detail/?taxonno=201486&&snoc=ex&search_type=starts&genus=agrodiaetus&sort=snoc&indexed_from=1&page_no=1&page_size=30&path=advanced

dhobern commented 4 years ago

This is an odd case. The code does not prevent authors from using 2-character epithets and there is no list of excluded epithets, so I can see no reason why A. t. ex would be invalid.

However, given the fact that "ex" is standardly used alongside names to refer to the source for the name, it did look odd. Fortunately Zobodat has the paper:

https://www.zobodat.at/publikation_articles.php?id=80724

Click on the title to get the PDF. The paper is in German, but the immediate clue is in comparing the names published in the paper:

This isn't published as a new name. It simply says that there is another subspecies known from Wan in Kurdistan (now Van in Turkey). Forster writes, "Diese Form aus dem Gebiete von Wan, sicherlich eine ausgeprägte Subspecies, kann erst näher charakterisiert werden, wenn mehr Material aus Kurdistan zum Vergleich vorliegt" - more material is needed before the subspecies can be described.

So the LepIndex card was based on a misinterpretation of the original paper. Christoph Häuser et al. are working on a replacement for GART. They have specifically said that they see no point in chasing down all of the historical subspecific names for butterflies, since there are so very many and most were never reused and are effectively meaningless. This paper by Forster is rather different since, e.g., the first named subspecies is now considered to be a full species, Polyommatus ninae (Forster, 1956).

mdoering commented 4 years ago

From a technical point of view, the name parser digests the name wrongly, making it a species:

https://api.catalogue.life/parser/name?name=Agrodiaetus%20transcaspica%20ex%20Forster,%201956
[
  {
    "name": {
      "scientificName": "Agrodiaetus transcaspica",
      "rank": "species",
      "genus": "Agrodiaetus",
      "specificEpithet": "transcaspica",
      "type": "scientific",
      "parsed": true,
      "labelHtml": "<i>Agrodiaetus</i> <i>transcaspica</i>"
    },
    "issues": [
      "partially parsable name"
    ]
  }
]

Now that we have the parser configs, we should teach it to parse this name as a subspecies. I will try that out today. We should look out for more good names using ex as an epithet.
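As a rough sketch only, one way to spot this kind of silent truncation is to compare the raw input with the `scientificName` that comes back. The endpoint and the response fields are the ones shown above; the helper names (`parser_url`, `is_partial`, `lost_epithets`) are made up for illustration and are not part of any CoL library.

```python
from urllib.parse import quote

# Endpoint as quoted in this thread.
PARSER_ENDPOINT = "https://api.catalogue.life/parser/name"


def parser_url(raw_name: str) -> str:
    """Build the parser request URL for a verbatim name string."""
    # Keep the comma unescaped so the URL matches the form used above.
    return f"{PARSER_ENDPOINT}?name={quote(raw_name, safe=',')}"


def is_partial(result: dict) -> bool:
    """True if the parser reported that it could only parse part of the name."""
    return "partially parsable name" in result.get("issues", [])


def lost_epithets(raw_name: str, result: dict) -> list:
    """Lowercase, purely alphabetic input tokens missing from the parsed name."""
    kept = set(result["name"]["scientificName"].split())
    tokens = [t.strip(",") for t in raw_name.split()]
    return [t for t in tokens if t.isalpha() and t.islower() and t not in kept]


# Trimmed copy of the response shown above (one element of the returned list).
sample = {
    "name": {
        "scientificName": "Agrodiaetus transcaspica",
        "rank": "species",
        "parsed": True,
    },
    "issues": ["partially parsable name"],
}
raw = "Agrodiaetus transcaspica ex Forster, 1956"

print(parser_url(raw))
print(is_partial(sample), lost_epithets(raw, sample))
```

A name that is both partially parsed and has a lost lowercase token is a candidate for a manual parser config entry.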

dhobern commented 4 years ago

Just to be clear (as I tried to explain) - this is NOT a good name. If you are talking just about fixing the parser to solve failings around names structured "[A-Z][a-z]+ [a-z]-?[a-z]+ [a-z]{2}", then go ahead, but don't do it because you assume that there are names with "ex" as a good epithet. As a general assumption, this is unlikely as it is such a shockingly stupid choice.
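For what it's worth, a minimal sketch of how names with that structure could be picked out of a list. The pattern is the one quoted above; the anchoring at the start, the lookahead that makes the two letters a whole token, and the function names are my own assumptions.

```python
import re

# Pattern from the comment above, anchored at the start of the string.
# The lookahead requires the two-letter token to end at whitespace or at
# the end of the string, so longer epithets starting with "ex" are skipped.
TWO_LETTER_EPITHET = re.compile(r"^[A-Z][a-z]+ [a-z]-?[a-z]+ [a-z]{2}(?=\s|$)")


def has_two_letter_epithet(name: str) -> bool:
    """True if the name looks like 'Genus species xx', optionally followed by more text."""
    return TWO_LETTER_EPITHET.search(name) is not None


def flag_names(names):
    """Return the names that need a closer look by an editor."""
    return [n for n in names if has_two_letter_epithet(n)]


candidates = [
    "Agrodiaetus transcaspica ex Forster, 1956",
    "Polyommatus ninae (Forster, 1956)",
    "Agrodiaetus transcaspica exilis",  # made-up string, only to test the lookahead
]
print(flag_names(candidates))
```

This only finds names for review. It says nothing about whether a flagged name is available or valid.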

mdoering commented 4 years ago

Yes, but I would still like to see our systems behave well with these bad names. And we now have a mechanism to override wrong parsing results using manual configurations. That's hardly any extra work, and it was built exactly to avoid changing the parser itself for a few exceptional names.
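To illustrate the idea (not the actual CoL implementation, whose internals are not shown in this thread), a manual override can be thought of as a lookup consulted before the generic parser runs. Everything below is hypothetical: the `OVERRIDES` table, `parse_with_overrides`, and the stub parser exist only for this sketch.

```python
from typing import Callable, Dict

ParsedName = Dict[str, object]

# Hypothetical override table keyed by the whitespace-normalised verbatim name.
OVERRIDES: Dict[str, ParsedName] = {
    "Agrodiaetus transcaspica ex Forster, 1956": {
        "genus": "Agrodiaetus",
        "specificEpithet": "transcaspica",
        "infraspecificEpithet": "ex",
        "rank": "subspecies",
        "authorship": "Forster, 1956",
    },
}


def parse_with_overrides(
    raw_name: str,
    generic_parser: Callable[[str], ParsedName],
    overrides: Dict[str, ParsedName] = OVERRIDES,
) -> ParsedName:
    """Return the manual result if one exists, otherwise call the generic parser."""
    key = " ".join(raw_name.split())  # collapse repeated whitespace
    if key in overrides:
        return dict(overrides[key])  # copy so callers cannot mutate the table
    return generic_parser(raw_name)


def stub_parser(raw_name: str) -> ParsedName:
    """Stand-in for the real parser; mimics the wrong 'species' result."""
    return {"rank": "species", "scientificName": " ".join(raw_name.split()[:2])}


print(parse_with_overrides("Agrodiaetus  transcaspica ex Forster, 1956", stub_parser))
```

The point of this design is that the generic parsing rules stay untouched and the exceptional names live in data that editors can change.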

mdoering commented 4 years ago

added to parser: https://api.catalogue.life/parser/name/config?q=Agrodiaetus

now it parses fine: https://api.catalogue.life/parser/name?name=Agrodiaetus%20transcaspica%20ex%20Forster,%201956

gdower commented 4 years ago

I think the infraspecificEpithet should be set to "" because ex is not a name. The parser is going to be used primarily by machines that won't read the remarks and most likely won't forward the parser remarks to whatever service uses the output, so ex will become a name even though it's just pointing to a reference that says the OTU exists but still isn't described yet. I suggest parsing it like this:

[
  {
    "name": {
      "scientificName": "Agrodiaetus transcaspica ex",
      "authorship": "Forster, 1956",
      "rank": "subspecies",
      "genus": "Agrodiaetus",
      "specificEpithet": "transcaspica",
      "infraspecificEpithet": "",
      "combinationAuthorship": {
        "authors": [
          "Forster"
        ],
        "year": "1956"
      },
      "code": "zoological",
      "type": "scientific",
      "remarks": "ex isn't published as a new name. The publication by Forster simply says that there is another subspecies known from Wan in Kurdistan (ex Wan).",
      "parsed": true
    }
  }
]

mdoering commented 4 years ago

Good point. But parsing the name as a subspecies with ex does not mean it has to be an accepted or even available name. For CoL I guess this name should be blocked. But when A. t. ex appears in the GloBIS dataset, isn't it (wrongly) treated as a subspecies there, and should it therefore not be parsed like that? Judging whether the name is available, or whatever status it has, is a different story, to be dealt with through regular editorial decisions. Otherwise we might have curious situations where the source claims it's a subspecies and its parent is a species, but the parser then says it's a species, linked to another species.
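A small sketch of how such mismatches could be surfaced, assuming each source record carries its own `rank` next to the verbatim name. The record layout, `rank_conflicts`, and the stub parser are assumptions for illustration only.

```python
from typing import Callable, Dict, List, Tuple


def rank_conflicts(
    records: List[Dict[str, str]],
    parse: Callable[[str], Dict[str, str]],
) -> List[Tuple[str, str, str]]:
    """List (name, source rank, parsed rank) wherever the two ranks disagree."""
    conflicts = []
    for rec in records:
        parsed_rank = parse(rec["scientificName"]).get("rank", "")
        if parsed_rank != rec["rank"]:
            conflicts.append((rec["scientificName"], rec["rank"], parsed_rank))
    return conflicts


def stub_parse(name: str) -> Dict[str, str]:
    """Stand-in parser that drops 'ex', as in the first response in this thread."""
    tokens = [t for t in name.split() if t.isalpha() and t.islower() and t != "ex"]
    return {"rank": "subspecies" if len(tokens) >= 2 else "species"}


records = [
    {"scientificName": "Agrodiaetus transcaspica ex Forster, 1956", "rank": "subspecies"},
    {"scientificName": "Polyommatus ninae (Forster, 1956)", "rank": "species"},
]
print(rank_conflicts(records, stub_parse))
```

Any record listed here is one where the source's classification and the parser output would contradict each other downstream.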

mdoering commented 4 years ago

e.g. when you process this record: https://www.nhm.ac.uk/our-science/data/lepindex/detail/?taxonno=201486&&snoc=ex&search_type=starts&genus=agrodiaetus&sort=snoc&indexed_from=1&page_no=1&page_size=30&path=advanced

It should be parsed as a subspecies with ex.

mdoering commented 4 years ago

This source lists both ex Wan and ex :) http://ftp.funet.fi/index/Science/bio/life/insecta/lepidoptera/ditrysia/papilionoidea/lycaenidae/polyommatinae/polyommatus/index.html#transcaspica