factsmission / synospecies

Using Plazi Data to find currently accepted scientific names
https://synospecies.plazi.org/
MIT License

wrong authorities - taxonomic name X authority combination #134

Open myrmoteras opened 7 months ago

myrmoteras commented 7 months ago

@retog @nleanba here are problems with taxonomic name × authority combinations, exemplified for T. rex by @millerjeremya.
Can you please look into this and help us understand why this occurs and how we can fix it?

One factor, @flsimoes, could be that the import and conversion do not distinguish between the botanical and zoological codes.

Maybe we can spend some time on this topic this coming Thursday in Berlin?

Hi Donat, Sure, let's work through the examples. (I ended up going through all of them in the Tyrannosaurus rex search). What else would be helpful?

here is the XLS including all the cases below

First is Tyrannosaurus rex, described by Osborn in 1905 in the genus Tyrannosaurus. By conventions in zoology, we would write that authority as Osborn, 1905 without parentheses. So, the entry in Synospecies that appears as Tyrannosaurus rex (Osborn, 1905) / Osborn, 1905 should be Tyrannosaurus rex Osborn, 1905

Albertosaurus megagracilis Paul, 1988 The species megagracilis was described by Paul in 1988 in the genus Albertosaurus, so this is correct as it currently appears on Synospecies.

Aublysodon mirandus (Leidy, 1868) / Leidy, 1868 The species mirandus was described by Leidy in 1868 in the genus Aublysodon, so this should appear as: Aublysodon mirandus Leidy, 1868

Aublysodon molnari Paul, 1988 This is a modification of the spelling of the species molnaris described by Paul in 1988 in the genus Aublysodon; this is correct as it currently appears on Synospecies.

Aublysodon molnaris Paul, 1988 This species was described by Paul in 1988 in the genus Aublysodon so this is correct as it currently appears on Synospecies.

Dinotyrannus megagracilis (Paul, 1988) Olshevsky, 1995 / (Paul, 1988) The species megagracilis was described by Paul in 1988 in the genus Albertosaurus. So this is the first case in the list where the authority should be in parentheses: Dinotyrannus megagracilis (Paul, 1988)

Dynamosaurus imperiosus (Osborn, 1905) / Osborn, 1905 The species imperiosus was described by Osborn in 1905 in the genus Dynamosaurus, so this should appear as: Dynamosaurus imperiosus Osborn, 1905

Gorgosaurus lancensis Gilmore, 1946 The species lancensis was described by Gilmore in 1946 in the genus Gorgosaurus so this is correct as it currently appears on Synospecies.

Manospondylus gigas Cope, 1892 This species was described by Cope in 1892 in the genus Manospondylus so this is correct as it currently appears on Synospecies.

Nanotyrannus lancensis (Gilmore, 1946) / Gilmore, 1946 / (Gilmore, 1946) Bak... The species lancensis was described by Gilmore in 1946 in the genus Gorgosaurus, so this is a case where the authority should be in parentheses: Nanotyrannus lancensis (Gilmore, 1946)

Stygivenator molneri (Paul, 1988) The species molneri was described by Paul in 1988 in the genus Aublysodon, so this is a case where the authority should be in parentheses; this is correct as it currently appears on Synospecies.

Albertosaurus cfalancensis (Gilmore, 1946) Molnar, 1980 Here, something has gone wrong. I include a screenshot of the treatment being cited, with a citation to Albertosaurus cf. A. lancensis. This conveys that the species is similar to (/ possibly the same as) Albertosaurus lancensis, which was described by Gilmore in 1946 in the genus Gorgosaurus. So this should appear as: Albertosaurus cf. A. lancensis (Gilmore, 1946)

[Screenshot of the cited treatment, showing the citation to Albertosaurus cf. A. lancensis]

Switching to brief format: Albertosaurus lancensis (Gilmore, 1946) Russell, 1970 / (Gilmore, 1946) should be Albertosaurus lancensis (Gilmore, 1946)

Stygivenator molnari Olshevsky, Ford, & Yamamoto, 1995 should be Stygivenator molnari (Paul, 1988)

Aublysodon horridus (Leidy, 1856) is correct.

Deinodon horridus Leidy, 1856 is correct.

Aublysodon amplus Marsh, 1892 is correct.

Aublysodon cristatus Marsh, 1892 is correct.

Ornithomimus altus Lambe, 1902 is correct.

Nanotyrannus lancensis (Bakker, Williams, & Currie, 1988) should be Nanotyrannus lancensis (Gilmore, 1946)

Stygivenator cristatus (Marsh, 1892) is correct.

Tyrannosaurus sp (Osborn, 1905) Matthew & Brown, 1922 Debatable case. Since there is not an explicit species epithet, the usual authority conventions do not apply. This treatment is a concept articulated in Matthew & Brown, 1922 with a tentative or approximate link to nomenclature. Perhaps in cases like this, we could apply square brackets: Tyrannosaurus sp [Matthew & Brown, 1922] But this is not linked to accepted conventions in zoology, so maybe we should consult with some experts in nomenclature.

Struthiomimus altus Osborn, 1917 Here again something has gone wrong. The species altus was described by Lambe in 1902 in the genus Ornithomimus, so this should appear as: Struthiomimus altus (Lambe, 1902); I don't know where Osborn 1917 came from.

Stygivenator amplus (Marsh, 1892) is correct.

Dinodon horridus Leidy, 1856 is correct.

Aublysodon lateralis Cope, 1876 / (Cope, 1876) should be Aublysodon lateralis Cope, 1876

Dryptosaurus kenabekides Hay, 1899

Laelaps hazenianus (Hay, 1902) should be Laelaps hazenianus Cope, 1876; I don't know where Hay, 1902 came from.

Laelaps incrassatus (Cope, 1876) / Cope, 1876 should be Laelaps incrassatus Cope, 1876

Ornithomimus grandis Marsh, 1890 is correct.

Deinodon cristatus (Marsh, 1892) is correct.

Struthiomimus altus (Lambe, 1902) is correct.

Deinodon amplus (Marsh, 1892) is correct.

Manospondylus amplus (Marsh, 1892) is correct.

Tyrannosaurus amplus (Marsh, 1892) is correct.

Laelaps hazenianus Cope, 1876 is correct.

Laelaps incrassatus Cope, 1892 is correct.

Albertosaurus sarcophagus Osborn, 1905 is correct.

Deinodon sarcophagus (Osborn, 1905) is correct.

Albertosaurus arctunguis Parks, 1928 is correct.

Dryptosaurus incrassatus (Cope, 1892) Lambe, 1904 should be Dryptosaurus incrassatus (Cope, 1892)
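The zoological convention applied throughout the cases above can be summarized in a few lines. This is only an illustrative sketch: the function name and its parameters are hypothetical, and it assumes that the original genus and the current genus are both known, which is exactly the information a single annotation may lack.

```python
def format_zoological_authority(author: str, year: int,
                                original_genus: str, current_genus: str) -> str:
    """Format an authority following the zoological convention used above.

    Parentheses are added only when the species is now placed in a genus
    other than the one in which it was originally described.
    """
    authority = f"{author}, {year}"
    if original_genus != current_genus:
        return f"({authority})"
    return authority


# Described by Osborn in Tyrannosaurus and still placed there: no parentheses.
print("Tyrannosaurus rex",
      format_zoological_authority("Osborn", 1905, "Tyrannosaurus", "Tyrannosaurus"))
# Described by Paul in Albertosaurus, later moved to Dinotyrannus: parentheses.
print("Dinotyrannus megagracilis",
      format_zoological_authority("Paul", 1988, "Albertosaurus", "Dinotyrannus"))
```

Under this rule the author of the new combination (e.g. Olshevsky, 1995) is never shown, which matches the corrections proposed in the list.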

nleanba commented 7 months ago

I am open to suggestions if this should be changed, but this is how it works currently:

Every Taxon Concept is annotated in the RDF with a dwc:scientificNameAuthorship property. If multiple differing values are present (from different treatments, more below), SynoSpecies shows them all as a /-separated list.

dwc:scientificNameAuthorship is generated from the XML annotation, specifically from the authorityName, authorityYear, baseAuthorityName and baseAuthorityYear properties if they are present. Otherwise, it uses the authority property. As a last resort, and only if it is a defining treatment, it will use the treatment authors and year to construct an authority.
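The fallback order described above could look roughly like the following. This is a hypothetical sketch and not the gg2rdf code (which is TypeScript and also normalizes the values): the function name, the dictionary input, and the assumption that a base authority is rendered in parentheses ahead of the authority are illustrative only.

```python
from typing import Optional


def build_authorship(attrs: dict,
                     is_defining_treatment: bool = False,
                     treatment_authors: Optional[str] = None,
                     treatment_year: Optional[str] = None) -> Optional[str]:
    """Hypothetical sketch of the fallback order; not the gg2rdf implementation."""

    def join(name: str, year: Optional[str]) -> str:
        return f"{name}, {year}" if year else name

    parts = []
    # 1. Structured properties. Assumption: the base authority goes in parentheses.
    if attrs.get("baseAuthorityName"):
        parts.append(f"({join(attrs['baseAuthorityName'], attrs.get('baseAuthorityYear'))})")
    if attrs.get("authorityName"):
        parts.append(join(attrs["authorityName"], attrs.get("authorityYear")))
    if parts:
        return " ".join(parts)

    # 2. Otherwise, the free-text authority property.
    if attrs.get("authority"):
        return attrs["authority"]

    # 3. Last resort, defining treatments only: treatment authors and year.
    if is_defining_treatment and treatment_authors:
        return join(treatment_authors, treatment_year)
    return None


print(build_authorship({"authorityName": "Osborn", "authorityYear": "1905"}))
print(build_authorship({"baseAuthorityName": "Osborn", "baseAuthorityYear": "1905"}))
```

The two calls differ only in which property carries "Osborn", yet one yields the authority without parentheses and the other with them.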

This process involves quite a bit of normalization. However, if different treatments have "incompatible" annotations, it will produce varying output, leading to /s in SynoSpecies (e.g. some treatments list Osborn 1905 as baseAuthority for T. rex, while others list it as authority, so parentheses are added inconsistently).
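To illustrate how differing per-treatment values turn into the /-separated display, here is a minimal sketch. The function name is hypothetical, and keeping first-seen order is an assumption; the actual ordering in SynoSpecies may differ.

```python
def displayed_authorship(values_from_treatments: list) -> str:
    """Join the distinct authorship strings of one taxon concept with ' / '."""
    distinct = []
    for value in values_from_treatments:
        if value not in distinct:  # keep first-seen order, drop exact repeats
            distinct.append(value)
    return " / ".join(distinct)


# Three treatments: one annotates Osborn as baseAuthority, two as authority.
values = ["(Osborn, 1905)", "Osborn, 1905", "Osborn, 1905"]
print(displayed_authorship(values))  # (Osborn, 1905) / Osborn, 1905
```

Identical strings collapse into one entry, so a "/" only appears when the annotations disagree.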

In my opinion, this is mostly an issue in the XML annotations, as gg2rdf has no way of figuring out from looking at a single treatment if it should accept its assertions about (base)authority.

The following SPARQL query lists all different combinations of authority-properties for T.rex:

PREFIX dwc: <http://rs.tdwg.org/dwc/terms/>
SELECT DISTINCT ?rdf_file ?displayed ?authority_in_xml ?authorityName_in_xml ?authorityYear_in_xml ?baseAuthority_in_xml ?baseAuthorityName_in_xml ?baseAuthorityYear_in_xml WHERE {
  BIND(<http://taxon-concept.plazi.org/id/Animalia/Tyrannosaurus_rex_Osborn_1905> as ?tc)
  GRAPH ?graph {
    ?tc dwc:scientificNameAuthorship ?displayed .
    OPTIONAL { ?tc dwc:authority ?authority_in_xml . }
    OPTIONAL { ?tc dwc:authorityName ?authorityName_in_xml . }
    OPTIONAL { ?tc dwc:authorityYear ?authorityYear_in_xml . }
    OPTIONAL { ?tc dwc:baseAuthority ?baseAuthority_in_xml . }
    OPTIONAL { ?tc dwc:baseAuthorityName ?baseAuthorityName_in_xml . }
    OPTIONAL { ?tc dwc:baseAuthorityYear ?baseAuthorityYear_in_xml . }
    BIND(IRI(CONCAT(STR(?graph),".ttl")) as ?rdf_file)
  }
}

(Run it in the Advanced tab of SynoSpecies. It only works with the https://treatment.ld.plazi.org/sparql SPARQL endpoint, as Lindas does not include the information needed to find the originating RDF/XML files, which are listed in the ?rdf_file column.)

This query can be adapted to show the same information for other taxa by replacing the URL in the third line (the first BIND). The relevant URL can be found by expanding the "Justification" on the taxon concept and copying the URL of the link with the name.

E.g. for Struthiomimus altus:

Struthiomimus altus Osborn, 1917 Here again something has gone wrong. The species altus was described by Lambe in 1902 in the genus Ornithomimus, so this should appear as: Struthiomimus altus (Lambe, 1902); I don't know where Osborn 1917 came from.

It tells us that https://treatment.plazi.org/id/D35787D0FF90157AEF43FC8EFD33FCFB is the culprit, with authorityName: Osborn and -Year: 1917 annotated in the XML.
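For checking many taxa, the adaptation of the query described above could be scripted. The sketch below only builds the query text and a GET request URL; it sends nothing. The helper names are hypothetical, and it assumes the endpoint accepts the standard SPARQL protocol `query` parameter.

```python
from urllib.parse import urlencode

ENDPOINT = "https://treatment.ld.plazi.org/sparql"
PROPERTIES = ["authority", "authorityName", "authorityYear",
              "baseAuthority", "baseAuthorityName", "baseAuthorityYear"]


def authority_query(taxon_concept_url: str) -> str:
    """Return the query from this thread with the taxon-concept URL swapped in."""
    variables = " ".join(f"?{p}_in_xml" for p in PROPERTIES)
    optionals = "\n".join(
        f"    OPTIONAL {{ ?tc dwc:{p} ?{p}_in_xml . }}" for p in PROPERTIES
    )
    return (
        "PREFIX dwc: <http://rs.tdwg.org/dwc/terms/>\n"
        f"SELECT DISTINCT ?rdf_file ?displayed {variables} WHERE {{\n"
        f"  BIND(<{taxon_concept_url}> as ?tc)\n"
        "  GRAPH ?graph {\n"
        "    ?tc dwc:scientificNameAuthorship ?displayed .\n"
        f"{optionals}\n"
        '    BIND(IRI(CONCAT(STR(?graph),".ttl")) as ?rdf_file)\n'
        "  }\n"
        "}\n"
    )


def request_url(taxon_concept_url: str) -> str:
    """Build a GET URL for the query (assumes the standard `query` parameter)."""
    return ENDPOINT + "?" + urlencode({"query": authority_query(taxon_concept_url)})


trex = "http://taxon-concept.plazi.org/id/Animalia/Tyrannosaurus_rex_Osborn_1905"
print(authority_query(trex))
```

The printed query can be pasted into the Advanced tab as before, or the URL from `request_url` can be fetched with any HTTP client.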

Regarding the "cfalancensis":

Albertosaurus cfalancensis (Gilmore, 1946) Molnar, 1980

This is a separate and unrelated bug in how SynoSpecies displays taxon names; I will open a separate issue for it. → #137

(Relevant code generating the authority info for RDF: https://github.com/plazi/gg2rdf/blob/6c22085fff91dcf6a86723ffe12bf907591c7009/src/gg2rdf.ts#L495)

nleanba commented 7 months ago

Addendum: For Animalia, is it desirable that only the baseAuthority_ is displayed in cases where both authority_ and baseAuthority_ are given?
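If the answer is yes, the display rule could be as small as the sketch below. It is an assumption-laden illustration: the function name is hypothetical, and it works on the already rendered string, whereas a real fix would more likely act on the structured properties.

```python
import re

# A parenthesised base authority followed by further text (the combining authority).
_BASE_THEN_COMBINING = re.compile(r"^(\([^()]+\))\s+\S.*$")


def zoological_display(authorship: str) -> str:
    """Keep only the parenthesised base authority when a combining authority follows."""
    text = authorship.strip()
    match = _BASE_THEN_COMBINING.match(text)
    return match.group(1) if match else text


print(zoological_display("(Paul, 1988) Olshevsky, 1995"))  # (Paul, 1988)
print(zoological_display("Osborn, 1905"))                  # Osborn, 1905
```

Strings without a leading parenthesised part are returned unchanged, so this rule alone would not correct cases where the parentheses themselves are wrong.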

See also https://github.com/plazi/names_LOD/issues/154