CatalogueOfLife / xcol

Working towards the extended Catalogue of Life Checklist

Same genus (from the same source) included more than once #87

Closed DianRHR closed 3 weeks ago

DianRHR commented 1 year ago

The genus Mycetochara (Tenebrionidae) was included more than once in the xrelease W38, even though the original source has it only once, with 12 child species: image

In contrast, other genera in the same family were merged correctly, whether they came from different sources (first image) or from just one source (second image): image

image

camiplata commented 1 year ago

Mycetochara, Myrmechixenus, and Nautes did not merge in the last release:

Screenshot 2023-11-02 at 9:58:44 a.m. Screenshot 2023-11-02 at 9:59:13 a.m.

But Micrarmalia did:

Screenshot 2023-11-02 at 10:00:10 a.m.

mdoering commented 11 months ago

Not any longer. @camiplata please verify: https://www.dev.checklistbank.org/dataset/275521/names?facet=rank&facet=issue&facet=status&facet=nomStatus&facet=nameType&facet=field&facet=authorship&facet=authorshipYear&facet=extinct&facet=environment&facet=origin&limit=50&offset=0&q=Mycetochara&sortBy=taxonomic

camiplata commented 9 months ago

In the January xrelease this problem reappeared, but now with a Plazi dataset: https://www.dev.checklistbank.org/dataset/279769/classification?taxonKey=fc810172-fbbc-4801-afd9-4599eafb05cf

Screenshot 2024-01-15 at 12:15:54 p.m.

In the original source the name is not duplicated: https://www.dev.checklistbank.org/dataset/39672/classification?taxonKey=xV

Screenshot 2024-01-15 at 12:23:46 p.m.

On the other hand, the Catalogue of Tenebrionidae should have priority over any Plazi dataset.

mdoering commented 4 months ago

This seems mostly solved. In a June release we got 2 Mycetochara genera, but with different authors.

Image

Having them both accepted in the same family is nonsense though. What should we do with such cases?

  1. Accept the first, ignore subsequent ones, and merge their children/species if the genera live in the same family?
  2. Accept the first and make subsequent ones provisionally accepted if they are in different families but under the same code?

Other ideas?

mdoering commented 4 months ago

I suggest ignoring the second genus but including its children, similar to how we merge higher taxa.

DianRHR commented 4 months ago

I will go with your last suggestion if they are in the same family. However, there are some harder examples, like the one I commented on in #100. My suggestion there is to avoid merging if it's the same genus (and author) but with different higher taxonomy; descendants could be merged if not already present.
In all cases, apply the priorities of the sectors.
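
To make the rule discussed above concrete, here is a minimal Python sketch. It is not how the backend does it: the `Genus` class, the `merge_same_family_genera` function, and the convention that a lower `priority` number means a higher sector priority are all assumptions made for illustration. Genera with the same name in the same family are collapsed into the one from the highest-priority sector and their species are folded in if not already present; genera with the same name in a different family are left alone.

```python
from dataclasses import dataclass, field


@dataclass
class Genus:
    name: str
    authorship: str
    family: str
    priority: int  # assumption: lower number = higher sector priority
    species: list = field(default_factory=list)


def merge_same_family_genera(genera):
    """Keep one genus per (name, family) and fold in the children of the rest."""
    kept = {}
    # Visit sources in sector priority order so the first one seen wins.
    for g in sorted(genera, key=lambda g: g.priority):
        key = (g.name, g.family)
        if key not in kept:
            kept[key] = Genus(g.name, g.authorship, g.family, g.priority, list(g.species))
            continue
        target = kept[key]
        # Ignore the duplicate genus itself but merge species not present yet.
        for sp in g.species:
            if sp not in target.species:
                target.species.append(sp)
    return list(kept.values())


if __name__ == "__main__":
    sources = [
        Genus("Mycetochara", "Guérin-Méneville, 1827", "Tenebrionidae", 2,
              ["Mycetochara maura", "Mycetochara flavipes"]),
        Genus("Mycetochara", "Berthold, 1827", "Tenebrionidae", 1,
              ["Mycetochara maura"]),
        Genus("Mycetochara", "Cameron, 1939", "Staphylinidae", 3),
    ]
    for g in merge_same_family_genera(sources):
        print(g.family, g.name, g.authorship, g.species)
```

The Staphylinidae genus stays separate here because the key includes the family, which matches the idea of not merging when the higher taxonomy differs.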

camiplata commented 4 months ago

I agree with Diana's suggestion

mdoering commented 3 months ago

See also https://github.com/CatalogueOfLife/backend/issues/1329

mdoering commented 3 months ago

Things become even more difficult in the case of Mycetochara. There are not only 2 accepted, merged genera in Tenebrionoidea with different authors, but also a synonym Mycetochara Cameron from Staphylinoidea of the base COL. A merge currently looks like this:

          Coleoptera [order]
            Staphylinoidea Latreille, 1802 [superfamily]
              Staphylinidae Latreille, 1802 [family]
                Aleocharinae Fleming, 1821 [subfamily]
                  Aleocharini Fleming, 1821 [tribe]
                    Aleocharina Fleming, 1821 [subtribe]
                      Rencoma Blackwelder, 1952 [genus]
                        =Mycetochara Cameron, 1939 [genus]
                        ?Mycetochara cretica Mařan, 1954 [species]
                        ?Mycetochara excelsa Reitter, 1884 [species]
                        ?Mycetochara ocularis Reitter, 1884 [species]
                        ?Mycetochara thoracica (Gredler, 1854) [species]
                          =Mycetochares thoracica Gredler, 1854 [species]
                        Rencoma basiventris (Cameron, 1939) [species]
            Tenebrionoidea [superfamily]
              Tenebrionidae Latreille, 1802 [family]
                Mycetochara Berthold, 1827 [genus]
                  Mycetochara analis (LeConte, 1878) [species]
                  Mycetochara bicolor (Couper, 1865) [species]
                  Mycetochara binotata (Say, 1824) [species]
                  Mycetochara flavipes (Fabricius, 1792) [species]
                  Mycetochara foveata (LeConte,1866 [species]
                  Mycetochara fraterna (Say, 1824) [species]
                  Mycetochara maura (Fabricius, 1792) [species]
                    =Cistela linearis Illiger, 1794 [species]
                    =Cistela maura Fabricius, 1792 [species]
                    =Mycetochara hirsuta Pic, 1925 [species]
                    =Mycetochara linearis (Illiger, 1794) [species]
                  Mycetochara procera Casey, 1891 [species]
                Mycetochara Guérin-Méneville, 1827 [genus]

Because the synonym is the only instance of Mycetochara in the base COL, all IUCN species are placed as provisional species under the accepted genus Rencoma. This happens because the IUCN genus Mycetochara has no authorship and matches the synonym.

Other sources have a qualified genus name with a different authorship and thus create a new accepted genus under Tenebrionoidea.
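
The two outcomes described above can be reproduced with a toy matcher. This is only a sketch under simplifying assumptions: `Usage`, `match_genus`, and `placement` are hypothetical names, authorship is compared as an exact string, and a name without authorship is assumed to match any usage with the same name. The real matching is certainly more involved.

```python
from typing import NamedTuple, Optional


class Usage(NamedTuple):
    name: str
    authorship: Optional[str]
    status: str                          # "accepted" or "synonym"
    accepted_name: Optional[str] = None  # only set for synonyms


def match_genus(name, authorship, base):
    """Return the first matching usage in the base, or None if nothing matches."""
    for u in base:
        if u.name != name:
            continue
        # Assumption: a missing authorship on either side matches anything.
        if authorship is None or u.authorship is None or u.authorship == authorship:
            return u
    return None


def placement(name, authorship, base):
    """Describe where species of the incoming genus would end up."""
    hit = match_genus(name, authorship, base)
    if hit is None:
        return f"new accepted genus {name} {authorship}"
    if hit.status == "synonym":
        return f"provisional under {hit.accepted_name}"
    return f"under {hit.name}"


if __name__ == "__main__":
    base_col = [
        Usage("Rencoma", "Blackwelder, 1952", "accepted"),
        Usage("Mycetochara", "Cameron, 1939", "synonym", "Rencoma"),
    ]
    # IUCN-style genus without authorship: hits the synonym.
    print(placement("Mycetochara", None, base_col))
    # Qualified genus with a different author: no match, new genus.
    print(placement("Mycetochara", "Berthold, 1827", base_col))
```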

Is there anything else we should do, for example to place all Mycetochara species under one genus?

mdoering commented 3 months ago

The GBIF Backbone has the same "problem". 3 Mycetochara genera, one of which is a synonym.

Rencoma has just 4 species, none of which are Mycetochara species.

The genus Mycetochara Guérin-Méneville has just 1 species.

Mycetochara Berthold does have most of the species including e.g. M. cretica from IUCN.

mdoering commented 3 months ago

This case is nicely treated in IRMNG:

Mycetochara Berthold, 1827 accepted as Mycetochara Guérin-Méneville in Bory de Saint-Vincent, 1827 [Family: Tenebrionidae]
Mycetochara Cameron, 1939 accepted as Rencoma Blackwelder, 1952 [Family: Staphylinidae]
Mycetochara Guérin-Méneville in Bory de Saint-Vincent, 1827 [Family: Tenebrionidae]

mdoering commented 3 months ago

Until we actually encounter the accepted genus Mycetochara Berthold, the IUCN names are properly placed under the synonym's accepted name Rencoma, which lives in the same order but in a different superfamily.

I don't think we can change the merge logic in that case. But we could add a post-merge routine, similar to homotypic grouping, that looks for such genus synonym cases and moves provisional species to the accepted genus if the two are not too far apart, e.g. still in the same order.
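
A rough sketch of what such a post-merge pass could look like, in Python and on a toy tree. Everything here is hypothetical (`Taxon`, `ancestor_at`, `relocate_provisional_species`, the `max_rank` parameter), genus names are stored without authorship, and when several accepted genera share the name the first one found is used. It only illustrates the idea of moving a provisional species to an accepted genus of the same name when both share an ancestor at the chosen rank.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(eq=False)  # compare taxa by identity, the tree has parent links
class Taxon:
    name: str
    rank: str
    status: str = "accepted"  # "accepted" or "provisional"
    parent: Optional["Taxon"] = None
    children: list = field(default_factory=list)

    def add(self, child):
        child.parent = self
        self.children.append(child)
        return child


def ancestor_at(taxon, rank):
    """Return the taxon itself or its closest ancestor with the given rank."""
    node = taxon
    while node is not None and node.rank != rank:
        node = node.parent
    return node


def walk(taxon):
    yield taxon
    for child in list(taxon.children):
        yield from walk(child)


def relocate_provisional_species(root, max_rank="order"):
    """Move provisional species to an accepted genus with the same name."""
    nodes = list(walk(root))  # snapshot before the tree is modified
    genera = [t for t in nodes if t.rank == "genus" and t.status == "accepted"]
    moved = []
    for sp in nodes:
        if sp.rank != "species" or sp.status != "provisional" or sp.parent is None:
            continue
        genus_name = sp.name.split()[0]
        if sp.parent.name == genus_name:
            continue  # already sits under a genus with its own name
        scope = ancestor_at(sp, max_rank)
        for g in genera:
            # Only move when both live under the same taxon at max_rank.
            if g.name == genus_name and scope is not None and ancestor_at(g, max_rank) is scope:
                sp.parent.children.remove(sp)
                g.add(sp)
                moved.append(sp.name)
                break
    return moved


if __name__ == "__main__":
    order = Taxon("Coleoptera", "order")
    rencoma = order.add(Taxon("Staphylinidae", "family")).add(Taxon("Rencoma", "genus"))
    rencoma.add(Taxon("Mycetochara cretica", "species", "provisional"))
    rencoma.add(Taxon("Rencoma basiventris", "species"))
    order.add(Taxon("Tenebrionidae", "family")).add(Taxon("Mycetochara", "genus"))
    print(relocate_provisional_species(order))
```

Passing `max_rank="family"` would give the stricter variant in which species are only moved within the same family.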

DianRHR commented 3 months ago

I agree with your proposal, Markus. Restricting it further to cases where they share the same family would probably help in many cases.

The results of this task increased in comparison to the previous xrelease.

Some examples come from The Leipzig Catalogue of Vascular Plants, whose dataset doesn't include genera: image

image

In these examples, most of the species are already included in the baseCOL. And the genus should be merging more than once if they are not in the baseCOL.

A similar behavior was reported in #109

mdoering commented 3 months ago

Please verify this in the next build. It should not happen much any longer unless they are far apart. We could actually increase the comparison to order level to catch more of these cases.

camiplata commented 3 weeks ago

The genus behavior has been fixed.

For the Rencoma-Mycetochara problem, I will try giving priority to the source that provides the Mycetochara Berthold, 1827 genus over IUCN, so the genus is created beforehand and we can get a better placement.

mdoering commented 3 weeks ago

Seems fixed?

image image image