Open DianRHR opened 7 months ago
Mycotretus discipennis subsp. conductus Kuhnt, 1910 for example exists twice: https://www.dev.checklistbank.org/dataset/271349/taxon/~4Gik https://www.dev.checklistbank.org/dataset/271349/taxon/~4Glb
And occurs twice in the Plazi source, once accepted, once as a synonym: https://www.dev.checklistbank.org/dataset/56536/duplicates?limit=50&rank=subspecies
I believe all the "did not merge" problems are due to missing name matches! There was a very hard-to-spot bug in the rematch function that prevented most matches from persisting.
Looking at the article, I found that Mycotretus discipennis conductus Kuhnt, 1910 [as a variety] is a synonym of Mycotretus deyrollei Crotch, 1876 AND a synonym of Mycotretus discipennis conductus Kuhnt, 1910 (subspecies). So the original data contains the same trinomial at two different ranks, and the "variety" also points to two different accepted names. These kinds of errors are out of our hands.
Besides, other species were merged more than once, depending on the number of subspecies related: https://www.dev.checklistbank.org/dataset/271349/classification?taxonKey=3349c2de-e025-4ab9-89a8-194d32b30a32
same as reported in #87
In XCOL-2023-11-20 and XCOL-2023-11-29 some genera were merged directly below Biota, even though the original source includes higher taxonomy and the family itself was merged correctly (as were other genera from the same family). Example: https://www.dev.checklistbank.org/dataset/274825/classification?taxonKey=~OZO
DynTaxa (original source) includes the genus with complete higher taxonomy: https://www.dev.checklistbank.org/dataset/2041/taxon/urn%3Alsid%3Adyntaxa.se%3ATaxon%3A1015231
However, the family Pelonemataceae was merged properly in XCOL, and it comes from the same source: https://www.dev.checklistbank.org/dataset/274825/classification?taxonKey=~1eks
The cases of Peloploca and Pelonema are solved and properly merged.
The Mycotretus subspecies case cannot be confirmed until we merge the Plazi datasets.
However, some genera are being merged below the kingdom level because the original source either doesn't include higher taxonomy or its higher taxonomy differs from the base COL. Some of these cases are also generating duplicates:
Example: Genus Cribraria
was merged below Fungi (merged from Brazilian Flora), even though it was already in the base COL below Cribrariales | Cribrariaceae, and in both cases it has the same author. Besides, all the species merged below Cribraria (the copy placed at the kingdom level) are duplicated as well.
A possible solution for these cases could be to modify the merge code as follows: do not merge a genus as a new taxon if the same genus (with the same author) already exists under a different higher taxonomy; its descendants could still be merged if they are not already present.
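The rule proposed above could look roughly like the following. This is only a minimal sketch in Python, not the actual ChecklistBank merge code: the `Taxon` class, the `merge_genus` function, the status strings, and the placeholder authors and species names are all hypothetical and used purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Taxon:
    """Hypothetical, simplified genus record (not the ChecklistBank model)."""
    name: str
    author: str
    classification: tuple  # higher taxa, from the top rank downwards
    children: dict = field(default_factory=dict)  # species name -> authorship


def merge_genus(base: dict, incoming: Taxon) -> str:
    """Merge `incoming` into `base` (genus name -> Taxon) and report the outcome."""
    existing = base.get(incoming.name)
    if existing is None:
        # Genus is new to the base: add it with the source's own classification.
        base[incoming.name] = incoming
        return "added"
    if existing.author != incoming.author:
        # Possibly a homonym; how to handle it is out of scope for this sketch.
        return "skipped-different-author"
    # Same genus and author: never create a second genus, even if the source
    # places it under a different higher taxonomy. Only add missing descendants.
    for species, authorship in incoming.children.items():
        existing.children.setdefault(species, authorship)
    return "descendants-merged"


# Placeholder data modelled on the Cribraria case (authors and species are made up).
base = {
    "Cribraria": Taxon(
        "Cribraria", "Author A", ("Cribrariales", "Cribrariaceae"),
        {"Cribraria sp1": "Author B"},
    )
}
incoming = Taxon(
    "Cribraria", "Author A", ("Fungi",),
    {"Cribraria sp1": "Author B", "Cribraria sp2": "Author C"},
)
status = merge_genus(base, incoming)
```

With this rule the base placement under Cribrariales | Cribrariaceae is kept, no second genus appears below the kingdom, and only the species missing from the base is added.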
There are several subspecies that were not merged properly into the xCOL even though some other subspecies from the same genus and source were merged correctly:
Here is the example of genus Mycotretus:
Looking for the genus Mycotretus in XCOL-2023-10-26, I found this:
https://www.dev.checklistbank.org/dataset/271349/classification?taxonKey=5W6T

![image](https://github.com/CatalogueOfLife/xcol/assets/124217901/80610e58-fdf0-490d-87e5-6a146ea18181)
Besides, other species were merged more than once, depending on the number of subspecies related: https://www.dev.checklistbank.org/dataset/271349/classification?taxonKey=3349c2de-e025-4ab9-89a8-194d32b30a32
Or were merged more than once, once with author and once without author: https://www.dev.checklistbank.org/dataset/271349/classification?taxonKey=e4aabe0e-9a9d-49eb-a400-0492fc8c77bd

![image](https://github.com/CatalogueOfLife/xcol/assets/124217901/21d4882c-c96f-4963-a85b-fdda55028edc)
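To find more cases like the one above, where the same name appears once with and once without authorship, one could scan an export of the merged dataset. The following is a rough sketch that assumes the records are available as `(taxon_id, scientific_name, authorship)` tuples; the function name `find_author_duplicates`, the input shape, and the sample records are assumptions for illustration, not an existing ChecklistBank API.

```python
from collections import defaultdict


def find_author_duplicates(records):
    """Return names that occur both with and without authorship.

    `records` is an iterable of (taxon_id, scientific_name, authorship)
    tuples; an empty authorship string means "no author".
    The result maps the normalised name to the sorted taxon ids involved.
    """
    by_name = defaultdict(list)
    for taxon_id, name, authorship in records:
        # Normalise whitespace and case so trivially different spellings group together.
        key = " ".join(name.split()).lower()
        by_name[key].append((taxon_id, authorship.strip()))

    flagged = {}
    for key, entries in by_name.items():
        has_author = any(author for _, author in entries)
        lacks_author = any(not author for _, author in entries)
        if has_author and lacks_author:
            flagged[key] = sorted(taxon_id for taxon_id, _ in entries)
    return flagged


# Made-up sample records; ids, names and authors are placeholders.
sample = [
    ("t1", "Genus alpha", "Author A, 1900"),
    ("t2", "Genus  alpha", ""),
    ("t3", "Genus beta", "Author B, 1910"),
]
duplicates = find_author_duplicates(sample)
```

Names flagged this way are only candidates; whether the two records are really the same taxon still needs a check against the source.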