CatalogueOfLife / testing

Editorial tests and discussion to prepare for COL releases

CoL of March 2023 #220

Closed yroskov closed 1 month ago

yroskov commented 1 year ago

Other 7 of 11 GSDs for updates: https://docs.google.com/spreadsheets/d/1WRxfYQF8h2Xiu3gNUvQ3de-zRH-BSq3wYWIAbHHBrhY/edit?usp=sharing

sergbolshakov commented 1 year ago

Hello, everyone.

In the classification of Fungi (https://www.checklistbank.org/dataset/2073/), there are several taxa that do not exist. To a mycologist familiar with the modern classification of fungi, these look like unfortunate typos that have no place in such a reputable source of knowledge. I have not yet written directly to Paul Kirk; perhaps it would be better if you passed this on, so that it can be corrected as soon as possible.

When preparing Index Fungorum data for export to CoL, the export should include a check that every taxon name used in the classification, at any rank, is present among the existing names in the IF database.

According to the results of the check (I ran it in R on a snapshot of the full IF database, obtained via the API on 2023-01-22), these names do not exist:

And two more orthographic variants, for which there are accepted variants:

They probably arose when the authors of the new taxa filled in the corresponding fields of the classification. It is desirable to ensure validation of the data entered to prevent such cases.
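The check proposed above (every name used in the classification must exist among the names in the IF database) could look roughly like the sketch below. It is only an illustration, not the R code I actually ran: the function name `find_unknown_taxa`, the input shapes and the sample data are assumptions, and the misspelled order is invented for the example.

```python
from typing import Iterable


def find_unknown_taxa(
    classification: Iterable[tuple[str, str]],
    known_names: Iterable[str],
) -> list[tuple[str, str]]:
    """Return unique (rank, name) pairs whose name is absent from known_names."""
    known = {name.strip() for name in known_names}
    missing = []
    seen = set()
    for rank, name in classification:
        key = (rank, name.strip())
        if key[1] not in known and key not in seen:
            seen.add(key)
            missing.append(key)
    return missing


# Hypothetical sample: stands in for "all existing names" in the IF snapshot.
known_names = ["Agaricomycetes", "Helotiales", "Xylariales"]

# Hypothetical classification rows as (rank, name) pairs.
classification = [
    ("class", "Agaricomycetes"),
    ("order", "Helotiales"),
    ("order", "Xylarialles"),  # deliberate typo, not a real IF record
]

unknown = find_unknown_taxa(classification, known_names)
print(unknown)
```

Run at export time, a non-empty result would flag the classification values that need correcting before the data reach CoL.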

Also, in the current classification there are cases where the same taxon appears under different parent taxa:

   rank   name                    parent                
   <chr>  <chr>                   <chr>                 
 1 class  Agaricomycetes          Basidiomycota         
 2 class  Agaricomycetes          Agaricomycota         
 3 class  Kickxellomycetes        Mucoromycota          
 4 class  Kickxellomycetes        Kickxellomycota       
 5 order  Chaetothyriales         Eurotiomycetes        
 6 order  Chaetothyriales         Chaetothyriomycetes   
 7 order  Cladochytriales         Chytridiomycetes      
 8 order  Cladochytriales         Cladochytriomycetes   
 9 order  Helotiales              Leotiomycetes         
10 order  Helotiales              Dothideomycetes       
11 family Apiosporaceae           Incertae sedis        
12 family Apiosporaceae           Xylariales            
13 family Clypeosphaeriaceae      Amphisphaeriales      
14 family Clypeosphaeriaceae      Xylariales            
15 family Dissoconiaceae          Capnodiales           
16 family Dissoconiaceae          Mycosphaerellales     
17 family Kirschsteiniotheliaceae Pleosporales          
18 family Kirschsteiniotheliaceae Kirschsteiniotheliales
19 family Septochytriaceae        Cladochytriales       
20 family Septochytriaceae        Chytridiales
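A listing like the one above can be produced by grouping the classification by rank and name and keeping the taxa that have more than one distinct parent. The sketch below is an assumption about how such a check might be written in Python; the function name and the tuple layout are hypothetical, the first four rows are copied from the listing, and the Agaricales row is added only as a consistent case for contrast.

```python
from collections import defaultdict


def taxa_with_multiple_parents(rows):
    """Map (rank, name) to its sorted parents when more than one parent occurs."""
    parents = defaultdict(set)
    for rank, name, parent in rows:
        parents[(rank, name)].add(parent)
    return {
        taxon: sorted(found)
        for taxon, found in parents.items()
        if len(found) > 1
    }


rows = [
    ("class", "Agaricomycetes", "Basidiomycota"),
    ("class", "Agaricomycetes", "Agaricomycota"),
    ("order", "Helotiales", "Leotiomycetes"),
    ("order", "Helotiales", "Dothideomycetes"),
    ("order", "Agaricales", "Agaricomycetes"),  # illustrative row with one parent
]

conflicts = taxa_with_multiple_parents(rows)
for (rank, name), found in conflicts.items():
    print(rank, name, found)
```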

According to the current accepted classification of Fungi*, the parental taxa should be:

All the infraspecific tautonyms of accepted names are also still missing (see the example below). To solve this problem, it is obviously necessary to assign the same acceptedNameID to all taxa with the same basionymID. However, in some cases infraspecific tautonyms have different basionymIDs for some reason.

sqlite> SELECT
   ...>   name_of_fungus
   ...>   , infraspecific_epithet
   ...>   , basionym_record_number
   ...>   , current_name_record_number
   ...> FROM if_raw
   ...> WHERE (name_of_fungus != 'UNPUBLISHED NAME')
   ...>   AND (accessrights IS NULL)
   ...>   AND (name_of_fungus LIKE 'Agaricus bisporus%')
   ...> ORDER BY
   ...>   basionym_record_number
   ...> ;
name_of_fungus                          infraspecific_epithet  basionym_record_number  current_name_record_number
--------------------------------------  ---------------------  ----------------------  --------------------------
Agaricus bisporus var. perrubescens     perrubescens           117700                  531546
Agaricus bisporus f. microspora         microspora             124277                  531546
Agaricus bisporus                                              267375                  531546
Agaricus bisporus                                              267375                  531546
Agaricus bisporus                                              267375                  531546
Agaricus bisporus f. bisporus           bisporus               267375
Agaricus bisporus var. bisporus         bisporus               267375
Agaricus bisporus var. albidus          albidus                348995                  531546
Agaricus bisporus var. avellaneus       avellaneus             348996                  531546
Agaricus bisporus f. conicopodus        conicopodus            348997                  531546
Agaricus bisporus f. depressus          depressus              348998                  531546
Agaricus bisporus f. langei             langei                 353243                  531546
Agaricus bisporus var. burnettii        burnettii              357921                  531546
Agaricus bisporus var. eurotetrasporus  eurotetrasporus        466019                  531546
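One possible way to fill the empty current_name_record_number values is to copy the value from other records that share the same basionym_record_number. The sketch below is a hedged illustration using Python's built-in sqlite3 module: the table is a reduced, in-memory stand-in for if_raw with four rows taken from the output above, the function name `fill_missing_current_names` is hypothetical, and the rule of skipping groups with more than one distinct current name is my assumption, not an existing IF or CoL procedure.

```python
import sqlite3


def fill_missing_current_names(conn: sqlite3.Connection) -> int:
    """Fill NULL current names from the basionym group; return rows updated."""
    cursor = conn.execute(
        """
        UPDATE if_raw
        SET current_name_record_number = (
            SELECT MIN(d.current_name_record_number)
            FROM if_raw AS d
            WHERE d.basionym_record_number = if_raw.basionym_record_number
        )
        WHERE current_name_record_number IS NULL
          AND (
            -- Assumed safety rule: only fill when the group is unambiguous.
            SELECT COUNT(DISTINCT d.current_name_record_number)
            FROM if_raw AS d
            WHERE d.basionym_record_number = if_raw.basionym_record_number
          ) = 1
        """
    )
    return cursor.rowcount


conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE if_raw (
        name_of_fungus TEXT,
        basionym_record_number INTEGER,
        current_name_record_number INTEGER
    )"""
)
conn.executemany(
    "INSERT INTO if_raw VALUES (?, ?, ?)",
    [
        ("Agaricus bisporus", 267375, 531546),
        ("Agaricus bisporus f. bisporus", 267375, None),
        ("Agaricus bisporus var. bisporus", 267375, None),
        ("Agaricus bisporus var. albidus", 348995, 531546),
    ],
)

filled = fill_missing_current_names(conn)
rows = conn.execute(
    "SELECT name_of_fungus, current_name_record_number FROM if_raw ORDER BY rowid"
).fetchall()
for row in rows:
    print(row)
```

This only helps where the basionym numbers agree; the cases with different basionymIDs mentioned above would still need to be fixed at the source.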

*Wijayawardene N. N., Hyde K. D., Dai D. Q., Sánchez-García M., Goto B. T., Saxena R. K., Erdoğdu M., Selçuk F., Rajeshkumar K. C., Aptroot A., Błaszkowski J., Boonyuen N., da Silva G. A., de Souza F. A., Dong W., Ertz D., Haelewaters D., Jones E. B. G., Karunarathna S. C., Kirk P. M., Kukwa M., Kumla J., Leontyev D. V., Lumbsch H. T., Maharachchikumbura S. S. N., Marguno F., Martínez-Rodríguez P., Mešić A., Monteiro J. S., Oehl F., Pawłowska J., Pem D., Pfliegler W. P., Phillips A. J. L., Pošta A., He M. Q., Li J. X., Raza M., Sruthi O. P., Suetrong S., Suwannarach N., Tedersoo L., Thiyagaraja V., Tibpromma S., Tkalčec Z., Tokarev Y. S., Wanasinghe D. N., Wijesundara D. S. A., Wimalaseana S. D. M. K., Madrid H., Zhang G. Q., Gao Y., Sánchez-Castro I., Tang L. Z., Stadler M., Yurkov A., Thines M. 2022. Outline of Fungi and fungus-like taxa – 2021 // Mycosphere 13(1): 53–453. https://doi.org/10.5943/mycosphere/13/1/2

sergbolshakov commented 1 year ago

Also, judging by the issues (https://www.checklistbank.org/dataset/2073/issues) found for this dataset, it seems that the filters for orthographic variants were not enabled during export. I filtered them by the flag sts_flag == "o" and got 36874 names. Enabling this filter will eliminate most issues, not only Duplicate Name.
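For illustration, the filter could be sketched like this in Python. Only the field name sts_flag and the value "o" come from the IF data; the function name, the dictionary layout and the placeholder records are assumptions.

```python
def split_orthographic_variants(records):
    """Split records into (kept, variants) using sts_flag == "o"."""
    kept, variants = [], []
    for record in records:
        if record.get("sts_flag") == "o":
            variants.append(record)
        else:
            kept.append(record)
    return kept, variants


# Placeholder records, not real IF names.
records = [
    {"name": "Genus epithet", "sts_flag": None},
    {"name": "Genus epitheta", "sts_flag": "o"},
    {"name": "Genus other", "sts_flag": "x"},
]

kept, variants = split_orthographic_variants(records)
print(len(kept), len(variants))
```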

redewalt commented 1 year ago

Sergey,

We are grateful that you are looking over our shoulders. It does not look like others are responding. Yuri is on vacation right now. I am sure he will get back to you soon.

Ed DeWalt Species File Group



yroskov commented 1 year ago

2023-03-08, TASKS in World Ferns (https://github.com/CatalogueOfLife/backend/issues/1182#issuecomment-1460798017): the report "ACC-SYN species (different accepted, different authors)", 70 of 369, fails to open.

However, in World Plants, the report "ACC-SYN species (different accepted, different authors)", 5311 of 6242, opens without problem.

yroskov commented 1 year ago

PREVIEW release started 2023-03-09, 5:39 pm (server time).
Finished as COL23.3, 2023-03-09, id 9882.
Deployed to the PREVIEW website 2023-03-09.

CHECKS 2023-03-09:

GLI & Tortricidae re-synced 2023-03-09, so the stats will be correct in the next preview.

yroskov commented 1 year ago

PREVIEW release started 2023-03-09, 8:43 pm (server time).
Finished as COL23.3, 2023-03-09, id 9883.
Deployed to the PREVIEW website 2023-03-10.

yroskov commented 1 year ago

@olafbanki & @mdoering, after today's checks, I can submit COL23.3 (2023-03-09, id 9883) for deployment as CoL of March.

yroskov commented 1 year ago

2023-03-21: deployed to the portal.