Open yroskov opened 3 years ago
TASKS 2020-12-10
2020-12-11 Tasks are not completed because of clearinghouse malfunction: see reports https://github.com/CatalogueOfLife/backend/issues/933 and https://github.com/CatalogueOfLife/checklistbank/issues/778.
State for today:
Unfinished tasks:
SYN-SYN species (diff acc, same auth)
SYN-SYN infra (diff acc, same auth)
Manuscript Names (!must be done: filter Accepted names + Without Decision).
ISSUES
Rank Invalid (46), all resolved via complex decisions:
nothossp. = subspecies
agamossp. = subspecies
nvar. = variety
b. = variety
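The four equivalences above can be written down as a lookup table. This is a minimal sketch only: the names `RANK_MARKER_TO_RANK` and `normalize_rank` are hypothetical, and it does not reflect how decisions are actually stored in ChecklistBank.

```python
from typing import Optional

# Hypothetical lookup built only from the four equivalences listed in this issue.
RANK_MARKER_TO_RANK = {
    "nothossp.": "subspecies",
    "agamossp.": "subspecies",
    "nvar.": "variety",
    "b.": "variety",
}


def normalize_rank(marker: str) -> Optional[str]:
    """Return the rank for a marker from the table, or None if it is not listed."""
    return RANK_MARKER_TO_RANK.get(marker.strip().lower())


if __name__ == "__main__":
    for marker in ("nothossp.", "nvar.", "f."):
        print(marker, "->", normalize_rank(marker))
```

Markers that are not in the table return None, so they can be left for manual review rather than guessed.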
Rematch of broken decisions failed: https://data.catalogueoflife.org/catalogue/3/decision?broken=true&limit=100&offset=0&subjectDatasetKey=1141
ISSUES
243 names with second epithet "tepanek" are fake. 1) Clearinghouse misinterprets binomials with authorstrings containing name Å tepanek as trinomial. 2) WP crawler misinterprets author, which appeared in the source file as Å tepánek (supposed to be Štěpánek), as Å tepanek. (@gdower)
see: https://github.com/CatalogueOfLife/testing/issues/5 https://github.com/CatalogueOfLife/backend/issues/936
Other misinterpreted author names (@gdower):
Bupleurum × sitenskyi ourkova Å = Bupleurum × sitenskyi Šourková
Knautia serpentinicola tech Smejkal ex Kolar, Z. Kaplan, J. Suda & Å = Knautia serpentinicola Smejkal ex Kolář, Z. Kaplan, J. Suda et Štech
Papaver victoris kornik Å = Papaver victoris Škornik & Wraber
2020-12-17: re-crawled (mis-encoded characters resolved by the crawler) & re-imported
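One common way to get an "Å" followed by a gap where "Š" was expected is to read UTF-8 bytes as Latin-1. The sketch below reproduces that pattern. It is an assumption about the cause, not something confirmed in this thread, and the function names are hypothetical. Some strings above keep other accented letters intact, so the real damage may be mixed.

```python
def garble(text: str) -> str:
    """Encode as UTF-8, then wrongly decode the bytes as Latin-1."""
    return text.encode("utf-8").decode("latin-1")


def repair(text: str) -> str:
    """Reverse the mistake. Only valid if the whole string was garbled this way."""
    return text.encode("latin-1").decode("utf-8")


if __name__ == "__main__":
    garbled = garble("Širj.")
    # "Š" (U+0160) becomes "Å" (U+00C5) plus a no-break space (U+00A0).
    print([hex(ord(ch)) for ch in garbled[:2]])
    print(repair(garbled))
```

The no-break space is what lets a name parser split the author in two, leaving the rest of the surname looking like an extra epithet.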
TASKS
Resolved:
Remaining authors with mis-coded characters (as a result, empty authorstrings):
many ?irj. = Širj.
Kra?an: Chenopodium strictum ssp. striatum (Kra?an) Aellen & Iljin
It's not a blocker for now. Will be fully resolved in Michael's master file and then in CoL.
Subspecies Assigned
1) All names with Štepánek are resolved now. No names with second epithet "tepanek" in the list.
2) 4 acc names resolved via complex decisions.
3) False trinomial Taraxacum triangulare fl H. Lindb. corrected. In the source: Taraxacum triangulare H. Lindb.fl.
Synced 2020-12-17
World Plants of 2021-01-04:
As appears on original website:
It would be good to have the Rumicastrum comment from the website also available as a taxon comment in coldp/clb/col
World Plants of 2021-01-07
TASKS of 2021-01-08:
Resolved:
See also: https://github.com/CatalogueOfLife/backend/issues/942
World Plants of 2021-01-07 synced on 2021-01-08
New version of file 481.csv integrated in WP of 2021-01-07.
2021-01-12: WP re-imported & synced.
World Plants of 2021-03-02 received and imported (id 1141). Third update since AC19. Metadata: ver. & date changed to 2021-03-02.
TASKS 2021-03-03
Broken decisions: 3706.
"Rematch all decisions" = Request failed with status code 503. Repeated twice. Ticket: https://github.com/CatalogueOfLife/backend/issues/968
3706 -> 1658 remain.
Deleted all broken decisions (1658).
Resolved:
ISSUES 2021-03-03
@mdoering Why are so many sectors broken if the classification in the source dataset was not changed? Why did the option "Rematch all sectors" increase the number of broken sectors?
Geoff "Rematched all sectors" again. Results: 0 - 38 of 38. Rematch succeeded. Sectors: Broken: 38, Updated: 0, Unchanged: 140, Total: 178
What happened to the system? https://github.com/CatalogueOfLife/backend/issues/969
Geoff fixed the problem with his tools.
World Plants of 2021-03-02 synced 2021-03-03
World Plants of 2021-03-17 received and imported (id 1141). Fourth update since AC19. Metadata: ver. & date changed to 2021-03-17.
ISSUES
TASKS 2021-03-18
Resolved:
World Plants of 2021-03-17 synced 2021-03-18.
World Plants of 2021-05-10 imported 2021-05-12.
TASKS
Resolved 2021-05-19:
World Plants of 2021-05-10 synced 2021-05-19
Data re-imported after replacement of 375a file 2021-05-25
YR: A new full export is requested.
TASKS
Resolved 2021-05-25:
World Plants of 2021-05-25 imported on https://data.catalogueoflife.org/dataset/1141/classification
(Attention: it includes the fern file, but this should not affect the final data view)
ISSUES assessed 2021-05-26
TASKS
Resolved 2021-05-26:
World Plants of 2021-05-25 synced 2021-05-26
World Plants of 2021-08-06. First update after ac21.
Imported: 194,297 spp (vs 193,377 in ac21)
Sector: family Surianaceae (& Gesneriaceae ?) broken
TASKS
Resolved:
Synced 2021-08-23
World Plants of 2021-10-08. Second update after ac21.
[x] Imported: 194,534 spp
[x] Metadata: updated version & date
[x] Sector: family Surianaceae broken = sector re-established
TASKS
Resolved 2021-10-22:
Synced 2021-10-22
World Plants of 2021-11-27 imported 2021-11-30. Third update after ac21.
Strange situation: family Eupteleaceae is present in assembly tree (marked as an independent sector there (!)).
Steps:
TASKS
[x] Manuscript names without decisions: 24 new unresolved acc names = resolved.
Piper politifolium Callejas; Fl. Mesoamer. 2(2): 353 (2020), nom. nov. illeg., non C. DC. 1910 - absent in IPNI
Resolved 2021-11-30:
Synced 2021-12-01
See https://github.com/CatalogueOfLife/checklistbank/issues/973#issuecomment-985005903
The name Ampelocera hondurensis appears on the WP website as an accepted name in Cannabaceae and as a synonym in Achatocarpaceae. Is this correct?
2021-12-02: TASK reports ("print to PDF") on
ACC-SYN species (different accepted, same authors)
SYN-SYN species (different accepted, same authors)
ACC-SYN infraspecific taxa (different accepted, same authors)
SYN-SYN infraspecific taxa (different accepted, same authors)
have been sent to the author for assessment and possible fixes.
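For readers unfamiliar with these reports, the idea behind "different accepted, same authors" can be sketched as a grouping step. This is an illustration only, not the ChecklistBank implementation. The records, the placeholder names ("Aus bus" etc.) and the function `find_conflicts` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical records: (scientific name, authorship, status, accepted name).
records = [
    ("Aus bus", "L.", "synonym", "Aus cus"),
    ("Aus bus", "L.", "synonym", "Aus dus"),
    ("Aus eus", "Mill.", "accepted", "Aus eus"),
    ("Aus eus", "Mill.", "synonym", "Aus fus"),
    ("Aus gus", "L.", "synonym", "Aus cus"),
]


def find_conflicts(rows):
    """Group by (name, authorship); report groups that point at more than one accepted name."""
    groups = defaultdict(list)
    for name, author, status, accepted in rows:
        groups[(name, author)].append((status, accepted))

    report = {}
    for key, usages in groups.items():
        accepted_names = {accepted for _, accepted in usages}
        if len(accepted_names) > 1:
            statuses = sorted({status for status, _ in usages})
            # All usages are synonyms -> SYN-SYN; otherwise an accepted usage is involved.
            report[key] = "SYN-SYN" if statuses == ["synonym"] else "ACC-SYN"
    return report


if __name__ == "__main__":
    for (name, author), kind in find_conflicts(records).items():
        print(kind, name, author)
```

The same grouping applies to species and infraspecific taxa; only the input rows differ.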
World Plants of 2021-12-06 is a replacement for 2021-11-27 imported 2021-12-09. Third update after ac21.
TASKS
Resolved:
Synced 2021-12-10
Checks of World Plants 2021-12-06 on PREVIEW 2021-12-14
Example:
Expected data (as on PROD):
World Plants of 2022-01-08 imported 2022-01-11. 4th update after ac21. (after duplicate fixes)
TASKS
Resolved 2022-01-11
Synced 2022-01-11
World Plants 12.10 of 2022-02-18 imported 2022-02-21; re-imported 2022-02-24. 5th update after ac21.
[x] Imported 194,780 spp (vs 193,341 in 2022-01-08)
[x] Metadata: version 12.10, Feb 2022 & date 2022-02-18 updated.
[x] Sectors: broken 33 families: Achariaceae, Balanopaceae, Bonnetiaceae, Calophyllaceae, Chrysobalanaceae, Clusiaceae, Ctenolophonaceae, Dichapetalaceae, Elatinaceae, Erythroxylaceae, Euphroniaceae, Goupiaceae, Humiriaceae, Hypericaceae, Irvingiaceae, Ixonanthaceae, Lacistemataceae, Linaceae, Lophopyxidaceae, Malesherbiaceae, Malpighiaceae, Medusagynaceae, Ochnaceae, Passifloraceae, Podostemaceae, Polygalaceae, Quiinaceae, Rafflesiaceae, Rhizophoraceae, Salicaceae, Trigoniaceae, Turneraceae, Violaceae
Rematch all sectors = broken 37 as result: 35 families Achariaceae, Argophyllaceae, Balanopaceae, Bonnetiaceae, Calophyllaceae, Chrysobalanaceae, Clusiaceae, Ctenolophonaceae, Dichapetalaceae, Elatinaceae, Erythroxylaceae, Euphroniaceae, Goupiaceae, Humiriaceae, Hypericaceae, Irvingiaceae, Ixonanthaceae, Lacistemataceae, Linaceae, Lophopyxidaceae, Malesherbiaceae, Malpighiaceae, Medusagynaceae, Menyanthaceae, Ochnaceae, Passifloraceae, Phellinaceae, Podostemaceae, Quiinaceae, Rafflesiaceae, Rhizophoraceae, Salicaceae, Trigoniaceae, Turneraceae, Violaceae, and 2 genera Astragalus & Chesniella
The source file investigated: the line for order Malpighiales is missing in the source files. As a result, all children of Malpighiales dropped into Oxalidaceae.
2022-02-23: request for a new export sent to Michael.
2022-02-23: file 216_new.csv received.
However, why are Argophyllaceae (Asterales) and Astragalus & Chesniella (Fabaceae) broken? = Re-matched manually.
2022-02-24 after re-import with repaired file: 3 broken sectors: genus Oxytropis, families Phellinaceae & Menyanthaceae = FIXED
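The two family lists above (before and after "Rematch all sectors") are long enough that comparing them by eye is error-prone. A small sketch with set differences makes the change explicit. The variable names are hypothetical; the family names are copied from the lists in this issue.

```python
# Broken families before "Rematch all sectors" (33, as listed above).
before = set("""
Achariaceae Balanopaceae Bonnetiaceae Calophyllaceae Chrysobalanaceae Clusiaceae
Ctenolophonaceae Dichapetalaceae Elatinaceae Erythroxylaceae Euphroniaceae
Goupiaceae Humiriaceae Hypericaceae Irvingiaceae Ixonanthaceae Lacistemataceae
Linaceae Lophopyxidaceae Malesherbiaceae Malpighiaceae Medusagynaceae Ochnaceae
Passifloraceae Podostemaceae Polygalaceae Quiinaceae Rafflesiaceae Rhizophoraceae
Salicaceae Trigoniaceae Turneraceae Violaceae
""".split())

# Broken families after "Rematch all sectors" (35, as listed above).
after = set("""
Achariaceae Argophyllaceae Balanopaceae Bonnetiaceae Calophyllaceae
Chrysobalanaceae Clusiaceae Ctenolophonaceae Dichapetalaceae Elatinaceae
Erythroxylaceae Euphroniaceae Goupiaceae Humiriaceae Hypericaceae Irvingiaceae
Ixonanthaceae Lacistemataceae Linaceae Lophopyxidaceae Malesherbiaceae
Malpighiaceae Medusagynaceae Menyanthaceae Ochnaceae Passifloraceae Phellinaceae
Podostemaceae Quiinaceae Rafflesiaceae Rhizophoraceae Salicaceae Trigoniaceae
Turneraceae Violaceae
""".split())

newly_broken = sorted(after - before)       # broken only after the rematch
no_longer_broken = sorted(before - after)   # broken only before the rematch

if __name__ == "__main__":
    print("newly broken:", newly_broken)
    print("no longer broken:", no_longer_broken)
```

On these lists the rematch adds Argophyllaceae, Menyanthaceae and Phellinaceae and drops Polygalaceae; the other 32 families are broken in both.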
TASKS
Resolved 2022-02-24:
Synced 2022-02-24
World Plants 12.10 of 2022-02-18 - continue after fixes https://github.com/CatalogueOfLife/testing/issues/184#issuecomment-1055567444
TASKS 2022-03-03
Resolved 2022-03-03:
Synced 2022-03-03
2022-04-08: Overlap between WCVP (Araliaceae) & World Plants (Apiaceae) in two genera, Hydrocotyle and Trachymene, after update of WCVP. = subfamily Hydrocotyloideae with these 2 genera blocked in World Plants; synced.
World Ferns, 13.0, May 2022 of 2022-05-29 imported 2022-06-02
Families Casuarinaceae, Juglandaceae, Myricaceae are missing in the export of 2022-05-29. Our system will keep data from your previous export for these families.
ISSUES assessed 2022-06-03
TASKS
Resolved 2022-06-03
Synced 2022-06-03
World Plants, ver. 13.2 of 2022-07-02 received on 2022-07-05; imported 2022-07-06
Mix of Malpighiales and Oxalidales?
See also family Surianaceae in Fabales & Hydrostachyaceae in Cornales
New version of 216.csv file received on 2022-07-07; imported 2022-07-08
TASKS 2022-07-11
Resolved 2022-07-11
Synced 2022-07-11
World Plants, ver 14.3, Nov 2022, received 2022-11-08 (replacement for 2022-10-15); imported 2022-11-10
ISSUES
TASKS
World Plants, ver 14.3, Nov 2022; 3 files updated; re-imported 2022-11-15
ISSUES assessed 2022-11-16
TASKS
Resolved 2022-11-16:
Synced 2022-11-16
World Plants, ver 14.7, Jan 2023; received 2023-01-12; imported 2023-01-13; re-imported 2023-01-24; re-imported 2023-02-07
ISSUES assessed 2023-01-19
TASKS
Resolved 2023-01-19:
Synced 2023-01-19
[x] 376_new received 2023-01-21; updated WP imported 2023-01-24; re-synced 2023-02-03
[x] 376_new2 received 2023-01-24; re-imported 2023-02-07; nested Caryophyllales re-synced 2023-02-07
[x] MH: pls change authorstring Ranunculus crenatus Descr. -> Ranunculus crenatus Waldst. & Kit. = FIXED 2023-02-06; Ranunculales re-synced 2023-02-07
World Plants, ver 15, Mar 2023; received 2023-03-06; imported 2023-03-07; new version received 2023-03-07; imported 2023-03-08
MH:
The only (minor) issue is that in this way I have undoubtedly created quite a lot of duplicates in the synonyms. I would be grateful if you could run your “duplicate checker” for me.
As I understand it, you are going to clean up duplicated names. If I complete my routine “resolution” for duplicates now (i.e. give a flag “ambiguous synonym”), it may compromise name statuses in the next update. I can see two possible solutions: we can wait for another update from you (late March/early April?), or we update the data now without resolution for duplicates. Which would you prefer?
MH: updating data now without resolution for duplicates
TASKS
As agreed with Michael, no resolution for duplicates in this version.
Synced 2023-03-09
@dhobern, for attention of Taxonomy Group: Michael Hassler offered to replace Annonaceae (AnnonBase) and Droseraceae (Droseraceae Database) checklists with data from World Plants:
I think for both families my data are up to date. But 10 years old data are not really good any more, unfortunately. In both families there is lots of taxonomic activity.
Thanks, @yroskov - I'll add these to the next agenda.
@yroskov maybe meanwhile one can change the AnnonBase metadata to point to http://www.annonaceae.org/ The current link does not work any longer.
Changed
See also: https://github.com/CatalogueOfLife/testing/issues/23 https://github.com/CatalogueOfLife/testing/issues/26 https://github.com/CatalogueOfLife/testing/issues/24 https://github.com/CatalogueOfLife/testing/issues/19 and https://github.com/CatalogueOfLife/testing/issues/5 https://github.com/CatalogueOfLife/testing/issues/95 https://github.com/CatalogueOfLife/testing/issues/96
Metadata patched: