CatalogueOfLife / testing

Editorial tests and discussion to prepare for COL releases

World Plants (id 1141): test report #4

Open yroskov opened 3 years ago

yroskov commented 3 years ago

See also: https://github.com/CatalogueOfLife/testing/issues/23 https://github.com/CatalogueOfLife/testing/issues/26 https://github.com/CatalogueOfLife/testing/issues/24 https://github.com/CatalogueOfLife/testing/issues/19 and https://github.com/CatalogueOfLife/testing/issues/5 https://github.com/CatalogueOfLife/testing/issues/95 https://github.com/CatalogueOfLife/testing/issues/96

Metadata patched:

image

yroskov commented 3 years ago

TASKS 2020-12-10 image

2020-12-11: Tasks were not completed because of a clearinghouse malfunction; see reports https://github.com/CatalogueOfLife/backend/issues/933 and https://github.com/CatalogueOfLife/checklistbank/issues/778.

State for today: image

Unfinished tasks:
SYN-SYN species (diff acc, same auth)
SYN-SYN infra (diff acc, same auth)
Manuscript Names (! must be done: filter Accepted names + Without Decision)

yroskov commented 3 years ago

ISSUES

Rank Invalid (46), all resolved via complex decisions:
nothossp. = subspecies
agamossp. = subspecies
nvar. = variety
b. = variety
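
For reference, the four rank equivalences above can be written down as a small lookup. This is only a sketch: the names `RANK_DECISIONS` and `normalise_rank` are made up for illustration and are not part of the clearinghouse, where the fix was applied through editorial decisions.

```python
from typing import Optional

# Hypothetical lookup table holding the four equivalences listed in this comment.
RANK_DECISIONS = {
    "nothossp.": "subspecies",
    "agamossp.": "subspecies",
    "nvar.": "variety",
    "b.": "variety",
}


def normalise_rank(marker: str) -> Optional[str]:
    """Return the rank chosen for a verbatim rank marker, or None if it is not listed."""
    return RANK_DECISIONS.get(marker.strip().lower())


print(normalise_rank("nothossp."))
print(normalise_rank("f."))
```

Markers that are not in the table return `None`, so anything outside these four cases would still need a manual look.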

Also, https://github.com/CatalogueOfLife/backend/issues/931

yroskov commented 3 years ago

Rematch of broken decisions failed: https://data.catalogueoflife.org/catalogue/3/decision?broken=true&limit=100&offset=0&subjectDatasetKey=1141

yroskov commented 3 years ago

ISSUES

Subspecies Assigned https://data.catalogueoflife.org/catalogue/3/dataset/1141/workbench?facet=rank&facet=issue&facet=status&facet=nomStatus&facet=nameType&facet=field&issue=subspecies%20assigned&limit=1000&offset=0&reverse=false
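
To see which filters that workbench link applies, the query string can be unpacked with the Python standard library. This is a sketch for readers of the thread; it only parses the URL text and does not call the clearinghouse.

```python
from urllib.parse import parse_qs, urlsplit

# The workbench link from this comment, split over several lines for readability.
WORKBENCH_URL = (
    "https://data.catalogueoflife.org/catalogue/3/dataset/1141/workbench"
    "?facet=rank&facet=issue&facet=status&facet=nomStatus&facet=nameType"
    "&facet=field&issue=subspecies%20assigned&limit=1000&offset=0&reverse=false"
)

parts = urlsplit(WORKBENCH_URL)
params = parse_qs(parts.query)  # repeated keys such as "facet" become lists

print(parts.path)
print(params["issue"])   # the issue filter, with %20 decoded to a space
print(params["facet"])   # the facets requested in the link
print(params["limit"], params["offset"])
```

The link filters dataset 1141 on the issue "subspecies assigned" and asks for up to 1000 rows starting at offset 0.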

243 names with the second epithet "tepanek" are fake, for two reasons (@gdower):

  1. The clearinghouse misinterprets binomials whose authorstring contains the name Å tepanek as trinomials.
  2. The WP crawler reads the author, which appears in the source file as Å tepánek (it should be Štěpánek), as Å tepanek.

see: https://github.com/CatalogueOfLife/testing/issues/5 https://github.com/CatalogueOfLife/backend/issues/936

Other misinterpreted author names (@gdower):
Bupleurum × sitenskyi ourkova Å = Bupleurum × sitenskyi Šourková
Knautia serpentinicola tech Smejkal ex Kolar, Z. Kaplan, J. Suda & Å = Knautia serpentinicola Smejkal ex Kolář, Z. Kaplan, J. Suda et Štech
Papaver victoris kornik Å = Papaver victoris Škornik & Wraber
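
A possible explanation for the stray "Å", offered as an assumption rather than a confirmed diagnosis: the UTF-8 bytes of "Š" (C5 A0) read as Latin-1 become "Å" followed by a no-break space, and whitespace splitting then detaches the rest of the surname. The sketch below reproduces that effect with one name from this comment; `misdecode` and `repair` are illustrative helpers, not crawler or clearinghouse code.

```python
def misdecode(text: str) -> str:
    """Simulate reading UTF-8 bytes as Latin-1 (an assumed cause, not confirmed)."""
    return text.encode("utf-8").decode("latin-1")


def repair(text: str) -> str:
    """Undo the misdecoding; works only if no bytes were dropped or altered."""
    return text.encode("latin-1").decode("utf-8")


garbled = misdecode("Papaver victoris Škornik")
tokens = garbled.split()  # str.split() also splits on the no-break space

print(repr(garbled))
print(tokens)
print(repair(garbled))
```

The strings reported in this thread have also lost some diacritics (for example "tepanek" without accents), so a plain round trip like `repair` would not restore them; that is consistent with the fix having to happen in the crawler.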

yroskov commented 3 years ago

2020-12-17: re-crawled (mis-encoded characters resolved by the crawler) & re-imported

TASKS image

Resolved: image

yroskov commented 3 years ago

Remaining authors with mis-coded characters (resulting in empty authorstrings):
many ?irj. = Širj.
Kra?an: Chenopodium strictum ssp. striatum (Kra?an) Aellen & Iljin

It's not a blocker for now. It will be fully resolved in Michael's master file and then in CoL.

yroskov commented 3 years ago

Subspecies Assigned

  1. All names with Štepánek are resolved now. No names with the second epithet "tepanek" remain in the list.
  2. 4 accepted names resolved via complex decisions.
  3. False trinomial Taraxacum triangulare fl H. Lindb. corrected. In the source: Taraxacum triangulare H. Lindb.fl.

yroskov commented 3 years ago

Synced 2020-12-17

yroskov commented 3 years ago

World Plants of 2021-01-04:

As appears on original website: image

mdoering commented 3 years ago

It would be good to have the Rumicastrum comment from the website also available as a taxon comment in coldp/clb/col

yroskov commented 3 years ago

World Plants of 2021-01-07

TASKS of 2021-01-08: image

Resolved: image

See also: https://github.com/CatalogueOfLife/backend/issues/942

yroskov commented 3 years ago

World Plants of 2021-01-07 synced on 2021-01-08

yroskov commented 3 years ago

New version of file 481.csv integrated in WP of 2021-01-07.

2021-01-12: WP re-imported & synced.

yroskov commented 3 years ago

World Plants of 2021-03-02 received and imported (id 1141). Third update since AC19. Metadata: version and date changed to 2021-03-02.

yroskov commented 3 years ago

TASKS 2021-03-03 image

Broken decisions: 3706. "Rematch all decisions" = Request failed with status code 503 (repeated twice). Ticket: https://github.com/CatalogueOfLife/backend/issues/968

3706 -> 1658 remain. Deleted all broken decisions (1658).

Resolved: image

yroskov commented 3 years ago

ISSUES 2021-03-03

image

yroskov commented 3 years ago

@mdoering Why are so many sectors broken if the classification in the source dataset was not changed? Why did the option "Rematch all sectors" increase the number of broken sectors?

Geoff "Rematched all sectors" again. Results: 0 - 38 of 38. Rematch succeeded. Sectors: Broken: 38, Updated: 0, Unchanged: 140, Total: 178
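
As a quick sanity check on the rematch summary, the counts can be pulled out of the pasted text and compared with the total. A minimal sketch; `parse_summary` is a hypothetical helper, and the regular expression does not depend on separators between the fields.

```python
import re

# Counts as reported in the rematch summary in this comment.
SUMMARY = "Broken: 38, Updated: 0, Unchanged: 140, Total: 178"


def parse_summary(text: str) -> dict:
    """Extract 'Label: number' pairs, whether or not they are separated."""
    return {label: int(value) for label, value in re.findall(r"([A-Za-z]+):\s*(\d+)", text)}


counts = parse_summary(SUMMARY)
accounted_for = counts["Broken"] + counts["Updated"] + counts["Unchanged"]

print(counts)
print(accounted_for == counts["Total"])
```

The three categories add up to the reported total, so the 38 broken sectors are a subset of the 178 sectors rather than extra ones.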

What's happened to the system? https://github.com/CatalogueOfLife/backend/issues/969

Geoff fixed the problem with his tools.

yroskov commented 3 years ago

World Plants of 2021-03-02 synced 2021-03-03

yroskov commented 3 years ago

World Plants of 2021-03-17 received and imported (id 1141). Fourth update since AC19. Metadata: version and date changed to 2021-03-17.

ISSUES

yroskov commented 3 years ago

TASKS 2021-03-18

image

Resolved:

image

yroskov commented 3 years ago

World Plants of 2021-03-17 synced 2021-03-18.

yroskov commented 3 years ago

World Plants of 2021-05-10 imported 2021-05-12.

yroskov commented 3 years ago

TASKS

image

Resolved 2021-05-19:

image

yroskov commented 3 years ago

World Plants of 2021-05-10 synced 2021-05-19

yroskov commented 3 years ago

Data re-imported after replacement of 375a file 2021-05-25

YR: A new full export is requested.

yroskov commented 3 years ago

TASKS

image

Resolved 2021-05-25:

image

yroskov commented 3 years ago

World Plants of 2021-05-25 imported on https://data.catalogueoflife.org/dataset/1141/classification

(Attention: it includes the fern file, but that should not affect the final data view.)

ISSUES assessed 2021-05-26

TASKS

image

Resolved 2021-05-26:

image

World Plants of 2021-05-25 synced 2021-05-26

yroskov commented 3 years ago

World Plants of 2021-08-06. First update after ac21.

Imported: 194,297 spp (vs 193,377 in ac21)

Sector: family Surianaceae (& Gesneriaceae ?) broken

TASKS image

Resolved: image

Synced 2021-08-23

yroskov commented 3 years ago

World Plants of 2021-10-08. Second update after ac21.

TASKS image

Resolved 2021-10-22: image

Synced 2021-10-22

yroskov commented 3 years ago

World Plants of 2021-11-27 imported 2021-11-30. Third update after ac21.

Strange situation: family Eupteleaceae is present in assembly tree (marked as an independent sector there (!)). image

Steps:

  1. Deleted sector family Eupteleaceae
  2. Sector order Ranunculales synced
  3. Result: Eupteleaceae in NotAssigned order. image and in Ranunculales: image
  4. Deleted unwanted entries = FIXED

TASKS image

Resolved 2021-11-30: image

Synced 2021-12-01

yroskov commented 3 years ago

See https://github.com/CatalogueOfLife/checklistbank/issues/973#issuecomment-985005903

The name Ampelocera hondurensis appears on the WP website as an accepted name in Cannabaceae and as a synonym in Achatocarpaceae. Is that correct?

yroskov commented 2 years ago

2021-12-02: TASK reports ("print to PDF") on
ACC-SYN species (different accepted, same authors)
SYN-SYN species (different accepted, same authors)
ACC-SYN infraspecific taxa (different accepted, same authors)
SYN-SYN infraspecific taxa (different accepted, same authors)
have been sent to the author for assessment and possible fixes.
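
For readers unfamiliar with these task categories, the idea behind "different accepted, same authors" can be sketched as follows. This is not the clearinghouse implementation: the `Name` record, the field names, and the placeholder names ("Aus bus" etc.) are all assumptions made for illustration.

```python
from collections import defaultdict
from typing import NamedTuple


class Name(NamedTuple):
    scientific_name: str
    authorship: str
    status: str         # "accepted" or "synonym"
    accepted_name: str  # for an accepted name, the name itself


def conflicting_duplicates(names):
    """Group identical name + authorship pairs that point to different accepted names."""
    groups = defaultdict(list)
    for n in names:
        groups[(n.scientific_name, n.authorship)].append(n)

    report = {}
    for key, members in groups.items():
        if len({m.accepted_name for m in members}) < 2:
            continue  # a single accepted name: nothing to resolve
        statuses = {m.status for m in members}
        if statuses == {"synonym"}:
            report[key] = "SYN-SYN"
        elif statuses == {"accepted", "synonym"}:
            report[key] = "ACC-SYN"
        else:
            report[key] = "ACC-ACC"
    return report


# Placeholder records, not real World Plants data.
names = [
    Name("Aus bus", "L.", "synonym", "Aus cus"),
    Name("Aus bus", "L.", "synonym", "Aus dus"),
    Name("Aus eus", "L.", "accepted", "Aus eus"),
    Name("Aus eus", "L.", "synonym", "Aus fus"),
    Name("Aus gus", "L.", "synonym", "Aus cus"),
]
report = conflicting_duplicates(names)
print(report)
```

Splitting the result into species and infraspecific taxa, as in the four reports, would only need an extra rank field on each record.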

yroskov commented 2 years ago

World Plants of 2021-12-06 (a replacement for 2021-11-27) imported 2021-12-09. Third update after ac21.

TASKS image

Resolved: image

Synced 2021-12-10

yroskov commented 2 years ago

Checks of World Plants 2021-12-06 on PREVIEW 2021-12-14

Example: image

Expected data (as on PROD): image

yroskov commented 2 years ago

World Plants of 2022-01-08 imported 2022-01-11. 4th update after ac21. (after duplicate fixes)

TASKS image

Resolved 2022-01-11 image

Synced 2022-01-11

yroskov commented 2 years ago

World Plants 12.10 of 2022-02-18 imported 2022-02-21; re-imported 2022-02-24. 5th update after ac21.

Source file investigated: the line for order Malpighiales is missing in the source files. As a result, all children of Malpighiales ended up in Oxalidaceae.

2022-02-23: request for a new export sent to Michael.
2022-02-23: file 216_new.csv received. However, why are Argophyllaceae-Asterales and Astragalus & Chesniella-Fabaceae broken? = Re-matched manually.

2022-02-24 after re-import with repaired file: 3 broken sectors: genus Oxytropis, families Phellinaceae & Menyanthaceae = FIXED

TASKS image

Resolved 2022-02-24: image

Synced 2022-02-24

yroskov commented 2 years ago

World Plants 12.10 of 2022-02-18 - continue after fixes https://github.com/CatalogueOfLife/testing/issues/184#issuecomment-1055567444

TASKS 2022-03-03 image

Resolved 2022-03-03: image

Synced 2022-03-03

yroskov commented 2 years ago

2022-04-08: Overlap between WCVP (Araliaceae) & World Plants (Apiaceae) in two genera, Hydrocotyle and Trachymene, after the update of WCVP. = Subfamily Hydrocotyloideae with these 2 genera blocked in World Plants; synced.

yroskov commented 2 years ago

World Ferns, 13.0, May 2022 of 2022-05-29 imported 2022-06-02

ISSUES assessed 2022-06-03

TASKS image

Resolved 2022-06-03 image

Synced 2022-06-03

yroskov commented 2 years ago

World Plants, ver. 13.2 of 2022-07-02 received on 2022-07-05; imported 2022-07-06

Mix of Malpighiales and Oxalidales?

See also family Surianaceae in Fabales & Hydrostachyaceae in Cornales

New version of 216.csv file received on 2022-07-07; imported 2022-07-08

TASKS 2022-07-11 image

Resolved 2022-07-11 image

Synced 2022-07-11

yroskov commented 2 years ago

World Plants, ver 14.3, Nov 2022, received 2022-11-08 (replacement for 2022-10-15); imported 2022-11-10

ISSUES

TASKS image

World Plants, ver 14.3, Nov 2022; 3 files updated; re-imported 2022-11-15

ISSUES assessed 2022-11-16

TASKS

image

Resolved 2022-11-16:

image

Synced 2022-11-16

yroskov commented 1 year ago

World Plants, ver 14.7, Jan 2023; received 2023-01-12; imported 2023-01-13; re-imported 2023-01-24; re-imported 2023-02-07

image

ISSUES assessed 2023-01-19

image

TASKS

image

Resolved 2023-01-19:

image

Synced 2023-01-19

yroskov commented 1 year ago

World Plants, ver 15, Mar 2023; received 2023-03-06; imported 2023-03-07; new version received 2023-03-07; imported 2023-03-08

MH:

The only (minor) issue is that in this way I have undoubtedly created quite a lot of duplicates in the synonyms. I would be grateful if you could run your “duplicate checker” for me.

YR: As I understood, you are going to clean up duplicated names. If I complete my routine “resolution” for duplicates now (i.e. flag them as “ambiguous synonym”), it may compromise name statuses in the next update. I can see two possible solutions: we can wait for another update from you (late March/early April?), or we update the data now without resolution for duplicates. Which would you prefer?

MH: updating data now without resolution for duplicates

image

TASKS

image

As agreed with Michael, no resolution for duplicates in this version.

Synced 2023-03-09

yroskov commented 1 year ago

@dhobern, for attention of Taxonomy Group: Michael Hassler offered to replace Annonaceae (AnnonBase) and Droseraceae (Droseraceae Database) checklists with data from World Plants:

I think for both families my data are up to date. But 10 years old data are not really good any more, unfortunately. In both families there is lots of taxonomic activity.

dhobern commented 1 year ago

Thanks, @yroskov - I'll add these to the next agenda.

mdoering commented 1 year ago

@yroskov maybe meanwhile one can change the AnnonBase metadata to point to http://www.annonaceae.org/ The current link does not work any longer.

yroskov commented 1 year ago

Changed