CatalogueOfLife / testing

Editorial tests and discussion to prepare for COL releases

WLoC, World List of Cycads (id 1163): test report #267

Open yroskov opened 3 months ago

yroskov commented 3 months ago

The export was received from Michael Calonje on 2024-07-11 as the BRAHMS export file _World_List_of_Cycads_Export20240711190713.xlsx, plus _Genera_Export20240712110746.xlsx and _Families_Export20240712110714.xlsx with authors for genera and families.

Export files converted by @mdoering. Imported on DEV https://www.dev.checklistbank.org/dataset/1163/imports

yroskov commented 3 months ago

Imported on DEV 2024-07-24, iteration 1 https://www.dev.checklistbank.org/dataset/1163/imports

Metrics:

image

Note: number in square brackets in CalcFullName in the source file indicates homonyms (e.g. Ceratozamia fuscoviridis [1] & Ceratozamia fuscoviridis [2])
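A minimal sketch of how the homonym marker described in the note could be separated from the name. The function name and the regular expression are illustrative assumptions, not part of the actual conversion script.

```python
import re

# Trailing "[n]" marker, as in "Ceratozamia fuscoviridis [1]".
_HOMONYM_RE = re.compile(r"^(?P<name>.*?)\s*\[(?P<index>\d+)\]\s*$")


def split_homonym_marker(calc_full_name):
    """Return (name, homonym_index); the index is None when no marker is present.

    Hypothetical helper: assumes the marker is always a trailing integer
    in square brackets.
    """
    match = _HOMONYM_RE.match(calc_full_name)
    if match is None:
        return calc_full_name.strip(), None
    return match.group("name"), int(match.group("index"))
```

Called on "Ceratozamia fuscoviridis [2]", this would separate the name from the homonym index 2, so the two homonyms can be told apart without keeping the bracket in the name string.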

mdoering commented 3 months ago

I have just uploaded a new version which improves the handling of nomen dubium, invalid and illegitimate names. I will stop here and leave it as it is during my absence, so you can review it.

Now all accepted names are placed in the genus, including infraspecific names. Doubtful and invalid names are treated as bare names, unless they have an explicit synof relationship (which is rare).
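The placement rule above could be sketched roughly as follows. The status labels and the function are assumptions made for illustration; the real converter may use different values and more cases.

```python
def classify_usage(status, has_synof):
    """Return 'accepted', 'synonym' or 'bare name' for one source record.

    Hypothetical rule derived from the comment above:
    - accepted names stay accepted (and are placed in their genus),
    - anything else with an explicit synof link becomes a synonym,
    - doubtful or invalid names without such a link become bare names.
    """
    if status == "accepted":
        return "accepted"
    if has_synof:
        return "synonym"
    return "bare name"
```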

yroskov commented 3 months ago

Thanks! I'll continue with a new iteration.

It seems that proper tests and consultations with WLoC will take time. Have a nice vacation!

mdoering commented 3 months ago

I see infraspecific names lack a proper rank. I will fix that quickly and upload. 5 minutes...

mdoering commented 3 months ago

Searching for tridentata gives you a nice idea of bare names vs synonyms.

yroskov commented 3 months ago

Imported on DEV 2024-07-24, iteration 3

Metrics:

image

yroskov commented 3 months ago

Iteration 3 (imported on DEV 2024-07-24, 4:14 PM) summary of remaining issues

= After discussion with Michael: KEEP THESE 10 in COL AS THEY ARE NOW IN ITERATION 3 (i.e. as synonyms)

Example: Encephalartos caffer (Thunb.) Lehm. in DEV https://www.dev.checklistbank.org/dataset/1163/taxon/8784b87c-e197-4aea-bb62-8b55abae668d

Encephalartos caffer (Thunb.) Lehm. in WLoC http://cycadlist.org/taxon.php?Taxon_ID=280

Encephalartos cycadis Sweet INVALID in WLoC http://cycadlist.org/taxon.php?Taxon_ID=624

Here is a list of 10 names: image

Zamia nana [2]
Macrozamia pectinata
Macrozamia littoralis
Encephalartos villosus var. nobilis
Encephalartos douglasii
Macrozamia hopei [1]
Encephalartos pumilus
Zamia parasitica
Encephalartos latifolius
Encephalartos cycadis

yroskov commented 3 months ago

Iteration 3 (imported on DEV 2024-07-24, 4:14 PM) summary of remaining issues

image

yroskov commented 3 months ago

Iteration 3 (imported on DEV 2024-07-24, 4:14 PM) summary of remaining issues

ISSUES

image

yroskov commented 3 months ago

Iteration 3 (imported on DEV 2024-07-24, 4:14 PM) summary of remaining issues

ISSUES

In the Tree, they appear in the parent genus, i.e. outside existing species. For example: https://www.dev.checklistbank.org/dataset/1163/classification?taxonKey=210f828b-069f-456b-9afc-d82421a62ae4 image

mdoering commented 2 months ago

I have fixed the missing species issue for infraspecies, but there are still synonyms missing an accepted name. The original data is lacking some information, e.g. Cycas longipetiolula claims to have the accepted name Cycas bifida x c. multipinnata with id e781d518-6d82-4dfb-be67-893e1d026cff, neither of which exists in the spreadsheet. Not much I can do here.

The original website does have that info: http://cycadlist.org/taxon.php?Taxon_ID=151

Another example of a missing species referenced from synonyms is

5eafe3fa-bee5-4fa3-be3f-480cb7c08ff1 Phoenix dactylifera
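A check for such dangling references could look roughly like this. It is only a sketch: the column names `id` and `accepted_id`, the row IDs `row-1` and `row-2`, and the second example row are hypothetical; only the Cycas longipetiolula name and the accepted ID it points to come from this thread.

```python
def find_dangling_synonyms(rows):
    """Return rows whose accepted_id does not match the id of any row."""
    known_ids = {row["id"] for row in rows}
    return [
        row
        for row in rows
        if row.get("accepted_id") is not None
        and row["accepted_id"] not in known_ids
    ]


# Illustrative data; the layout of the real spreadsheet is an assumption.
rows = [
    {
        "id": "row-1",
        "name": "Cycas longipetiolula",
        "accepted_id": "e781d518-6d82-4dfb-be67-893e1d026cff",
    },
    {"id": "row-2", "name": "Placeholder accepted name", "accepted_id": None},
]

dangling = find_dangling_synonyms(rows)
```

Listing the dangling rows this way would give a concrete set of names to send back to the source for correction.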

mdoering commented 2 months ago

For the references, I am uncertain whether adding the nomenclatural author improves the data. It is not a botanical practice you find on the source site, IPNI, WFO or others.

Does it hurt to leave it as it is?

mdoering commented 2 months ago

I have uploaded the latest version with no reference change to prod - where I only updated the version/issued date metadata

yroskov commented 2 months ago

It is the CoL standard to build the bibliographic reference from the nomenclatural citation in the botanical scientific name. If you need help on how to do it, please talk to @gdower. He completed this for many botanical GSDs.
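As a rough illustration of the idea, a reference string could be assembled from the name's authorship and its published-in citation. The exact CoL layout is not given in this thread, so the "Author (year). Publication." form, the function name, and the placeholder values below are all assumptions.

```python
def build_reference(authorship, published_in, year=None):
    """Combine authorship and published-in citation into one reference string.

    Hypothetical layout: "Author (year). Publication." or, without a year,
    "Author. Publication."
    """
    author = authorship.strip()
    place = published_in.strip().rstrip(".")
    if year is None:
        # Avoid a doubled period when the author abbreviation ends with one.
        return f"{author.rstrip('.')}. {place}."
    return f"{author} ({year}). {place}."


# Placeholder values, not real bibliographic data.
example = build_reference("(Auth1) Auth2", "Hypothetical J. Bot. 1: 2", 1900)
```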

yroskov commented 2 months ago

I have uploaded the latest version

Thank you, Markus!

mdoering commented 2 months ago

ok, I have tried my best to also change the references. It's on prod already.

mdoering commented 2 months ago

Damn, there are some problems with the references. Going to fix that ... https://www.checklistbank.org/dataset/1163/references?limit=50&offset=0

mdoering commented 2 months ago

Now they look good to me: https://www.checklistbank.org/dataset/1163/references?limit=50&offset=0 I will stop here.

yroskov commented 2 months ago

Wow! Perfect. Many thanks!

yroskov commented 2 months ago

IPNI, WFO or others

Yeah, it would be nice if all these resources sent us properly prepared bibliographic references (and with page/publication URLs). Unfortunately, most of them are far behind their users' expectations. That's why we at CoL do our best to improve their data slightly and temporarily.

yroskov commented 2 months ago

WLoC: World List of Cycads of 2024-07-11; imported 2024-08-14 (converted by Markus)

Metrics

image

ISSUES assessed 2024-08-15

image

TASKS

image

Resolved 2024-08-15:

image

Synced 2024-08-15

See colon in Target - will this affect results of sync?

image

yroskov commented 2 months ago

So far, deletion of the sector failed: "with status code 503. The SyncManager is currently not available"

image

mdoering commented 2 months ago

Yes, see Slack. I had to make some urgent changes in the db and stop all write operations, probably for the next 1-2 hours. It wasn't expected, sorry.

yroskov commented 2 months ago

image

Resolved:

Step 1. Delete sector in full. Node disappeared in the Tree, but software says Request failed with status code 404. Sector 3:98 does not exist. HTTP method: GET. https://api.checklistbank.org/dataset/3/sector/98

Step 2. Establish class Cycadopsida Brongn. (it had vanished)

Step 3. Establish order Cycadales inside Cycadopsida as a sector
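The two status codes seen in this thread suggest a simple handling pattern: retry while the service answers 503, and treat a 404 on a follow-up GET as confirmation that the sector is gone. The sketch below is an assumption about how a client script might do this; it takes any zero-argument callable that returns a status code and does not call the ChecklistBank API itself.

```python
import time


def call_with_retry(request, retries=3, delay_seconds=1.0):
    """Call request() again while it returns 503, at most `retries` extra times.

    `request` is any zero-argument callable returning an HTTP status code.
    """
    status = request()
    for _ in range(retries):
        if status != 503:
            break
        time.sleep(delay_seconds)
        status = request()
    return status


def describe_sector_lookup(status):
    """Interpret the status of a GET on a sector after a delete attempt."""
    if status == 404:
        return "sector no longer exists"
    if status == 503:
        return "service unavailable, try again later"
    if 200 <= status < 300:
        return "sector still exists"
    return "unexpected status"
```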

Re-synced 2024-08-15