CatalogueOfLife / testing

Editorial tests and discussion to prepare for COL releases

CoL of February 2022 #181

Closed yroskov closed 6 months ago

yroskov commented 2 years ago

Other 11 GSDs for updates: https://docs.google.com/spreadsheets/d/1WRxfYQF8h2Xiu3gNUvQ3de-zRH-BSq3wYWIAbHHBrhY/edit?usp=sharing

yroskov commented 2 years ago

Do this first on DEV as an experiment https://github.com/CatalogueOfLife/testing/issues/182. = Task completed on DEV successfully. However, there are no realms yet in ICTV-CoLDP.

Stopper on PROD of 2022-02-04: I am not able to assemble virus sectors in CLB on PROD (https://github.com/CatalogueOfLife/checklistbank/issues/1000) = FIXED

yroskov commented 2 years ago

Attention! 2022-02-04: NEED TO BE FIXED BEFORE RELEASE OF FEB EDITION

yroskov commented 2 years ago

WCSP: family: Acoraceae - order: Acorales = FIXED, synced

ITIS:

superfamily: Cloacaroidea - infraorder: Eleutherengona = NOT BROKEN, no action

family: Cybocephalidae - superfamily: Cucujoidea = NOT BROKEN, no action

family: Macronyssidae - superfamily: Dermanyssoidea = NOT BROKEN, no action

yroskov commented 2 years ago

2022-02-16: new SF Isoptera export didn't fix a problem with two genera outside families: https://github.com/CatalogueOfLife/testing/issues/171#issuecomment-1042342701

FIXED. New version synced 2022-02-17

dhobern commented 2 years ago

Thanks, Yury.

Yes to Alucitoidea and Pterophoroidea. There are several new names and lots of minor updates (references, etc.).

Gelechiidae and a larger Lepidoptera update should be possible in the following release.

Donald


On Fri, 18 Feb 2022 at 04:17, yroskov @.***> wrote:


StaphBase - request for a new version sent; data received 2021-10-21; converting in TW

ITIS of 2022-01-31 (a comment on extra combinations in bird taxa #8 (comment) https://github.com/CatalogueOfLife/testing/issues/8#issuecomment-1027371733); imported 2022-02-15; synced 2022-02-16

WoRMS of 2022-02-01 received; imported 2022-02-15; new version with fixes received 2022-02-17

Scarabs of 2022-02-02 received; imported 2022-02-15;

Species Fungorum Plus - Paul> "For the last few weeks I have been updating Species Fungorum and resolving some internal inconsistencies / errors in preparation for another download for COL+, partly prompted by a request from the GNI for a new download. I may have to get back to you if I forget how to ‘format’ the data, but this may be in early January." Reminded 2022-01-25.

Collembola.org - Anton> "Frans is happy with our (your) activities and happy to share the checklist as long as he doesn't need to do extra work :) To my knowledge he is updating the checklist regularly (~once a month). If I'm correct, Geoff now has a crawling script and can mine the checklist again for the next CoL version. Or do you need anything from my side to make it happen?" Reminded 2022-01-25.

WCVP (Markus); imported to PROD 2021-12-03 (#177 https://github.com/CatalogueOfLife/testing/issues/177). Hybrid genera outside families. 2021-12-22: sector Acorales - Acoraceae established for the purpose of accessing the Workbench tool (available inside the Project only). 2022-01-13: sector WCVP-Acoraceae deleted; WCSP-Acoraceae re-established, synced. Metadata need to be checked with Rafael. 2022-02-02: cannot assemble families until this is fixed: CatalogueOfLife/checklistbank#998 https://github.com/CatalogueOfLife/checklistbank/issues/998 = FIXED; updated view imported 2022-02-14.

ICTV: reassemble top level of virus classification in CoL. "Viruses" (unranked) created as a sister group to "Biota" and current ICTV classification re-assembled in this new group. 9 virus kingdoms are included as sectors in Viruses unranked group; however, 6 realms are absent in ICTV-CoLDP dataset. Synced 2022-02-07 and ICTV MSL36, 2020 / 2021-05-18 was included with 6 realms & 9 kingdoms as a single sector entry. Synced 2022-02-11. Re-synced 2022-02-14 after final metadata applied.

SF Isoptera of 2022-02-17; imported after OTU fixes; synced 2022-02-17

Global Gracillariidae might be available in CoLDP; eml from Chantal/François Malherbe of 2021-11-12; first draft on DEV; ver. 2022-01-31 imported on PROD 2022-02-16;

Systema Dipterorum, v 3.6 of 2022-02-14 received;

Pterophoroidea imported 2022-02-09

Alucitoidea imported 2022-02-03

@dhobern https://github.com/dhobern, would it be OK with you to sync the new versions of Pterophoroidea & Alucitoidea in CoL of Feb in the next few days?

Gelechiidae (new, not adopted yet, https://data.catalogueoflife.org/dataset/2362/names) imported 2022-02-11

Other 11 GSDs for updates: https://docs.google.com/spreadsheets/d/1WRxfYQF8h2Xiu3gNUvQ3de-zRH-BSq3wYWIAbHHBrhY/edit?usp=sharing


yroskov commented 2 years ago

2022-02-17: Preview release started at 17.07 (Champaign); in progress (indexing) the next morning; completed at 9.03 next day = 16 h

So, processing of CoL release now takes 16 h
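The 16 h figure can be checked with a few lines of Python. This is only a sketch of the arithmetic: it assumes "17.07" and "9.03" are local wall-clock times (17:07 and 09:03) on consecutive days, as the comment above suggests.

```python
from datetime import datetime

# Assumption: times are Champaign local wall-clock times taken from the comment above.
started = datetime(2022, 2, 17, 17, 7)    # preview release started
completed = datetime(2022, 2, 18, 9, 3)   # preview release completed next day

elapsed = completed - started
hours = elapsed.total_seconds() / 3600

print(f"{elapsed} elapsed, about {round(hours)} h")
```

The exact difference is 15 h 56 min, which rounds to the 16 h quoted above.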

yroskov commented 2 years ago

Checks of CoL 2022-02-17 preview (deployed 2022-02-18) at https://preview.catalogueoflife.org/

@gdower, @mdoering, how can we find which sector(s) might be missing in these two GSDs?

mdoering commented 2 years ago

Good question. The source overview tells us we lost one sector in WorldPlants & 1 in ITIS: https://data.catalogueoflife.org/catalogue/3/sourcemetrics?hideUnchanged=true&releaseKey=2368

But which sectors those are I can only dig out via the database or API. @thomasstjerne, we could set up some tool to do that analysis in the UI...

yroskov commented 2 years ago

Project - Sectors report shows 0 broken sectors in ITIS and 2 in WPlants: Surianaceae (expected & not causing spp loss) & Polygalaceae (I'll investigate it).

WPlants Surianaceae & Polygalaceae: all species are present in the Preview. So, it's not a problem.

mdoering commented 2 years ago

ITIS Sectors in release 2368 not existing in project:

  id  | subject_rank | subject_name 
------+--------------+--------------
  867 | SUBORDER     | Archostemata
 1447 | SUPERFAMILY  | Cheyletoide

The other way around, sectors in project not in release 2368:

  id  | subject_rank | subject_name 
------+--------------+--------------
 1496 | SUBORDER     | Archostemata

mdoering commented 2 years ago

So you deleted and recreated Archostemata (sector 867 in the release, sector 1496 in the project) and removed Cheyletoide.
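The comparison behind that conclusion can be sketched in Python. This is a hypothetical illustration, not the query actually run against the database: the function name `diff_sectors` and the tuple layout are assumptions, and the data is limited to the ITIS rows quoted in this thread rather than the full sector lists.

```python
# Each sector is (id, subject_rank, subject_name); rows copied from the comments above.
# Assumption: only the differing ITIS sectors are listed here, not the full sets.
release_2368 = [
    (867, "SUBORDER", "Archostemata"),
    (1447, "SUPERFAMILY", "Cheyletoide"),
]
project = [
    (1496, "SUBORDER", "Archostemata"),
]


def diff_sectors(release, project):
    """Split release-only sectors into 'recreated' (same subject, new id) and 'removed'."""
    release_ids = {sid for sid, _, _ in release}
    project_ids = {sid for sid, _, _ in project}

    only_release = [s for s in release if s[0] not in project_ids]
    only_project = [s for s in project if s[0] not in release_ids]

    # A subject that reappears in the project under a different id was deleted and recreated.
    project_subjects = {(rank, name) for _, rank, name in only_project}
    recreated = sorted(s for s in only_release if (s[1], s[2]) in project_subjects)
    removed = sorted(s for s in only_release if (s[1], s[2]) not in project_subjects)
    return recreated, removed


recreated, removed = diff_sectors(release_2368, project)
print("recreated:", recreated)
print("removed:", removed)
```

Matching on rank and name as well as on id is what separates a sector that was recreated under a new id from one that is gone from the project.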

mdoering commented 2 years ago

Removed in World Plants:

  id  | subject_rank | subject_name 
------+--------------+--------------
 1449 | FAMILY       | Juncaceae

yroskov commented 2 years ago

Thank you Markus! It's helpful. Now I can fix missing sectors.

yroskov commented 2 years ago

New preview release was started 2022-02-18 at 12.13 Champaign.

yroskov commented 2 years ago

2022-02-21: preview of 2022-02-18 is OK.

CoL of 2022-02-18 is ready for deployment to the production website. (@mdoering). Email sent to Markus & Olaf.