CatalogueOfLife / testing

Editorial tests and discussion to prepare for COL releases

CoL of May 2024 #262

Open yroskov opened 2 months ago

yroskov commented 2 months ago

FROM TaxonWorks:

monthly started from January 2024:

monthly started from March 2024:

single update in a year (January):

~~SF Coleorrhyncha SF Embioptera SF Grylloblattodea SF Mantophasmatodea SF Zoraptera~~

LEPIDOPTERA: See https://github.com/CatalogueOfLife/testing/issues/260#issuecomment-2074259800

OTHER:

=============================

=============================

Filling gaps:

Suborder Symphyta (Hymenoptera) https://github.com/CatalogueOfLife/data/issues/579
Class/order Diplura https://github.com/CatalogueOfLife/data/issues/577
Order †Permopsocida (Insecta) https://github.com/CatalogueOfLife/data/issues/578
Family Promecheilidae (Tenebrionoidea, Coleoptera) https://github.com/CatalogueOfLife/data/issues/580

dhobern commented 2 months ago

@yroskov More updates to Pterophoridae and GLI - also everything I've listed under "Towards annual checklist"

yroskov commented 2 months ago

PREVIEW release started 2024-04-30, 7:07 pm (server time) Finished as COL24.4, id 295185, 2024-04-30, 8:43 pm Deployed to the preview website 2024-04-30

yroskov commented 2 months ago

PREVIEW release started 2024-04-30, 9:17 pm (server time) Finished as COL24.4, id 295188, 2024-04-30, 10:47 pm Deployed to the preview website 2024-05-01

yroskov commented 2 months ago

PREVIEW release started 2024-05-07, 12:43 pm (server time) Finished as COL24.5, id 295753, 2024-05-07, 2:12 pm Deployed to the preview website 2024-05-07

mdoering commented 2 months ago

can we sync all sources in need of an update as given here? https://www.checklistbank.org/catalogue/3/sources

.... with the exception of IRMNG maybe

yroskov commented 2 months ago

No.

All necessary syncs are already completed. Those sectors which are not synced may have unresolved issues and broken decisions.

mdoering commented 2 months ago

All SpeciesFiles sources have issues?

mdoering commented 2 months ago

Pterophoroidea has added 6 species. Is this problematic? https://www.checklistbank.org/dataset/1199/diff?attempts=162..163

yroskov commented 2 months ago

Checks & syncs of SFs are scheduled for next week

yroskov commented 2 months ago

Please check problems with GSDs in Github reports

mdoering commented 2 months ago

I did look at 2 just now, Bryonames and Gracillariidae. I don't see any recent comments on those. They just seem to not have been synced for quite some time.

Both do update automatically whenever sources change. Hence it is crucial to look at the attempts in sources to see what is in need of a sync. At least as long as we do manual syncs.
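
A minimal sketch of that check, assuming each sector records which import attempt of its source it was last synced from. The field names (`source_attempt`, `synced_attempt`) and the sample values are hypothetical and are not taken from the ChecklistBank API:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Sector:
    # All field names are hypothetical, chosen for illustration only.
    name: str
    source_attempt: int            # latest import attempt of the source dataset
    synced_attempt: Optional[int]  # import attempt the sector was last synced from


def needs_sync(sector: Sector) -> bool:
    """A sector is stale if it was never synced or the source was re-imported since."""
    return sector.synced_attempt is None or sector.synced_attempt < sector.source_attempt


def stale_sectors(sectors: list[Sector]) -> list[str]:
    """Names of all sectors that are behind their source."""
    return [s.name for s in sectors if needs_sync(s)]


# Made-up sources and attempt numbers, not real CLB data.
sectors = [
    Sector("Source A", source_attempt=40, synced_attempt=31),
    Sector("Source B", source_attempt=12, synced_attempt=12),
    Sector("Source C", source_attempt=1, synced_attempt=None),
]
print(stale_sectors(sectors))
```

With manual syncs, a list like this would have to be reviewed before every release, since nothing else flags a sector that has fallen behind its source.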

mdoering commented 2 months ago

Pterophoroidea also seems fine? There is just a software issue mentioned, which I think was fixed a long time ago: https://github.com/CatalogueOfLife/testing/issues/179

dhobern commented 2 months ago

Five of the six extra Pterophoridae are because the new North American checklist has split some species. What's less clear to me is why Oidaematophorus poulini appears in this list. It is clearly marked as a synonym for Hellinsia poulini (along with the 2005 versions of the same names). I thought the diff view only showed the accepted names. Otherwise, all good.

mdoering commented 2 months ago

The diff tool simply shows alphabetically sorted names. You can choose whether synonyms, authorships and direct parents are shown. The default in that view does include synonyms.
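
To illustrate why a synonym can show up in such a diff, here is a rough sketch using Python's `difflib`. It is not how ChecklistBank implements its diff, and the two name lists are illustrative stand-ins for two import attempts rather than the real dataset contents:

```python
import difflib

# Illustrative records for two import attempts: (scientific name, is_synonym).
attempt_old = [("Hellinsia poulini", False)]
attempt_new = [("Hellinsia poulini", False), ("Oidaematophorus poulini", True)]


def names(records, include_synonyms=True):
    """Return the alphabetically sorted names, optionally dropping synonyms."""
    return sorted(n for n, is_syn in records if include_synonyms or not is_syn)


def name_diff(old, new, include_synonyms=True):
    """Diff the two sorted name lists and keep only the added/removed lines."""
    lines = difflib.unified_diff(
        names(old, include_synonyms), names(new, include_synonyms), lineterm="", n=0
    )
    return [
        line
        for line in lines
        if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
    ]


print(name_diff(attempt_old, attempt_new))                          # synonyms included
print(name_diff(attempt_old, attempt_new, include_synonyms=False))  # accepted names only
```

In this toy input the synonym appears as an added line only while synonyms are included; restricted to accepted names, the two attempts are identical.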

dhobern commented 2 months ago

Gelechiidae updated again today.

yroskov commented 1 month ago

PREVIEW release started 2024-05-08, 3:41 pm (server time) Finished as COL24.5, id 295802, 2024-05-08, 5:10 pm Deployed to the preview website 2024-05-08

yroskov commented 1 month ago

PREVIEW release started 2024-05-09, 7:26 pm (server time) Finished as COL24.5, id 295965, 2024-05-09, 8:59 pm Deployed to the preview website 2024-05-09

yroskov commented 1 month ago

@dhobern, I'm going to sync all the updated Lepidoptera checklists for the May release today (tomorrow). It does not require any action on your part. I'll check the import dates in CLB. Further updates will be added to the Annual Checklist in June (June 7 is the deadline for the latest updates).

yroskov commented 1 month ago

PREVIEW release started 2024-05-15, 5:11 pm (server time) Finished as COL24.5, id 296093, 2024-05-15, 6:53 pm Deployed to the preview website 2024-05-15

CHECKS for @mdoering's attention (how can we fix these?):

mdoering commented 1 month ago

@yroskov there is no point in doing (preview) releases while we still have the ITIS issue

yroskov commented 1 month ago

> there is no point...

You said in Slack, "You can continue, please just dont use any sync all methods or manually sync merge sectors". I continued with the updates and need to see the results in the preview. What's the point of continuing if we still have an ITIS problem?

I have now suspended work on the May edition until I get the green light from you.

mdoering commented 1 month ago

It doesn't hurt to do a release if you need it. But obviously ITIS is wrong now, so no release at this stage can be a real one. Otherwise please continue your work as long as you do not sync all.

yroskov commented 1 month ago

> I have deleted the 5 ITIS merge sectors which should have removed all linked data in the project. @yroskov could you do a brief check if you spot sth unusual? https://github.com/CatalogueOfLife/testing/issues/8#issuecomment-2114946648

COL project 3: checks of 2024-05-16:

Plantae, Multiple providers: 6 (7) unexpected contributors (e.g. GREEN, IUCN, etc.): image

Animalia, Multiple providers: many unexpected contributors (e.g. Afromoths, ArthropodsPT, etc.) image

Archaea & Bacteria: "2214" image

mdoering commented 1 month ago

That only shows in the project, right? Not in previews or releases in CLB? @thomasstjerne sth we might have to look into

yroskov commented 1 month ago

I checked it in the project only for now.

I launched a new preview, to see what is there.

yroskov commented 1 month ago

PREVIEW release started 2024-05-16, 2:10 pm (server time) Finished as COL24.5, id 296266, 2024-05-16, 3:40 pm Deployed to the preview website 2024-05-16

CHECKS 2024-05-16

mdoering commented 1 month ago

Weird. For some reason a single merge sector Miridae made it into the release: https://www.checklistbank.org/dataset/296266/sector?limit=100&mode=merge&offset=0

The sector was changed 12h after it was created. I suspect it changed its mode and was originally created as an attach sector by accident. When creating attach sectors the root name is immediately copied to the project so that it shows up in the assembly tree. This causes the sector to stay in the release. The family Miridae is still there; I will delete the sector now. @camiplata @DianRHR please recreate the sector as a merge one as needed!

I made sector modes immutable from now on. Trying to change them will raise an error.
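
The idea can be sketched as follows. This is a hypothetical Python class for illustration only, not the ChecklistBank backend code; the class, the enum and the error message are all assumptions:

```python
from enum import Enum


class Mode(Enum):
    ATTACH = "attach"
    MERGE = "merge"


class Sector:
    """Hypothetical sector whose mode is fixed at creation time."""

    def __init__(self, subject: str, mode: Mode):
        self.subject = subject
        self._mode = mode

    @property
    def mode(self) -> Mode:
        return self._mode

    @mode.setter
    def mode(self, new_mode: Mode) -> None:
        # Re-assigning the same mode is harmless; any real change is rejected.
        if new_mode != self._mode:
            raise ValueError(
                f"sector mode is immutable: cannot change "
                f"{self._mode.value} to {new_mode.value}"
            )


sector = Sector("Miridae", Mode.ATTACH)
try:
    sector.mode = Mode.MERGE
except ValueError as err:
    print(err)
```

Under such a rule, switching a sector from attach to merge means deleting it and creating a new one, which is what was requested for Miridae.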

mdoering commented 1 month ago

Works for me:

image image

yroskov commented 1 month ago

Hmm. Strange. None of the 3 browsers displays logos in the PREVIEW on my machine (but www.catalogueoflife.org is OK).

Firefox:

image

image

Chrome:

image

MS Edge:

image

(I'll ask @gdower to have a look what might be wrong with my machine)

mdoering commented 1 month ago

The images are on a private URL that needs authentication. Are you logged into CLB right now? It's nothing to worry about for the public release, but still worth looking at

yroskov commented 1 month ago

PREVIEW release started 2024-05-17, 7:54 pm (server time) Finished as COL24.5, id 296445, 2024-05-17 Deployed to the preview website 2024-05-20

CHECKS 2024-05-20:

yroskov commented 1 month ago

PREVIEW release started 2024-05-20, 2:02 pm (server time) Finished as COL24.5, id 296489, 2024-05-20, 3:32 pm Deployed to the preview website 2024-05-20

Checked 2024-05-20

yroskov commented 1 month ago

PREVIEW release started 2024-05-20, 4:32 pm (server time) Finished as COL24.5, id 296511, 2024-05-20, 5:59 pm Deployed to the preview website 2024-05-20

Checked 2024-05-20

for example:

kingdom Animalia > family Katharellidae Czaker, 1994

image

kingdom Chromista > family Chloromonadaceae

kingdom Chromista > phylum Bigyra > class Labyrinthulea > families Acanthoniidae, Aulacanthidae, Stethopiliidae

image

kingdom Fungi > families Amoebidiaceae, Lagenidiaceae

image

etc.

yroskov commented 1 month ago

For attention of @olafbanki & @mdoering: COL24.5, id 296511, 2024-05-20 is completed now as CoL of May.

It looks like some unexpected taxa have appeared in the Tree (they were absent in the CoL of April). These taxa have no credits to the Source Dataset and have no children species. Perhaps this is a result of the sync of ITIS "merged sectors" or other Extended Catalogue activities. The good thing is that I don't see unexpected changes in species statistics.
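
One way to list such taxa is to walk the tree and collect every node that has no sector and no species anywhere below it. The sketch below uses a hypothetical, simplified node type (not the ChecklistBank data model) and a made-up miniature tree:

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Taxon:
    # Hypothetical tree node; field names are assumptions for illustration.
    name: str
    rank: str
    sector: Optional[int] = None  # None = no source sector credited
    children: list["Taxon"] = field(default_factory=list)


def has_species(taxon: Taxon) -> bool:
    """True if the taxon is a species or has one somewhere below it."""
    return taxon.rank == "species" or any(has_species(c) for c in taxon.children)


def unsourced_empty(root: Taxon) -> list[str]:
    """Names of taxa without a sector and without any species below them."""
    found = []
    stack = [root]
    while stack:
        taxon = stack.pop()
        if taxon.sector is None and not has_species(taxon):
            found.append(taxon.name)
        stack.extend(taxon.children)
    return sorted(found)


# Made-up tree: one sourced family with a species, one unsourced empty family.
tree = Taxon("Animalia", "kingdom", children=[
    Taxon("Family A", "family", sector=1, children=[
        Taxon("Species a", "species", sector=1),
    ]),
    Taxon("Katharellidae", "family"),
])
print(unsourced_empty(tree))
```

The kingdom itself is not reported because it still has a species below it; only the childless, unsourced family is.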

mdoering commented 1 month ago

That is likely the cause, yes. I wasn't able to spot all of them. So please feel free to manage these unsourced higher names and delete them if desired

mdoering commented 1 month ago

I have looked into the dates when taxa from the management classification, i.e. all taxa without a sector, were created. Aggregated by day, it looks like this:

   created  | count 
------------+-------
 2019-11-20 |  1707
 2020-01-06 |     1
 2020-01-07 |     1
 2020-04-24 |   308
 2020-07-18 |    16
 2020-07-19 |     8
 2020-07-27 |     2
 2020-08-10 |   224
 2020-08-12 |   474
 2020-08-14 |     6
 2020-09-04 |     1
 2020-09-15 |     1
 2020-10-02 |     1
 2021-01-19 |    12
 2021-03-10 |     1
 2021-03-19 |     2
 2021-07-01 |     1
 2021-10-11 |    57
 2022-02-10 |     1
 2022-03-09 |     9
 2022-03-30 |     1
 2022-08-16 |     2
 2022-10-27 |   567
 2023-01-09 |     4
 2023-03-17 |     1
 2023-04-12 |    18
 2023-06-03 |     1
 2024-02-06 |    24
 2024-02-15 |   288
 2024-03-25 |     1
 2024-03-28 |     1
 2024-04-01 |     2
 2024-05-14 |  4446

You can nicely see that more than half of the taxa were created on May 14th, when all of ITIS was synced. Of these, 579 were families and 3867 were genera.
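
The "more than half" figure can be checked directly from the table above. This is only a quick arithmetic sketch over the counts as posted, not a query against the database:

```python
# Daily counts copied from the table above: creation day | number of taxa.
TABLE = """\
2019-11-20 | 1707
2020-01-06 | 1
2020-01-07 | 1
2020-04-24 | 308
2020-07-18 | 16
2020-07-19 | 8
2020-07-27 | 2
2020-08-10 | 224
2020-08-12 | 474
2020-08-14 | 6
2020-09-04 | 1
2020-09-15 | 1
2020-10-02 | 1
2021-01-19 | 12
2021-03-10 | 1
2021-03-19 | 2
2021-07-01 | 1
2021-10-11 | 57
2022-02-10 | 1
2022-03-09 | 9
2022-03-30 | 1
2022-08-16 | 2
2022-10-27 | 567
2023-01-09 | 4
2023-03-17 | 1
2023-04-12 | 18
2023-06-03 | 1
2024-02-06 | 24
2024-02-15 | 288
2024-03-25 | 1
2024-03-28 | 1
2024-04-01 | 2
2024-05-14 | 4446
"""

counts = {}
for line in TABLE.splitlines():
    day, n = (part.strip() for part in line.split("|"))
    counts[day] = int(n)

total = sum(counts.values())
peak_day = max(counts, key=counts.get)   # day with the most taxa created
share = counts[peak_day] / total
print(f"{peak_day}: {counts[peak_day]} of {total} taxa ({share:.1%})")
```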

Compared with the last April release the numbers are exactly the same, except that the 4446 from May are missing and Feb 15th had 293 instead of 288. So 5 names have been deleted since then:

 C5SLY | FAMILY      | Naibiidae        | ACCEPTED
 C8BYS | FAMILY      | Sinojuraphididae | ACCEPTED
 C72ZM | FAMILY      | Dracaphididae    | ACCEPTED
 C7RWG | SUPERFAMILY | Naibioidea       | ACCEPTED
 C2KG6 | INFRAORDER  | Naibiomorpha     | ACCEPTED
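
The comparison itself amounts to a per-day difference of the two count tables. A small sketch, using only the days discussed here (all other days are identical in both tables; 2024-04-01 is kept as one unchanged example):

```python
# Per-day counts of unsourced taxa: April release vs. the project in May.
april_release = {"2024-02-15": 293, "2024-04-01": 2}
may_project = {"2024-02-15": 288, "2024-04-01": 2, "2024-05-14": 4446}


def count_changes(before: dict[str, int], after: dict[str, int]) -> dict[str, int]:
    """Return day -> (after - before) for every day whose count changed."""
    days = sorted(set(before) | set(after))
    delta = {d: after.get(d, 0) - before.get(d, 0) for d in days}
    return {d: n for d, n in delta.items() if n != 0}


print(count_changes(april_release, may_project))
```

A negative value marks names deleted since April, a positive one names added.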

I am going to remove all May 14th taxa again as they are all empty with no species.

mdoering commented 1 month ago

All removed. Reindexing the project to be safe

yroskov commented 1 month ago

@mdoering, thank you!

All of it will go into the June release. May should go out as id 296511 of 2024-05-20, because I already started the assembly of June yesterday, and that process is far from finished.

olafbanki commented 1 month ago

@yroskov if I understand correctly, the 2024-05-20 preview could be released as the May edition?

yroskov commented 1 month ago

@olafbanki, yes

olafbanki commented 1 month ago

@yroskov thanks; @mdoering I have run the checks and committed a blog post. Can you publish the May edition?

mdoering commented 1 month ago

published

yroskov commented 1 month ago

Thank you!