CatalogueOfLife / testing

Editorial tests and discussion to prepare for COL releases

ITIS (id 2144): test report #8

Open yroskov opened 3 years ago

yroskov commented 3 years ago

https://www.checklistbank.org/dataset/2144/

Source of global sectors:

file ITIS_GSDs+Updates_forCoL_2020-03-03.xlsx

From: Nicolson, David
Sent: Tuesday, March 3, 2020 23:01
To: Roskov, Yury
Cc: Orrell, Thomas
Subject: Initial list of ITIS GSDs for addition (or consideration) to CoL

Yuri, OK, here is my first pass trying to detect ITIS GSDs that should (or could) be added or updated in CoL. It includes GSDs we added or updated in ITIS since the last time CoL was updated for ITIS (mid-2017), as well as a few cases where ITIS loaded a GSD that was not noted to you previously. I left out groups where CoL already has a solid/active source, assuming the source seemed to actually be providing a reasonably complete GSD (vs. an "aspirational" GSD that is not very close to complete).

They are sorted according to their placement in ITIS now, via a hierarchy column. Those with yellow question marks may or may not be used in CoL; a few already have a source for CoL, but I suggest at least considering switching to ITIS due to various issues.

I have included a few things that we will shortly have loaded into ITIS, and a few that we are actively working on now (for inclusion in ITIS later this year, likely before the ITIS CoLdp export is ready).

If we realize we missed anything I will let you know.

Thanks, Dave

yroskov commented 3 years ago

file ITIS_GSDs+Updates_forCoL_2020-03-03_yr_log

GSDs added or updated (or about/planned to be added/updated) in ITIS that are available, potentially for CoL use: CoL-yr comment yr2 comment yr3 comment
Hierarchy      
Deuterostomia : Chordata : Vertebrata : Gnathostomata : Tetrapoda : Amphibia 2020-08-07 old sector deleted, new attached, synced    
Deuterostomia : Chordata : Vertebrata : Gnathostomata : Tetrapoda : Aves 2020-08-07    
Deuterostomia : Chordata : Vertebrata : Gnathostomata : Tetrapoda : Mammalia 2020-08-07    
Deuterostomia : Chordata : Vertebrata : Gnathostomata : Tetrapoda : Reptilia : Squamata : Serpentes ReptileDB    
Deuterostomia : Chordata : Vertebrata : Gnathostomata : Tetrapoda : Reptilia : Squamata… ReptileDB    
Protostomia : Ecdysozoa : Arthropoda : Chelicerata : Euchelicerata : Arachnida : CURRENT/NEW HIERARCHY TO FAMILY 2020-08-11 I am going to update the classification in Arachnida with the latest ITIS. To do that, I deleted the subtree of class Arachnida, then assembled subclass Arachnida from ITIS and synced it. The third step should be Delete Sector Arachnida. Expected behavior: the full classification in subclass Arachnida stays in the tree for further assembly of GSD sectors. However, the deletion failed. 2020-08-13 Markus fixed the problem "manually", one time only. As a result, Arachnida has a classification without sectors as a starting point for re-assembly. Attention: some taxa have no genera (the same in the source ITIS data). 2020-08-13 ITIS Arachnida classification updated. 10 GSDs re-attached in Arachnida: BdelloideaBase, FADA Halacaridae, OlogamasidBase, PhytoseiidBase, RhodacaridBase, SpmWeb, TenuipalpidBase, The Scorpion Files, TicksBase, WSC 2020-08-13 While assembling ITIS sectors in Arachnida, I dragged and dropped orders Amblypygi & Palpigradi from ITIS to the assembly tree. Expected behavior: both orders will appear as Arachnida - superorder Not assigned - orders Amblypygi & Palpigradi. However, the result was unexpected: Arachnida - superorder Not assigned - order Araneae - order Amblypygi, and Arachnida - superorder Not assigned - order Opiliones - order Palpigradi.
Protostomia : Ecdysozoa : Arthropoda : Chelicerata : Euchelicerata : Arachnida : Amblypygi +Palpigradi +Ricinulei +Schizomida +Solifugae +Uropygi 2020-08-10 old sector deleted, new attached, synced 2020-08-13; re-assembled (replace), re-synced
Protostomia : Ecdysozoa : Arthropoda : Chelicerata : Euchelicerata : Arachnida : Pseudoscorpionida 2020-08-10 old sector deleted, new attached, synced 2020-08-13; re-assembled, re-synced
Protostomia : Ecdysozoa : Arthropoda : Chelicerata : Euchelicerata : Arachnida : Acariformes : Sarcoptiformes : Endeostigmata : Alicorhagioidea +Alycoidea +Nematalycoidea +Oehserchestoidea +Terpnacaroidea   2020-08-13; re-assembled (replace), re-synced
Protostomia : Ecdysozoa : Arthropoda : Chelicerata : Euchelicerata : Arachnida : Acariformes : Sarcoptiformes : Oribatida   2020-08-13; re-assembled (replace), re-synced
Protostomia : Ecdysozoa : Arthropoda : Chelicerata : Euchelicerata : Arachnida : Acariformes : Trombidiformes : Sphaerolichida : Lordalycoidea (Lordalycidae) +Sphaerolichoidea (Sphaerolichidae)   2020-08-13; re-assembled (replace), re-synced
Protostomia : Ecdysozoa : Arthropoda : Chelicerata : Euchelicerata : Arachnida : Opilioacariformes : Opilioacarida 2020-08-10 old sector deleted, new attached, synced 2020-08-13; re-assembled (replace), re-synced
Protostomia : Ecdysozoa : Arthropoda : Chelicerata : Euchelicerata : Arachnida : Parasitiformes : Holothyrida 2020-08-10 new sector attached, synced 2020-08-13; re-assembled (replace), re-synced  
Protostomia : Ecdysozoa : Arthropoda : Chelicerata : Euchelicerata : Arachnida : Parasitiformes : Ixodida : Ixodides : Argasidae + Nuttalliellidae TickBase    
Protostomia : Ecdysozoa : Arthropoda : Chelicerata : Euchelicerata : Arachnida : Parasitiformes : Ixodida : Ixodides : Ixodidae TickBase    
Protostomia : Ecdysozoa : Arthropoda : Crustacea : Branchiopoda : [Phyllopoda] : Notostraca 2020-08-10 new sector attached, synced    
Protostomia : Ecdysozoa : Arthropoda : Crustacea : Branchiopoda : [Sarsostraca] : Anostraca 2020-08-10 old sector deleted, new attached, synced    
Protostomia : Ecdysozoa : Arthropoda : Hexapoda : Collembola : Collembola : Entomobryomorpha : Actaletoidea : Actaletidae All from Collembola.org    
Protostomia : Ecdysozoa : Arthropoda : Hexapoda : Collembola : Collembola : Entomobryomorpha : Isotomoidea : Isotomidae : Anurophorinae      
Protostomia : Ecdysozoa : Arthropoda : Hexapoda : Collembola : Collembola : Poduromorpha : Hypogastruroidea : Hypogastruridae      
Protostomia : Ecdysozoa : Arthropoda : Hexapoda : Insecta : Pterygota : Neoptera : Holometabola : Coleoptera : Polyphaga : Cucujiformia : Chrysomeloidea : Chrysomelidae : Cassidinae 2020-08-10 new sector (family Chrysomelidae) attached, synced. 2020-08-18 Deleted sector Chrysomelidae, attachment of subfamily failed; 2020-08-18 subfamily assigned & synced (correct GSD 2144) Merge with overlapping spp needed. Subfamily vs. family: as a shortcut decision, the whole family was taken
Protostomia : Ecdysozoa : Arthropoda : Hexapoda : Insecta : Pterygota : Neoptera : Holometabola : Coleoptera : Polyphaga : Cucujiformia : Chrysomeloidea : Megalopodidae 2020-08-10 new sector attached, synced    
Protostomia : Ecdysozoa : Arthropoda : Hexapoda : Insecta : Pterygota : Neoptera : Holometabola : Coleoptera : Polyphaga : Elateriformia : Byrrhoidea : Elmidae +Protelmidae 2020-08-10 new sector attached, synced    
YR: Byrrhoidea : Limnichidae 2020-08-10 old sector deleted, new attached, synced Family Limnichidae was indicated as ITIS global in ac19. I re-assembled it.
Protostomia : Ecdysozoa : Arthropoda : Hexapoda : Insecta : Pterygota : Neoptera : Holometabola : Hymenoptera : Apocrita : Aculeata : Apoidea (BEES) : Andrenidae +Apidae +Colletidae +Halictidae +Megachilidae +Melittidae +Stenotritidae Apoidea 2020-08-03 attached, synced    
Protostomia : Ecdysozoa : Arthropoda : Hexapoda : Insecta : Pterygota : Neoptera : Holometabola : Hymenoptera : Apocrita : Aculeata : Apoidea (sphecoid wasps) : Crabronidae +Ampulicidae +Heterogynaidae +Sphecidae Temporarily replaced HyMIS Crabronidae - see & compare results    
Protostomia : Ecdysozoa : Arthropoda : Hexapoda : Insecta : Pterygota : Neoptera : Paraneoptera : Hemiptera : Heteroptera : Cimicomorpha : Cimicoidea : Curaliidae 1 sp; 2020-08-06 attached, synced    
Protostomia : Ecdysozoa : Arthropoda : Hexapoda : Insecta : Pterygota : Neoptera : Paraneoptera : Hemiptera : Heteroptera : Cimicomorpha : Joppeicoidea 1 sp; 2020-08-06 attached, synced    
Protostomia : Ecdysozoa : Arthropoda : Hexapoda : Insecta : Pterygota : Neoptera : Paraneoptera : Hemiptera : Heteroptera : Cimicomorpha : Miroidea : Thaumastocoridae 2020-08-06 attached; Sync failed; children in Miroidea incomplete 2020-08-07 bug fixed; synced  
Protostomia : Ecdysozoa : Arthropoda : Hexapoda : Insecta : Pterygota : Neoptera : Paraneoptera : Hemiptera : Heteroptera : Cimicomorpha : Reduvoidea : Reduviidae 2020-08-06 attached; Sync failed; children in Miroidea incomplete 2020-08-07 bug fixed; synced  
Protostomia : Ecdysozoa : Arthropoda : Hexapoda : Insecta : Pterygota : Neoptera : Paraneoptera : Hemiptera : Heteroptera : Cimicomorpha : Velocipedoidea : Velocipedidae 2020-08-06 created superfamily Velocipedoidea; attached; Sync failed    
Protostomia : Ecdysozoa : Arthropoda : Hexapoda : Insecta : Pterygota : Neoptera : Paraneoptera : Hemiptera : Heteroptera : Dipsocoromorpha : Dipsocoroidea 2020-08-07 drag&drop infraorder Dipsocoromorpha; (old superfamily deleted beforehand) attached; Sync failed
Protostomia : Ecdysozoa : Arthropoda : Hexapoda : Insecta : Pterygota : Neoptera : Paraneoptera : Hemiptera : Heteroptera : Enicocephalomorpha : Enicocephaloidea 2020-08-07 drag&drop infraorder Enicocephalomorpha; (old superfamily deleted beforehand) attached; Sync failed
Protostomia : Ecdysozoa : Arthropoda : Hexapoda : Insecta : Pterygota : Neoptera : Paraneoptera : Hemiptera : Heteroptera : Gerromorpha : Gerroidea 2020-08-07 drag&drop infraorder Gerromorpha; (old superfamilies from ITIS-Global deleted beforehand) attached; Sync failed
Protostomia : Ecdysozoa : Arthropoda : Hexapoda : Insecta : Pterygota : Neoptera : Paraneoptera : Hemiptera : Heteroptera : Gerromorpha : Hebroidea      
Protostomia : Ecdysozoa : Arthropoda : Hexapoda : Insecta : Pterygota : Neoptera : Paraneoptera : Hemiptera : Heteroptera : Gerromorpha : Hydrometroidea      
Protostomia : Ecdysozoa : Arthropoda : Hexapoda : Insecta : Pterygota : Neoptera : Paraneoptera : Hemiptera : Heteroptera : Gerromorpha : Mesovelioidea      
Protostomia : Ecdysozoa : Arthropoda : Hexapoda : Insecta : Pterygota : Neoptera : Paraneoptera : Hemiptera : Heteroptera : Leptopodomorpha : Leptopodoidea 2020-08-07 drag&drop infraorder Leptopodomorpha; (old superfamilies from ITIS-Global deleted beforehand) attached; Synced
Protostomia : Ecdysozoa : Arthropoda : Hexapoda : Insecta : Pterygota : Neoptera : Paraneoptera : Hemiptera : Heteroptera : Leptopodomorpha : Saldoidea
Protostomia : Ecdysozoa : Arthropoda : Hexapoda : Insecta : Pterygota : Neoptera : Paraneoptera : Hemiptera : Heteroptera : Nepomorpha : Corixoidea 2020-08-07 drag&drop infraorder Nepomorpha; (old superfamilies from ITIS-Global deleted beforehand) attached; Synced
Protostomia : Ecdysozoa : Arthropoda : Hexapoda : Insecta : Pterygota : Neoptera : Paraneoptera : Hemiptera : Heteroptera : Nepomorpha : Naucoroidea      
Protostomia : Ecdysozoa : Arthropoda : Hexapoda : Insecta : Pterygota : Neoptera : Paraneoptera : Hemiptera : Heteroptera : Nepomorpha : Nepoidea      
Protostomia : Ecdysozoa : Arthropoda : Hexapoda : Insecta : Pterygota : Neoptera : Paraneoptera : Hemiptera : Heteroptera : Nepomorpha : Notonectoidea      
Protostomia : Ecdysozoa : Arthropoda : Hexapoda : Insecta : Pterygota : Neoptera : Paraneoptera : Hemiptera : Heteroptera : Nepomorpha : Ochteroidea      
Protostomia : Ecdysozoa : Arthropoda : Hexapoda : Insecta : Pterygota : Neoptera : Paraneoptera : Hemiptera : Heteroptera : Pentatomomorpha : Idiostoloidea : Henicoridae      
Protostomia : Ecdysozoa : Arthropoda : Hexapoda : Insecta : Pterygota : Neoptera : Paraneoptera : Hemiptera : Heteroptera : Pentatomomorpha : Idiostoloidea : Idiostolidae 2020-08-07 drag&drop superfamily Idiostoloidea; (old families from ITIS-Global deleted beforehand); Synced
Protostomia : Ecdysozoa : Arthropoda : Hexapoda : Insecta : Pterygota : Neoptera : Paraneoptera : Hemiptera : Heteroptera : Pentatomomorpha : Pentatomoidea : Dinidoridae 2020-08-07 synced    
Protostomia : Ecdysozoa : Arthropoda : Hexapoda : Insecta : Pterygota : Neoptera : Paraneoptera : Hemiptera : Heteroptera : Pentatomomorpha : Pyrrhocoroidea : Largidae 2020-08-07 synced    
Protostomia : Ecdysozoa : Arthropoda : Myriapoda : Symphyla from WoRMS Millibase    
yroskov commented 3 years ago

file ITIS_GSDs+Updates_forCoL_2020-03-03_yr_log ITIS_ac19

  Archaea          
  Bacteria          
2020-08-14 Protozoa - Apicomplexa - Conoidasida - Eucoccidiorida - Cryptosporidiidae - Cryptosporidium
2020-08-14 assembled & synced Chromista - Ochrophyta - Bacillariophyceae - Chaetocerotales - Chaetocerotaceae
2020-08-14 assembled & synced Chromista - Ochrophyta - Bacillariophyceae - Naviculales - Naviculaceae - Navicula
  Plantae - Tracheophyta - Magnoliopsida - Brassicales - Koeberliniaceae, Limnanthaceae
  Plantae - Tracheophyta - Magnoliopsida - Caryophyllales - Cactaceae, Nepenthaceae, Simmondsiaceae
  Plantae - Tracheophyta - Magnoliopsida - Crossosomatales - Crossosomataceae
  Plantae - Tracheophyta - Magnoliopsida - Cucurbitales - Datiscaceae
  Plantae - Tracheophyta - Magnoliopsida - Huerteales - Gerrardinaceae
  Plantae - Tracheophyta - Magnoliopsida - Lamiales - Gesneriaceae
  Plantae - Tracheophyta - Magnoliopsida - Malpighiales - Lophopyxidaceae
  Plantae - Tracheophyta - Magnoliopsida - Proteales - Nelumbonaceae
  Plantae - Tracheophyta - Magnoliopsida - Saxifragales - Penthoraceae
  Plantae - Tracheophyta - Magnoliopsida - Solanales - Montiniaceae, Sphenocleaceae
2020-08-14 assembled in phylum Kamptozoa (see below Jul 2023): class Entoprocta & Cycliophora; synced Animalia - Acanthocephala, Entoprocta, Hemichordata, Micrognathozoa, Cycliophora, Onychophora, Sipuncula, Tardigrada
2020-08-14 assembled & synced Animalia - Annelida - Clitellata - Branchiobdellida  
2020-08-10 Animalia - Arthropoda - Branchiopoda - Anostraca  
WoRMS Amphipoda Animalia - Arthropoda - Malacostraca - Amphipoda - Crangonyctidae - Stygobromus
2020-08-14 assembled & synced Animalia - Arthropoda - Malacostraca - Mysida  
WoRMS Copepoda Animalia - Arthropoda - Maxillopoda - Calanoida - Aetideidae
  Animalia - Arthropoda - Arachnida - Amblypygi, Opilioacarida, Palpigradi, Pseudoscorpiones, Ricinulei, Schizomida, Solifugae, Uropygi
  Animalia - Arthropoda - Arachnida - Sarcoptiformes - (suborder Oribatida)
2020-08-14 assembled & synced Animalia - Arthropoda - Entognatha - Protura  
  Animalia - Arthropoda - Insecta - Hemiptera - Hebroidea - Macroveliidae, Paraphrynoveliidae    
  Animalia - Arthropoda - Insecta - Hemiptera - Hydrometroidea - Hydrometridae    
  Animalia - Arthropoda - Insecta - Hemiptera - Gerroidea - Hermatobatidae, Veliidae    
  Animalia - Arthropoda - Insecta - Hemiptera - Leptopodoidea, Saldoidea    
  Animalia - Arthropoda - Insecta - Hemiptera - Mesovelioidea - Mesoveliidae    
  Animalia - Arthropoda - Insecta - Hemiptera - Naucoroidea - Aphelocheiridae, Potamocoridae    
2020-08-14 assembled & synced Animalia - Arthropoda - Insecta - Hymenoptera - Ceraphronoidea, Evanioidea, Platygastroidea
2020-08-14 assembled & synced Animalia - Arthropoda - Insecta - Hymenoptera - Cynipoidea - Ibaliidae
2020-08-14 assembled & synced Animalia - Arthropoda - Insecta - Hymenoptera - Vespoidea - Formicidae
2020-08-25 assembled & synced Animalia - Arthropoda - Insecta - Coleoptera - (suborder Archostemata)
2020-08-14 assembled & synced Animalia - Arthropoda - Insecta - Coleoptera - Bostrichidae, Dytiscidae, Histeridae, Nosodendridae
StaphBase Animalia - Arthropoda - Insecta - Coleoptera - Hydraenidae
  Animalia - Arthropoda - Insecta - Coleoptera - Byrrhoidea - Limnichidae
see left Animalia - Arthropoda - Insecta - Coleoptera - Chrysomeloidea - Chrysomelidae - (subfamily Cassidinae)
2020-08-14 assembled & synced Animalia - Arthropoda - Insecta - Coleoptera - Elateroidea - Lampyridae, Phengodidae
2020-08-14 assembled & synced Animalia - Arthropoda - Insecta - Coleoptera - Cucujoidea - Cucujidae - Pediacus
2020-08-14 assembled & synced Animalia - Arthropoda - Insecta - Mecoptera  
2020-08-14 assembled & synced Animalia - Arthropoda - Insecta - Trichoptera  
  Animalia - Chordata - Amphibia, Aves, Mammalia  
             
  Animalia ph cl or sfa fa
2020-08-25 assembled & synced       Coleoptera Not assigned Crowsoniellidae
            Cupedidae
            Jurodidae
            Micromalthidae
            Ommatidae
             
2020-08-25 FIXED: deleted as a sector Animalia - Arthropoda - Malacostraca - Decapoda - Thalassinoidea
             
2021-01-05 assembled Animalia - Arthropoda - Arachnida Opiliones Superfamily Samooidea • 210 living spp•ITIS Global
  Animalia - Arthropoda - Arachnida Opiliones Superfamily Travunioidea • 68 living spp•ITIS Global
  Animalia - Arthropoda - Arachnida Opiliones Superfamily Triaenonychoidea • 437 living spp•ITIS Global
  Animalia - Arthropoda - Arachnida Opiliones Superfamily Zalmoxoidea • 270 living spp•ITIS Global
yroskov commented 3 years ago

@gdower set up a timer for automatic import.

Standard day:
time:

mdoering commented 3 years ago

We have a built-in timer that we should use. Once stable ids are completed, we should try it with all datasets on dev for a while and activate it on prod if no issues pop up. We already used it in the early days.

yroskov commented 3 years ago

ITIS of 2020-12-21 is synced on 2021-01-08

yroskov commented 3 years ago

@gdower fixed broken sectors 2021-02-01

ITIS of 2020-12-21 is re-synced on 2021-02-01

yroskov commented 3 years ago

ITIS of 2021-01-26 was imported (no broken sectors) and synced on 2021-02-03.

DaveNicolson commented 3 years ago

@yroskov , some newer ITIS GSDs that don't appear to overlap with other sources:

SMALL BUT IMPORTANT FAMILY: Animalia : Bilateria : Protostomia : Ecdysozoa : Arthropoda : Chelicerata : Euchelicerata : Arachnida : Parasitiformes : Mesostigmata : Monogynaspida : Gamasina : Dermanyssoidea : Varroidae

Additional newer ITIS GSDs that appear to be "GAPS" in COL: Animalia : Bilateria : Protostomia : Ecdysozoa : Arthropoda : Hexapoda : Insecta : Pterygota : Neoptera : Paraneoptera : Hemiptera : Heteroptera : Cimicomorpha : Cimicoidea : Cimicidae

Animalia : Bilateria : Protostomia : Ecdysozoa : Arthropoda : Hexapoda : Insecta : Pterygota : Neoptera : Paraneoptera : Hemiptera : Heteroptera : Cimicomorpha : Cimicoidea : Polyctenidae

Animalia : Bilateria : Protostomia : Ecdysozoa : Arthropoda : Hexapoda : Insecta : Pterygota : Neoptera : Paraneoptera : Hemiptera : Heteroptera : Pentatomomorpha : Pentatomoidea : Acanthosomatidae

yroskov commented 3 years ago

@DaveNicolson thank you!

Families Varroidae, Cimicidae, Polyctenidae & Acanthosomatidae have been assembled in CoL and synced from ITIS of 2021-01-26.

DaveNicolson commented 3 years ago

@yroskov the February ITIS load was just put online. There are two new GSDs that appear to be gaps in COL now. They are:

Animalia : Bilateria : Protostomia : Ecdysozoa : Arthropoda : Crustacea : Malacostraca : Eumalacostraca : Eucarida : Decapoda : Pleocyemata : Infraorder Astacidea [Superfamilies Astacoidea, Parastacoidea, Enoplometopoidea and Nephropoidea]

Animalia : Bilateria : Protostomia : Ecdysozoa : Arthropoda : Crustacea : Malacostraca : Eumalacostraca : Eucarida : Decapoda : Pleocyemata : Infraorder Glypheidea [Superfamily Glypheoidea]

The latter was split out of the former. I see COL is not using any ranks between Decapoda and superfamilies in it. I've noted the GSD superfamilies for each of the two infraorder GSDs so you can handle them as you wish.

gdower commented 3 years ago

The import of the ITIS 2021-02-26 release finished: https://data.catalogueoflife.org/dataset/2144/imports

yroskov commented 3 years ago

@DaveNicolson thank you!

Infraorders Astacidea and Glypheidea have been assembled in CoL as direct children of Decapoda and synced from ITIS of 2021-02-26.

(Hmm, it's a mess in the Tree: now all superfamilies from WoRMS Brachyura have gone under infraorder Not Assigned).

yroskov commented 3 years ago

ITIS of 2021-02-26 (id 2144) is synced on 2021-03-02.

DaveNicolson commented 3 years ago

March 30 load for ITIS is complete, and the full exports are available. A new smaller GSD that is a gap now in COL is Order Lophogastrida (ITIS TSN 89808), which is in the ITIS hierarchy here:

Animalia : Bilateria : Protostomia : Ecdysozoa : Arthropoda : Crustacea : Malacostraca : Eumalacostraca : Peracarida : Lophogastrida

The other updated groups are already marked as ITIS sectors in COL, so should update in your next sync of the ITIS data. Thank you!

yroskov commented 3 years ago

Thank you Dave! We already processed a new version. Without Lophogastrida yet. I'll add it as a new sector right now in the clearinghouse for a next release.

yroskov commented 3 years ago

Order Lophogastrida established as a sector and synced 2021-04-02 (after launch of preview release).

yroskov commented 3 years ago

ITIS of 2021-04-27 (id 2144) is imported in CoL & synced on 2021-04-30. 7th update

DaveNicolson commented 3 years ago

Two (or three, if you prefer) new (for ITIS) GSDs that may or may not be wanted by COL (none of them appears in COL now, I believe):

Animalia : Bilateria : Protostomia : Ecdysozoa : Arthropoda : Crustacea : Malacostraca : Eumalacostraca : Eucarida : Decapoda : Pleocyemata : Caridea : Alpheoidea : Alpheidae https://www.itis.gov/servlet/SingleRpt/SingleRpt?search_topic=TSN&search_value=96600#null

Animalia : Bilateria : Protostomia : Ecdysozoa : Arthropoda : Crustacea : Malacostraca : Eumalacostraca : Eucarida : Decapoda : Pleocyemata : Achelata & Polychelida [former "Palinura", now split] https://www.itis.gov/servlet/SingleRpt/SingleRpt?search_topic=TSN&search_value=1147564#null https://www.itis.gov/servlet/SingleRpt/SingleRpt?search_topic=TSN&search_value=1147563#null

yroskov commented 3 years ago

@DaveNicolson thank you!

Family Alpheidae is assembled now as a direct child of superfamily Alpheoidea. Infraorders Achelata & Polychelida are assembled as children of suborder Pleocyemata. All 3 new sectors synced and will appear in CoL of May.

DaveNicolson commented 3 years ago

New version of ITIS is available, dated 26 May 2021. We have some modest new GSDs that appear to be gaps in COL:

Animalia : Bilateria : Protostomia : Ecdysozoa : Arthropoda : Chelicerata : Euchelicerata : Arachnida : Acariformes : Trombidiformes : Prostigmata : Anystina : Erythraeoidea : Smarididae (family) (we are working on the other family of the superfamily, not ready yet though) https://www.itis.gov/servlet/SingleRpt/SingleRpt?search_topic=TSN&search_value=1118129#null

Animalia : Bilateria : Protostomia : Ecdysozoa : Arthropoda : Chelicerata : Euchelicerata : Arachnida : Acariformes : Trombidiformes : Prostigmata : Anystina : Calyptostomatoidea (superfamily) https://www.itis.gov/servlet/SingleRpt/SingleRpt?search_topic=TSN&search_value=895634#null

Animalia : Bilateria : Protostomia : Ecdysozoa : Arthropoda : Crustacea : Malacostraca : Eumalacostraca : Eucarida : Decapoda : Pleocyemata : Stenopodidea (infraorder) https://www.itis.gov/servlet/SingleRpt/SingleRpt?search_topic=TSN&search_value=97294#null

gdower commented 3 years ago

Thanks @DaveNicolson! I imported the new version, although it won't be assembled into the Catalogue until Yuri returns next week: https://data.catalogueoflife.org/dataset/2144/imports

yroskov commented 3 years ago

3 new sectors established as:

yroskov commented 3 years ago

ITIS of 2021-05-26 (id 2144, all sectors) is synced on 2021-06-07. 9th update

DaveNicolson commented 3 years ago

The June ITIS load is completed, and you're welcome to import it for your use.

As already discussed (in a Spring 2021 Taxonomic Group meeting), COL should consider whether to adopt the new ITIS GSD for the mosquito family, Culicidae, under Order Diptera in COL, in place of Systema Dipterorum's data (if Thomas Pape still agrees, it is really his call, the numbers below are not so different after all).

If you want it, it is in ITIS here: Animalia : Bilateria : Protostomia : Ecdysozoa : Arthropoda : Hexapoda : Insecta : Pterygota : Neoptera : Holometabola : Diptera : Nematocera : Culicomorpha : Culicidae

COL now shows these stats for the family, per Systema Dipterorum: Subfamily 3, Tribe 11, Subtribe 1, Genus 57, Subgenus 216, Species 3588, Subspecies 4

ITIS now shows these stats for the family: Subfamily 2, Tribe 11, Genus 41, Subgenus 188, Species 3585, Subspecies 219

The update was based primarily on the 2021 "Mosquitoes of the World" volumes (Wilkerson et al.), but also used the data from Harbach's "Mosquito Taxonomic Inventory" website. We tried to include the alternative combinations in use (in synonymy, where appropriate), especially for known disease vectors, and the full data set includes over 5400 scientific names (all ranks, all usages), which should help link names with various conflicting taxonomic sources from the last ~45 years.
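For readers who want the two sets of rank counts side by side, here is a minimal Python sketch. It is not part of any ITIS or COL tooling: the dictionary names and the `rank_differences` helper are illustrative assumptions, and the figures are simply copied from this comment. A rank that one source does not report (Subtribe in ITIS) is treated as 0.

```python
# Culicidae rank counts as quoted in this comment (illustrative names).
col_systema_dipterorum = {
    "Subfamily": 3, "Tribe": 11, "Subtribe": 1, "Genus": 57,
    "Subgenus": 216, "Species": 3588, "Subspecies": 4,
}
itis = {
    "Subfamily": 2, "Tribe": 11, "Genus": 41,
    "Subgenus": 188, "Species": 3585, "Subspecies": 219,
}


def rank_differences(current, candidate):
    """Return {rank: candidate - current}; a rank missing on one side counts as 0."""
    ranks = list(current) + [r for r in candidate if r not in current]
    return {r: candidate.get(r, 0) - current.get(r, 0) for r in ranks}


differences = rank_differences(col_systema_dipterorum, itis)
for rank, delta in differences.items():
    print(f"{rank:<10} {delta:+d}")
```

The species-level difference comes out at three, which matches the remark above that the numbers are not so different; the larger gaps are at genus, subgenus and subspecies rank.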

yroskov commented 3 years ago

ITIS of 2021-06-29 imported in the checklist bank 2021-07-01. 10th update.

I have set up a patch, repeating metadata from the portal: https://www.catalogueoflife.org/data/dataset/2144

mdoering commented 3 years ago

The new diff UI even works for the larger ITIS changes: https://data.catalogueoflife.org/dataset/2144/diff?attempts=36..37

DaveNicolson commented 2 years ago

August load completed at ITIS. New GSDs:

Small mite order Holothyrida (30 species, plus one more linked to the order as 'incertae sedis' [uncertain placement], as it doesn't belong in the order but it isn't clear where it DOES belong...):
Animalia : Bilateria : Protostomia : Ecdysozoa : Arthropoda : Chelicerata : Euchelicerata : Arachnida : Parasitiformes : Holothyrida

The mite family Spinturnicidae, which is part of superfamily Dermanyssoidea (which is already pulled from ITIS I think, so should be included automatically... 113 species):
Animalia : Bilateria : Protostomia : Ecdysozoa : Arthropoda : Chelicerata : Euchelicerata : Arachnida : Parasitiformes : Mesostigmata : Monogynaspida : Gamasina : Dermanyssoidea : Spinturnicidae

We also included an updated GSD (should be included automatically?) of the beetle family Lampyridae ('fireflies'), and some other tweaks.

It looks like the prior monthly update (end-of-July) did not get synced into COL, as we updated GSDs of Dytiscidae (Coleoptera) and Vespertilionidae (bats), but I'm not seeing the new taxa in COL. Since there were no NEW GSDs that month I didn't post about it here, assuming that the updates would happen automatically. We update ITIS at the end of every month. Do I need to post that an update was done even if there were no GSDs to add?

Thanks. -Dave

yroskov commented 2 years ago

Thank you Dave! ITIS of August 27th is already in the checklistbank (https://github.com/CatalogueOfLife/testing/issues/151#issuecomment-910785064).

I'll establish new sectors tomorrow and will have a look at what happened to July's version (it was supposed to be included in the August edition with an update of all sectors).

yroskov commented 2 years ago

ITIS of 2021-08-27.

[x] Metadata fixed: https://data.catalogueoflife.org/dataset/2144/about @DaveNicolson, please let us know if you wish to change ITIS metadata in the checklistbank & CoL (even better if you can change them in the checklistbank yourself).

[x] Sectors:

Order Holothyrida is already established as an ITIS sector in CoL, so new data should be synced in CoL automatically.

Family Spinturnicidae is now established as a sector in the superfamily Dermanyssoidea. (Dermanyssoidea is not a sector yet; only family Varroidae is a sector. The classification above species is taken from ITIS. @DaveNicolson, please let us know if you recommend taking the entire superfamily Dermanyssoidea from ITIS as a [global] sector.)

Family Lampyridae is already established as an ITIS sector in CoL, so new data should be synced in CoL automatically.

I have checked Dytiscidae and Vespertilionidae: both are established sectors and should be updated automatically with every new ITIS release. If not, please let us know.

All 81 ITIS sectors (as presented by the software on 2021-09-03) are healthy (not broken): https://data.catalogueoflife.org/catalogue/3/sector?limit=100&offset=0&subjectDatasetKey=2144

ITIS of 2021-08-27 synced.

DaveNicolson commented 2 years ago

I wanted to touch base on an issue from the June 2 Taxonomy Group meeting... Notes from the agenda were updated in the meeting: "ITIS sectors now global, equal to or better than current sectors (Dave N.). Symphyla and Parapoda are easy choices for COL to adopt from ITIS. Ixodida is global with all families represented and more species than TickBase. Chilopoda is a possibility. Alessandro Minelli of Chilobase 2.0 has been asked to update, waiting for a full response. Depends upon capacity for export."

ChiloBase 2.0 now has names added up until 2020, so it looks like they are updating, and I suggest staying w/them as long as you can get an updated data set (fingers crossed!).

The others remain as they were. Here are the stats I'm seeing for them as of right now (same as in June):

SYMPHYLA: MilliBase stats page indicates 100 living species. ITIS (updated 2019) has 235.

PAUROPODA: MilliBase stats page indicates 510 living species. ITIS (updated 2020) has 995.

IXODIDA: TickBase in COL (from 2005!) indicates 867 living species. ITIS (updated 2019) has 953.

What remains to be done to either get the 2 or 3 above adopted from the ITIS GSDs, or to see that the current COL data are otherwise updated? I was under the impression that this had already been decided in the June TG meeting.
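To make the size of the gaps explicit, the three pairs of living-species counts quoted above can be put into a small Python sketch. This is only an illustration under stated assumptions: the `living_species` table and the `species_gap` helper are hypothetical names, and the numbers are copied from this comment rather than fetched from MilliBase, TickBase or ITIS.

```python
# group -> (current COL source, its living-species count, ITIS living-species count),
# all figures as quoted in this comment.
living_species = {
    "Symphyla": ("MilliBase", 100, 235),
    "Pauropoda": ("MilliBase", 510, 995),
    "Ixodida": ("TickBase", 867, 953),
}


def species_gap(counts):
    """Return {group: ITIS count minus the current source's count}."""
    return {
        group: itis_count - current_count
        for group, (_source, current_count, itis_count) in counts.items()
    }


gaps = species_gap(living_species)
for group, gap in gaps.items():
    source, current_count, itis_count = living_species[group]
    print(f"{group}: {source} {current_count}, ITIS {itis_count}, difference {gap:+d}")
```

On these figures ITIS lists 135 more species for Symphyla, 485 more for Pauropoda and 86 more for Ixodida.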

yroskov commented 2 years ago

@dhobern, could you please take up the issues raised by Dave above? I guess the decision should be made by the Taxonomy Group on the basis of peer reviews by experts independent of ITIS, WoRMS (as the provider for Symphyla & Pauropoda), TickBase, ChiloBase and CoL.

As soon as we get a decision from the TG, the listed taxa will be replaced with ITIS data in CoL. (The TG decision should be written up as a text that can also be sent to the retired data providers.)

(@dhobern, shall we open a special branch in GitHub for Taxonomy Group issues and move Dave's points there? The CatalogueOfLife/testing branch is intended for test reports and exchanges within each GSD or CoL edition.)

DaveNicolson commented 2 years ago

I am confused. We raised this issue in the Taxonomy Group meeting already, and I was under the impression that this decision was already taken: https://docs.google.com/document/d/1qfMc9EYq8uZoGtZzkIoZvhwFV8eINKM18Db2VoOppao/edit?usp=sharing

We need TG to decide this twice?

yroskov commented 2 years ago

I did not see any peer reviews as a basis for the decision.

mdoering commented 2 years ago

I would suggest keeping all real data issues in the data repository and flagging the ones relevant for TWG with the respective label. That's what it was made for. It will be confusing to have various repositories for data-related issues.

yroskov commented 2 years ago

For me all these comments are spam in my test report. Please move such discussion to another branch.

DaveNicolson commented 2 years ago

I'm totally fine with taking this discussion to another "place," but I am not well-versed in github and labels... Some additional comments:

For context, in 6/2019, MilliBase in COL for Pauropoda & Symphyla was:
Class Pauropoda • 491 of 835 est. living spp (59%) • WoRMS MilliBase
Class Symphyla • 100 of 197 est. living spp (51%) • WoRMS MilliBase

Looks like they added 19 accepted species under Pauropoda in early 2021. But otherwise those two "GSDs" seem to have been largely static in MilliBase for several years.

In Canberra (3/2020) I included the following in my presentation of new GSDs ITIS was offering (the noted soft+hard ticks make up "Ixodida" as a whole):
"• Soft ticks (Argasidae & Nuttalliellidae) – CoL source is from 2005
• Hard Ticks (Ixodidae) – CoL source is from 2005
• Symphyla – CoL source is ChiloBase (although this is not Chilopoda, and it remains at 51% completeness for this group)"
[NB: I should have written "MilliBase" instead of "ChiloBase"!]

In 2/2021 I reiterated to Ed (then-Chair of TG) the points about Ixodida and Symphyla, and noted that we also had a new GSD for Pauropoda to offer. Ed suggested I raise this with the TG, which I did in June.

When considering existing COL GSDs that have remained largely unchanged for years, remaining between 50-60% complete (that is, clearly incomplete, and with limited apparent progress, as with Symphyla and Pauropoda) it seemed plain on its face, without a formal "peer review" outside of the TG meeting, that it would be beneficial to adopt those two, which is probably why the notes from the meeting indicate adding those was an easy choice. That said, I have no objection to the assignment of these to appropriate peers for review, as I recognize it would be beneficial to have a formal review statement in hand when discussing it with MilliBase folks.

As to TickBase (2005 GSD) vs. ITIS' (2019) tick GSD, I have no objection to some kind of peer review. TickBase was at least complete in 2005, so the choice would be between a complete & ~current dataset from ITIS vs. a complete-but-many-years-old dataset from TickBase's experts... unless they are willing to provide a new/updated data set (who will ask them?)...

As to Chilopoda (ChiloBase), I also have no objection to some kind of review, but the first choice should be to get updated data from ChiloBase, since they are clearly making additions and changes to the content, at least up to 2000. By all means, go with them. BUT IF it is not possible to get a new copy from ChiloBase, the fall-back position COULD be to use the ITIS data.

If peer reviews need to be done, then can we please figure out a way to at least try to get them done in a timely manner? I suppose it falls to @dhobern to work out (sorry Donald!)...

DaveNicolson commented 2 years ago

The data for the September ITIS load is available now (I can provide an FTP link if needed, since they are not yet downloadable from the ITIS website) even though it is not yet reflected on the ITIS website, and given the possibility of the govt shutdown you might want to grab it now (I have no idea how the website might be affected by a shutdown, just that website problems would be hard to address). It includes several new GSDs that are gaps in COL...

Animalia : Bilateria : Protostomia : Ecdysozoa : Arthropoda : Chelicerata : Euchelicerata : Arachnida : Parasitiformes : Mesostigmata : Monogynaspida : Gamasina : Superfamily Dermanyssoidea
Families: Dermanyssidae, Rhinonyssidae, Dasyponyssidae, Entonyssidae, Hystrichonyssidae, Manitherionyssidae, Raillietiidae, Halarachnidae, Spelaeorhynchidae

There are some updated GSDs in groups you already have, but those should be updated automatically.

Let me know if you need the download link (not shown on the ITIS site). Hopefully (!!) the site will be updated today, but as I said, the data are already available via FTP...

yroskov commented 2 years ago

ITIS of 2021-09-28

Thank you Dave!

yroskov commented 2 years ago

ISSUES of ITIS of 2021-09-28 assessed 2021-10-14

yroskov commented 2 years ago

TASKS of ITIS of 2021-09-28

Cryptopygus quadrioculatus Yoshii, 1995
Cryptopygus quadrioculatus (Rapoport, 1963)
Cryptopygus quadrioculatus Martynova, 1967

two present in CoL: Lasioglossum froggatti (Cockerell, 1905)
Lasioglossum froggatti (Cockerell, 1911)
Lasioglossum froggatti Walker, 1995

Megachile melanopyga Schrottky, 1908
Megachile melanopyga Cockerell, 1909
Megachile melanopyga Costa, 1863

Resolved 2021-10-14

yroskov commented 2 years ago

ITIS of 2021-09-28 synced 2021-10-14

DaveNicolson commented 2 years ago

Re "TASKS of ITIS of 2021-09-28", I don't know quite what you're asking me to do here.

For the first example of 3 homonyms, they are in Collembola. Wasn't COL getting that group from the world Collembola site? In any case, the 3 examples of Cryptopygus quadrioculatus, we already placed comments on these when they were added to ITIS explaining the situation (as we try to do for all homonyms), e.g., for Cryptopygus quadrioculatus (Rapoport, 1963) (TSN 723736), we added this comment: "Apparently a senior homonym of Cryptopygus quadrioculatus Martynova, 1967 and Cryptopygus quadrioculatus Yoshii, 1995"

Similar comments on the other names: "Nec Rapoport, 1963. Apparently an unresolved secondary junior homonym"

For the triple Lasioglossum example, they were also unresolved homonyms when ITIS updated those groups. Comments lay it out: "Lasioglossum froggatti Walker, 1995 and Lasioglossum froggatti (Cockerell, 1911) (originally described in Halictus) are unresolved junior secondary homonyms of Lasioglossum froggatti (Cockerell, 1905) (originally described in Parasphecodes)" "Lasioglossum froggatti (Cockerell, 1911) (originally described in Halictus) is an unresolved junior secondary homonym of Lasioglossum froggatti (Cockerell, 1905) (originally described in Parasphecodes), and in need of a replacement name" "Lasioglossum froggatti Walker, 1995 is an unresolved junior secondary homonym of Lasioglossum froggatti (Cockerell, 1905) (originally described in Parasphecodes), and in need of a replacement name" ... Hopefully those bee names will be rectified once we update the family (we are working through the bees)...

For the 3x Megachile melanopyga case, they also appeared to be unresolved homonyms when we last updated the megachilid bees (and homonymy comments were applied). We are now finalizing an update for that LARGE bee family and all three have been resolved now. The update should be available in ITIS in the November load (unless something new crops up that takes more time to address).
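The homonym cases discussed above (one binomial carried by several authorships) can be spotted mechanically. The following is a minimal sketch, not ITIS or ChecklistBank code: the regular expression and the helper names `parse_name` and `find_homonyms` are assumptions made for illustration, and the parser only handles the simple "Genus species Author, Year" pattern seen in the nine names listed in the task log. Sorting by year merely orders the candidates; deciding which name is actually senior remains a nomenclatural judgement.

```python
import re
from collections import defaultdict

# The nine names from the "TASKS of ITIS of 2021-09-28" log entry.
NAMES = [
    "Cryptopygus quadrioculatus Yoshii, 1995",
    "Cryptopygus quadrioculatus (Rapoport, 1963)",
    "Cryptopygus quadrioculatus Martynova, 1967",
    "Lasioglossum froggatti (Cockerell, 1905)",
    "Lasioglossum froggatti (Cockerell, 1911)",
    "Lasioglossum froggatti Walker, 1995",
    "Megachile melanopyga Schrottky, 1908",
    "Megachile melanopyga Cockerell, 1909",
    "Megachile melanopyga Costa, 1863",
]

# Assumed pattern: binomial, then a single author and year, optionally in
# parentheses (parentheses mean the species was described in another genus).
NAME_RE = re.compile(
    r"^(?P<binomial>[A-Z][a-z]+ [a-z]+) "
    r"(?P<paren>\()?(?P<author>[^,()]+), (?P<year>\d{4})\)?$"
)


def parse_name(name):
    """Split a name string into (binomial, author, year, recombined)."""
    m = NAME_RE.match(name)
    if m is None:
        raise ValueError(f"unsupported name format: {name!r}")
    return m["binomial"], m["author"].strip(), int(m["year"]), m["paren"] is not None


def find_homonyms(names):
    """Map each binomial with more than one authorship to its year-sorted authorships."""
    groups = defaultdict(list)
    for name in names:
        binomial, author, year, recombined = parse_name(name)
        groups[binomial].append((year, author, recombined))
    return {
        binomial: sorted(entries)
        for binomial, entries in groups.items()
        if len({(year, author) for year, author, _ in entries}) > 1
    }


if __name__ == "__main__":
    for binomial, entries in find_homonyms(NAMES).items():
        print(binomial)
        for year, author, recombined in entries:
            note = " (originally in another genus)" if recombined else ""
            print(f"  {author}, {year}{note}")
```

A report like this only flags candidates; the explanatory comments ITIS attaches to each name are still needed to tell resolved homonyms from unresolved ones.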

yroskov commented 2 years ago

> Re "TASKS of ITIS of 2021-09-28", I don't know quite what you're asking me to do here.

Sorry, there is nothing for you to do. It is just my log of actions with ITIS data in CoL.

yroskov commented 2 years ago

ITIS of 2021-10-28 imported 2021-11-02.

Dave, 2021-11-02:

That’s great, the data have been available from the FTP site I previously noted, although I’m not sure if they finished making them available via the website downloads page or not (there were technical issues, although the site shows all the new data). New GSDs are as follows:
Mite family Macronyssidae (in Dermanyssoidea, 246 species)
Beetle family Cybocephalidae (in Cucujoidea, 207 species)

TASKS on 2021-11-03 (no changes)

yroskov commented 2 years ago

Checking data at CoLPreview of 2021-11-05:

Family Macronyssidae (new sector) is empty in CoL and empty in ITIS 2021-10-28 import: https://data.catalogueoflife.org/catalogue/3/assembly?assemblyTaxonKey=8a95c5eb-1fa8-48dd-9654-a6532962891b&datasetKey=2144&sourceTaxonKey=1118056 = FIXED after re-import (see below for details)

Original data in ITIS: https://itis.gov/servlet/SingleRpt/SingleRpt?search_topic=TSN&search_value=1118056#null

A few other families are empty as well.

@Geoff, could you pls help to understand what's happened?
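A check for empty families like the one reported above can be run on any import before syncing. The sketch below is only an illustration under stated assumptions: the row layout (keys `id`, `parent_id`, `rank`, `name`), the function name `empty_families`, and the sample rows with their identifiers are all hypothetical and do not reflect the actual ITIS export or ChecklistBank schema.

```python
from collections import defaultdict


def empty_families(rows):
    """Return names of families that have no species anywhere below them.

    `rows` is a list of dicts with the hypothetical keys
    id, parent_id, rank and name.
    """
    children = defaultdict(list)
    for row in rows:
        children[row["parent_id"]].append(row)

    def has_species(taxon_id):
        # Iterative depth-first walk over all descendants of taxon_id.
        stack = [taxon_id]
        while stack:
            current = stack.pop()
            for child in children.get(current, []):
                if child["rank"] == "species":
                    return True
                stack.append(child["id"])
        return False

    return [
        row["name"]
        for row in rows
        if row["rank"] == "family" and not has_species(row["id"])
    ]


# Invented sample rows: one family with a species, one family without any.
SAMPLE = [
    {"id": "sf1", "parent_id": None, "rank": "superfamily", "name": "Dermanyssoidea"},
    {"id": "f1", "parent_id": "sf1", "rank": "family", "name": "Dermanyssidae"},
    {"id": "g1", "parent_id": "f1", "rank": "genus", "name": "Dermanyssus"},
    {"id": "s1", "parent_id": "g1", "rank": "species", "name": "Dermanyssus gallinae"},
    {"id": "f2", "parent_id": "sf1", "rank": "family", "name": "Macronyssidae"},
]

if __name__ == "__main__":
    print(empty_families(SAMPLE))
```

Comparing the result against the list of newly announced GSDs would show at a glance whether a new sector arrived without its species.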

DaveNicolson commented 2 years ago

I suspect that the ITIS export was downloaded before the issues I had mentioned were resolved. You'll probably have to download it again & use that version instead of the problematic one you got. Please from now on wait until I notify you that the new data are available (I'm happy to clarify if there is ever any doubt). -Dave

yroskov commented 2 years ago

@gdower, could you please do a new download/import?

@DaveNicolson, should we do it from the website or via ftp? (Previous download was done from the website).

DaveNicolson commented 2 years ago

I personally use the FTP site, although during the load process a new version that isn't final can appear there (this is a normal part of the process: if anything in the load needs manual tweaking, the initial version is replaced by the final/tweaked one). At this stage (Nov. 8) the FTP directory has the final data, so go with that one. By the time I am prepared to alert COL to a new version of ITIS, the FTP directory will already have the final versions.

yroskov commented 2 years ago

Geoff: the code has been switched to use the FTP version