Open yroskov opened 2 years ago
Fields in the export file:
| Field | Example value |
| --- | --- |
| Taxon rank | family |
| Name | Poduridae |
| Authorship | Latreille, 1804 |
| Parent | Poduroidea |
| Taxonomic status | ACCEPTED |
| Actual taxon | |
| Full name | Poduridae |
| Web page switch | http://www.collembola.org/taxa/podurida.htm |
| URL | http://www.collembola.org/taxa/collembo.htm |
| Full line text | Familia Poduridae Latreille, 1804, i.s. |
| AuthorshipInfo | |
Additional fields in the 2020 version generated by Anton (i.e. these fields are absent from the crawler's output):
WrongSpelling (4,987 names containing "(sic)" in the field Name. Examples: Friesinae(sic), Triæna(sic), Polycanthella(sic), Achorutini(sic), Achorutoïdes(sic), Fresea(sic), Frisea(sic), Freisia(sic), Friese(sic). The field Actual taxon contains the correct spelling of these names.)
Undescribed (1,163 names with values in the field Name like these: sp., sp.1, sp.2, sp.3, sp.nov., sp.USA.1, claviseta ssp. emucronata, fagei ssp. Colorata(sic), australasia(sic) ssp. schoetti(sic), lanuginosus-cyaneus-group)
Intraspecific (generated from fields Taxon rank & Name)
subgenus -"-
genus -"-
subfamily -"-
family -"-
superfamily -"-
order -"-
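If these flags ever need to be regenerated from the crawler's output, something like the following might be a starting point. It is only a sketch: `derive_flags` and both patterns are hypothetical, guessed from the examples listed above, and are not the rules Anton used for the 2020 version. Intraspecific is left out because the rule for it is not spelled out here.

```python
import re

# Rank flags listed above; each is True when "Taxon rank" equals that rank.
RANK_FLAGS = ("subgenus", "genus", "subfamily", "family", "superfamily", "order")

# Assumption: an undescribed name contains "sp." or "ssp." as the start of a
# word, or ends in "-group". Guessed from the examples, not from a stated rule.
UNDESCRIBED = re.compile(r"(^|\s)s?sp\.|-group$")


def derive_flags(taxon_rank: str, name: str) -> dict:
    """Derive the additional flag fields from 'Taxon rank' and 'Name'."""
    flags = {
        "WrongSpelling": "(sic)" in name,
        "Undescribed": bool(UNDESCRIBED.search(name)),
    }
    for rank in RANK_FLAGS:
        flags[rank] = taxon_rank == rank
    return flags


if __name__ == "__main__":
    for rank, name in [
        ("family", "Poduridae"),
        ("species", "sp.USA.1"),
        ("subspecies", "fagei ssp. Colorata(sic)"),
    ]:
        print(name, derive_flags(rank, name))
```

The counts produced this way would need to be compared against the 4,987 and 1,163 reported above before relying on it.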
Geoff's decision on how to proceed in the absence of the additional fields:
Authorship field and CoL name statuses.
There are accepted names and synonyms whose authorship contains the string "sensu".
Synonyms with sensu in the authorship field should go into CoL with status Misapplied Name. (I am not able to apply this status in CLB because all comments are stripped from author strings in CLB; for the attention of @mdoering.)
Accepted names with sensu in authorship: As far as I can see, there are no proper species names with sensu (examples: Pseudachorutes sp.2 Janssens F sensu Denis, 1924; Pseudachorutes sp.3 Janssens F sensu Delamare Deboutteville, 1953; Pseudachorutes sp.4 Janssens F sensu Palacios-Vargas & Gómez-Anaya, 1995; Pseudachorutes sp.7 Janssens F sensu Christiansen & Bellinger, 1992).
Accepted names of other ranks (order, superfamily, subfamily) with sensu should keep their status as Accepted names in CoL.
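The sensu rules above could be expressed roughly as follows. This is a sketch only: `col_status` is a hypothetical helper, and the input value "SYNONYM" is an assumption (only "ACCEPTED" appears in the export example). The returned label "Misapplied Name" is taken from the wording above, not from a confirmed CLB value.

```python
import re

# Matches "sensu" as a whole word anywhere in the author string.
SENSU = re.compile(r"\bsensu\b", re.IGNORECASE)


def col_status(source_status: str, authorship: str) -> str:
    """Apply the sensu rule to one record.

    Synonyms with "sensu" in the authorship become "Misapplied Name";
    every other record keeps the status it arrived with.
    """
    if source_status.upper() == "SYNONYM" and SENSU.search(authorship):
        return "Misapplied Name"
    return source_status


if __name__ == "__main__":
    print(col_status("SYNONYM", "Janssens F sensu Denis, 1924"))
    print(col_status("ACCEPTED", "Börner, 1913 : 319, sensu Soto-Adames FN et al., 2008"))
```

This only works on the raw author string, so it would have to run before any comments are stripped from it.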
Editorial decision for crawler in authorstrings (2022-05-11):
Let's strip off the additional text after the colon (in names such as order Entomobryomorpha Börner, 1913 : 319, sensu Soto-Adames FN et al., 2008) and eliminate the combination authorship (in names such as Friesea flava (Salmon, JT, 1949 : 13) Massoud, Z, 1967).
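A minimal sketch of how this clean-up might look, for discussion. `clean_authorship` is a hypothetical helper and not the crawler's actual code. It assumes that "eliminate the combination authorship" means dropping the author who follows the parenthesised original authorship.

```python
def clean_authorship(authorship: str) -> str:
    """Strip page references and trailing comments from an author string.

    Assumed behaviour:
    - "(Original, 1949 : 13) Combiner, 1967" -> "(Original, 1949)"
    - "Author, 1913 : 319, sensu Other, 2008" -> "Author, 1913"
    """
    text = authorship.strip()
    if text.startswith("(") and ")" in text:
        # Keep only the parenthesised original authorship, cut at the colon.
        inner = text[1:text.index(")")]
        return "(" + inner.split(":", 1)[0].strip() + ")"
    # No parentheses: keep everything before the first colon.
    return text.split(":", 1)[0].strip()


if __name__ == "__main__":
    print(clean_authorship("Börner, 1913 : 319, sensu Soto-Adames FN et al., 2008"))
    print(clean_authorship("(Salmon, JT, 1949 : 13) Massoud, Z, 1967"))
```

Note that cutting at the colon also removes a "sensu" comment that follows it, as in the Entomobryomorpha example.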
2021: 33 accepted records with genus Neanuratobeassignedtopropergenus.
Option 1. Exclude (?) these 33 accepted records.
Option 2, used in ac21. Keep these names, import them into CoL+ and modify them through a Complex Decision: bring these species names back into genus Neanura and mark them as Provisionally accepted in CoL; also add a comment: Should be re-assigned to a proper genus.
GenusnovnotDicranocentrus PseudachorutinaeIncertaeSedis (GO: genus name removed by crawler) GenusnovnotDicranocentrus (GO: genus name removed by crawler)
Final data imported on PROD 2022-05-11: https://www.checklistbank.org/dataset/2130/classification
Collembola.org of 2022-05-05 (date of harvest with Geoff's new crawler) https://www.checklistbank.org/dataset/2130/classification
[x] Imported: 8,792 spp (vs 8,555 in ac21)
[x] Metadata: OK (ver May 2022, 2022-05-05)
[x] Classification: OK
[x] Sector: re-established, class Collembola
ISSUES assessed 2022-05-05
TASKS 2022-05-12
Resolved 2022-05-12
Synced 2022-05-12
Collembola.org of Apr 2023 / 2023-04-24 crawled & imported 2023-04-24
[x] Imported: 8,845 spp (vs 8,792 spp in May 2022 / 2022-05-05)
[x] Metadata: OK
[x] Classification: OK
Subfamily Troglopedetinae (family Paronellidae) is empty
[x] Sector: OK
[x] There are 3 accepted genera IncertaeSedis = blocked https://www.checklistbank.org/dataset/2130/names?facet=rank&facet=issue&facet=status&facet=nomStatus&facet=nameType&facet=field&facet=authorship&facet=authorshipYear&facet=extinct&facet=environment&facet=origin&limit=50&offset=0&q=IncertaeSedis&sortBy=taxonomic, for example:
class: Collembola Lubbock J, 1870 > order: Entomobryomorpha Börner, 1913 > superfamily: Entomobryoidea Womersley, 1934 > family: Entomobryidae Schäffer, 1896 > subfamily: Willowsiinae Yoshii R & Suhardjono YR, 1989 > genus: IncertaeSedis Zhang, F, Chen, J-X & Deharveng, L, 2011
ISSUES assessed 2023-04-27
TASKS
Resolved 2023-04-27:
Synced 2023-04-27
Anton> "Frans is happy with our (your) activities and happy to share the checklist as long as he doesn't need to do extra work :) To my knowledge he is updating the checklist regularly (~once a month). If I'm correct, Geoff now has a crawling script and can mine the checklist again for the next CoL version. Or do you need anything from my side to make it happen?"
The crawler was received and the data were harvested on 2022-04-20.