Open yroskov opened 3 years ago
Full name | #spp ac19 | #spp Import | Exported from https://sandcastle.taxonworks.org/ | Conclusion | Report |
---|---|---|---|---|---|
Aphid Species File | 5568 | 4680 | 2021-04-07 | (1) Many subspecies are in the Tree root (i.e. placed outside infraorder Aphidomorpha); (2) there are many empty subfamilies in the family Aphididae, etc. | https://github.com/CatalogueOfLife/testing/issues/77 |
Chrysididae Species File | 197 | 197 | 2021-04-08 | Ready to be imported in CoL. | |
Cockroach Species File | 4649 | 4886 | 2021-04-08 | Ready to be imported in CoL. Superfamily NotAssigned should be excluded from the assembly. | |
Coleorrhyncha Species File | 66 | 99 | 2021-04-08 | Ready to be imported in CoL. Superfamily NotAssigned (child of Suborder Coleorrhyncha) should be included in CoL. | |
Phasmida Species File | 3284 | 3202 | 2021-04-08 | There are 20 subspecies in the Tree root (i.e. outside order Phasmida) | https://github.com/CatalogueOfLife/testing/issues/84 |
Plecoptera Species File | 3938 | 3225 | 2021-04-07 | (1) Many species are in the Tree root (i.e. outside order Plecoptera); (2) there are 713 fewer species than in ac19. | https://github.com/CatalogueOfLife/testing/issues/78 |
Psocodea Species File | 11084 | 10980 | 2021-04-08 | There is a batch of Euplocania species and Unranked uninomials "Euplocania" in the Tree root (i.e. outside order Psocodea) | https://github.com/CatalogueOfLife/testing/issues/83 |
Zoraptera Species File | 52 | 64 | 2021-04-07 | Ready to be imported in CoL. | |
The main issue in the problematic SF exports is a batch of subspecies or species whose parent taxa are recognized by the clearinghouse as "bare names". All those orphan children appear in the Tree root, outside the top taxon. Details are in the GitHub reports.
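The orphan pattern described above can be checked before import. The following is a minimal sketch, not the clearinghouse's own logic: it assumes each exported taxon is a record with hypothetical `id`, `parent_id`, `rank` and `name` fields, and it treats any taxon other than the top taxon whose parent is missing from the export as an orphan. The sample names are invented.

```python
def find_orphans(taxa: list[dict], top_id: str) -> list[dict]:
    """Return taxa that would land in the Tree root besides the top taxon.

    Assumption: a parent that is absent from the export (e.g. because it
    was treated as a bare name) leaves its children without a usable parent.
    """
    known_ids = {t["id"] for t in taxa}
    orphans = []
    for t in taxa:
        if t["id"] == top_id:
            continue  # the top taxon is expected to sit in the root
        parent = t.get("parent_id")
        if parent is None or parent not in known_ids:
            orphans.append(t)
    return orphans


# Hypothetical sample: one subspecies points at a parent that was not exported.
taxa = [
    {"id": "1", "parent_id": None, "rank": "order", "name": "Phasmida"},
    {"id": "2", "parent_id": "1", "rank": "species", "name": "Aus bus"},
    {"id": "3", "parent_id": "99", "rank": "subspecies", "name": "Aus bus cus"},
]
orphan_names = [t["name"] for t in find_orphans(taxa, top_id="1")]
print(orphan_names)
```

Running such a check on an export before import would give a list of names to cross-check against the per-dataset GitHub reports.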
Inviting @LocoDelAssembly to join this investigation
Batch 2
Full name | #spp ac19 | #spp Import | Exported from https://sandcastle.taxonworks.org | Conclusion | Report |
---|---|---|---|---|---|
Coreoidea Species File | 3119 | 3052 | 2021-04-13 | There are empty ("not valid") subfamilies, tribes & genera in the classification. | https://github.com/CatalogueOfLife/testing/issues/88 |
Dermaptera Species File | 1942 | 1947 | 2021-04-13 | There are two empty suborders Catadermaptera (TW: unavailable) & Protodiplyina (TW: unavailable) in the classification | https://github.com/CatalogueOfLife/testing/issues/89 |
Mantophasmatodea Species File | 25 | 27 | 2021-04-09 | Ready to be imported in CoL. | |
Orthoptera Species File | 28111 | 26439 | 2021-04-09 | (1) 23 subfamilies, 1 tribe, many genera, many species and many subspecies are in the Tree root (i.e. outside order Orthoptera; "orphan taxa" in the clearinghouse). (2) 1,672 fewer spp than in ac19 | https://github.com/CatalogueOfLife/testing/issues/87 |
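The species-count comparison against ac19 is simple arithmetic and can be scripted. This sketch uses the Batch 2 figures from the table above; the 5% loss threshold is an arbitrary assumption for illustration, not a CoL rule.

```python
# (ac19 count, imported count) taken from the Batch 2 table.
counts = {
    "Coreoidea Species File": (3119, 3052),
    "Dermaptera Species File": (1942, 1947),
    "Mantophasmatodea Species File": (25, 27),
    "Orthoptera Species File": (28111, 26439),
}

# Hypothetical threshold: flag datasets that lost more than 5% of species.
MAX_LOSS_FRACTION = 0.05

# Positive delta = more species than in ac19, negative = fewer.
deltas = {name: imported - ac19 for name, (ac19, imported) in counts.items()}

flagged = [
    name
    for name, (ac19, imported) in counts.items()
    if (ac19 - imported) / ac19 > MAX_LOSS_FRACTION
]

for name, delta in deltas.items():
    print(f"{name}: {delta:+d}")
print("Flagged:", flagged)
```

With these numbers only Orthoptera exceeds the assumed threshold, which agrees with the 1,672-species drop noted in its row.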
Batch 3
Full name | #spp ac19 | #spp Import | Exported from https://sandcastle.taxonworks.org/ | Conclusion | Report |
---|---|---|---|---|---|
Embioptera Species File | 415 | 419 | 2021-04-14 | Ready to be imported in CoL. | |
Grylloblattodea Species File | 575 | 571 | 2021-04-14 | There is an empty suborder Blattogryllopterida; all other taxa are under suborder NotAssigned. The empty suborder Blattogryllopterida matches the TW view. The CoL interpretation needs to be confirmed by the SF author. | https://github.com/CatalogueOfLife/testing/issues/91 |
Lygaeoidea Species File | 4385 | 4715 | 2021-04-14 | (1) There are 8 subspecies in the Tree root, outside superfamily Lygaeoidea; (2) There are 4 empty genera in the classification | https://github.com/CatalogueOfLife/testing/issues/93 |
Mantodea Species File | 2516 | 2471 | 2021-04-14 | (1) There are 10 subspecies in the Tree root, outside order Mantodea; (2) There are a few "empty" genera in the classification. | https://github.com/CatalogueOfLife/testing/issues/92 |
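"Empty" higher taxa, as reported in several rows above, can be found by marking every ancestor of a species or subspecies as non-empty and listing what remains. This is a sketch under assumptions: the record fields (`id`, `parent_id`, `rank`, `name`) and the sample names are hypothetical, and "empty" is taken to mean "no species-level descendants in the export".

```python
SPECIES_RANKS = {"species", "subspecies"}


def find_empty_taxa(taxa: list[dict]) -> list[dict]:
    """Return higher taxa that have no species or subspecies beneath them."""
    by_id = {t["id"]: t for t in taxa}
    non_empty: set[str] = set()
    for t in taxa:
        if t["rank"] not in SPECIES_RANKS:
            continue
        # Walk up the parent chain; stop at the root, at a missing parent,
        # or at an ancestor that was already marked.
        parent = t.get("parent_id")
        while parent in by_id and parent not in non_empty:
            non_empty.add(parent)
            parent = by_id[parent].get("parent_id")
    return [
        t for t in taxa
        if t["rank"] not in SPECIES_RANKS and t["id"] not in non_empty
    ]


# Hypothetical sample: "Genus C" has no species under it.
taxa = [
    {"id": "1", "parent_id": None, "rank": "order", "name": "Mantodea"},
    {"id": "2", "parent_id": "1", "rank": "genus", "name": "Genus A"},
    {"id": "3", "parent_id": "2", "rank": "species", "name": "Genus A sp. b"},
    {"id": "4", "parent_id": "1", "rank": "genus", "name": "Genus C"},
]
empty_names = [t["name"] for t in find_empty_taxa(taxa)]
print(empty_names)
```

Whether an empty taxon is an export problem or a faithful copy of the TW view (as with the Grylloblattodea suborder) still has to be decided per case.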
There is no Isoptera Species File project in my sandcastle dashboard.
CoL used Excel data from Erick South of Jan 2018 in ac19.
@yroskov it seems that one is already in production, so you'll find it at sfg.taxonworks.org
Batch 4
Full name | #spp ac19 | #spp Import | Exported from sfg.taxonworks.org | Conclusion | Report |
---|---|---|---|---|---|
Isoptera Species File | 3063 | 3063 | 2021-04-15 | As far as I can see, there are only two "empty" taxa in the Tree. | https://github.com/CatalogueOfLife/testing/issues/94 |
A new data version of 2021-04-30 was exported from the Sandcastle with the new script on 2021-05-06.
Psocodea Species File
With the box "On taxon name with multiple OTUs conflict narrow selection to unlabelled ones" ticked: 11220 spp imported.
With the box not ticked: 11175 spp imported.
2021-06-23. 8 SFs imported to DEV from TW Sandcastle
Selected Species Files are ready for @hhopkins77:
GSD name | URL | Re-imported by | Date |
---|---|---|---|
SF Cockroach | https://data.dev.catalogueoflife.org/dataset/1051/classification | @gdower | 2021-06-18 |
SF Dermaptera | https://data.dev.catalogueoflife.org/dataset/1158/classification | @yroskov | 2021-06-23 |
SF Embioptera | https://data.dev.catalogueoflife.org/dataset/1089/classification | @yroskov | 2021-06-23 |
SF Grylloblattodea | https://data.dev.catalogueoflife.org/dataset/1170/classification | @yroskov | 2021-06-23 |
SF Mantophasmatodea | https://data.dev.catalogueoflife.org/dataset/1168/classification | @yroskov | 2021-06-23 |
SF Plecoptera | https://data.dev.catalogueoflife.org/dataset/1065/classification | @yroskov | 2021-06-23 |
SF Psocodea | https://data.dev.catalogueoflife.org/dataset/1133/classification | @yroskov | 2021-06-23 |
SF Zoraptera | https://data.dev.catalogueoflife.org/dataset/1167/classification | @yroskov | 2021-06-23 |
Thank you!
Heidi Hopkins, PhD
Green light from @hhopkins77 for 8 checklists (2021-07-09):
Hi Yury, I looked through the checklistbank files below and the main thing I notice is that species groups and species subgroups seem to be creating issues. The other categories of issues ("Escaped Characters", "Duplicate Name", "Parsed Name Differs", "Partially Parsable Name", "Indetermined", "Uppercase Epithet", "Inconsistent Name", "Unusual Name Characters", "Unmatched Reference Brackets", "Nomenclatural Status Invalid", "Published Before Genus") I presume have been created in the process of converting SFG to TW. So for this round I would consider these ready to upload to COL. Please let me know if you need anything further from me. Best, Heidi
@yroskov to @gdower: we need to get metadata in YAML, correct?
For attention of @mjy, @debpaul & @gdower
I have assessed 8 SF checklists exported from TW and imported into clearinghouse on DEV server.
4 of 8 checklists are in good shape and ready to be imported in CoL.
The other 4 checklists require further fixes in the exporter script. The report is below.