Open mdoering opened 6 years ago
While investigating gbif/checklistbank#45 we found that there are 693,344 unique name records in CLB that all have the canonical name "spec.". An initial scan of the names suggests the cause is both a name parsing issue and the way the canonical name is built.
Here are a few:
prod_checklistbank=> SELECT * FROM name n WHERE lower(n.canonical_name) = lower('spec.') limit 50;
   id    |                  scientific_name                  | canonical_name |   type   | genus_or_above | infra_generic | specific_epithet | infra_specific_epithet | cultivar_epithet | notho_type | authors_parsed | autho
---------+---------------------------------------------------+----------------+----------+----------------+---------------+------------------+------------------------+------------------+------------+----------------+------
 4912135 | Gyrocarpus sp. Chase 317                          | spec.          | INFORMAL | Gyrocarpus     | Chase         |                  |                        |                  |            | f              |
 4912136 | Gyrocarpus sp. DES-2011                           | spec.          | INFORMAL | Gyrocarpus     | Des-          |                  |                        |                  |            | t              |
 4912338 | Gyrodactylus sp. AGV-2009a                        | spec.          | INFORMAL | Gyrodactylus   | Agv-          |                  |                        |                  |            | t              |
 4912339 | Gyrodactylus sp. AGV-2009b                        | spec.          | INFORMAL | Gyrodactylus   | Agv-          |                  |                        |                  |            | t              |
 4912343 | Gyrodactylus sp. Chile                            | spec.          | INFORMAL | Gyrodactylus   | Chile         |                  |                        |                  |            | t              |
 4912363 | Gyrodactylus sp. HSS-2009                         | spec.          | INFORMAL | Gyrodactylus   | Hss-          |                  |                        |                  |            | t              |
 4912385 | Gyrodactylus sp. Ladoga                           | spec.          | INFORMAL | Gyrodactylus   | Ladoga        |                  |                        |                  |            | t              |
 4912386 | Gyrodactylus sp. Ladoga x Gyrodactylus pannonicus | spec.          | HYBRID   |                |               |                  |                        |                  |            | f              |
 4912388 | Gyrodactylus sp. MBS-2014                         | spec.          | INFORMAL | Gyrodactylus   | Mbs-          |                  |                        |                  |            | t              |
 4912390 | Gyrodactylus sp. MPV-2015                         | spec.          | INFORMAL | Gyrodactylus   | Mpv-          |                  |                        |                  |            | t              |
 4912394 | Gyrodactylus sp. NKA-2015                         | spec.          | INFORMAL | Gyrodactylus   | Nka-          |                  |                        |                  |            | t              |
 4912396 | Gyrodactylus sp. North Sea                        | spec.          | INFORMAL | Gyrodactylus   | North         |                  |                        |                  |            | t              | Sea
 4912398 | Gyrodactylus sp. Norway-HH-2003                   | spec.          | INFORMAL | Gyrodactylus   | Norway-       |                  |                        |                  |            | f              |
 4912407 | Gyrodactylus sp. Poland-MZ-2003                   | spec.          | INFORMAL | Gyrodactylus   | Poland-       |                  |                        |                  |            | f              |
 4912413 | Gyrodactylus sp. Zimbabwe                         | spec.          | INFORMAL | Gyrodactylus   | Zimbabwe      |                  |                        |                  |            | t              |
 4912428 | Gyrodactylus spec. Nordmann, 1832                 | spec.          | INFORMAL | Gyrodactylus   | Nordmann      |                  |                        |                  |            | t              |
 4912565 | Gyrocotyle sp. Tasmania                           | spec.          | INFORMAL | Gyrocotyle     | Tasmania      |                  |                        |                  |            | t              |
 4913863 | Gyrodinium sp. GeoB 231                           | spec.          | INFORMAL | Gyrodinium     | Geo           |                  |                        |                  |            | f              |
 4913996 | Gyromitra sp. Gyr3                                | spec.          | INFORMAL | Gyromitra      | Gyr           |                  |                        |                  |            | f              |
 4914171 | Gyrodactylus pannonicus X Gyrodactylus sp. Ladoga | spec.          | HYBRID   |                |               |                  |                        |                  |            | f              |
 4914296 | Gyrodactylus pomeraniae x Gyrodactylus lavareti   | spec.          | HYBRID   |                |               |                  |                        |                  |            | f              |
 4914360 | Gyroneuron sp. BOLD:AAI1989                       | spec.          | INFORMAL | Gyroneuron     | Bold          |                  |                        |                  |            | t              | Aai
 4914380 | Gyroneuronella sp. AZR-2008                       | spec.          | INFORMAL | Gyroneuronella | Azr-          |                  |                        |                  |            | t              |
 4914558 | Gyrodontium sp. BAB-5180                          | spec.          | INFORMAL | Gyrodontium    | Bab-          |                  |                        |                  |            | f              |
 4915887 | Gyrostemon sp. Cranfield 02068672                 | spec.          | INFORMAL | Gyrostemon     | Cranfield     |                  |                        |                  |            | f              |
 4916067 | Gyroporus sp. AWW-2009a                           | spec.          | INFORMAL | Gyroporus      | Aww-          |                  |                        |                  |            | t              |
 4916070 | Gyroporus sp. Arora 00-429                        | spec.          | INFORMAL | Gyroporus      | Arora         |                  |                        |                  |            | f              |
 4916072 | Gyroporus sp. Arora00-429                         | spec.          | INFORMAL | Gyroporus      | Arora         |                  |                        |                  |            | f              |
 4916945 | Gyrovirus 4                                       | spec.          | VIRUS    |                |               |                  |                        |                  |            | f              |
 4916951 | Gyrovirus GyV3                                    | spec.          | VIRUS    |                |               |                  |                        |                  |            | f              |
 4916954 | Gyrovirus GyV7-SF                                 | spec.          | VIRUS    |                |               |                  |                        |                  |            | f              |
 4916956 | Gyrovirus GyV8                                    | spec.          | VIRUS    |                |               |                  |                        |                  |            | f              |
 4916959 | Gyrovirus GyV9                                    | spec.          | VIRUS    |                |               |                  |                        |                  |            | f              |
 4916962 | Gyrovirus Tu243                                   | spec.          | VIRUS    |                |               |                  |                        |                  |            | f              |
 4916965 | Gyrovirus Tu789                                   | spec.          | VIRUS    |                |               |                  |                        |                  |            | f              |
 4916977 | Gyrovirus: Chicken anemia virus ICTV              | spec.          | VIRUS    |                |               |                  |                        |                  |            | f              |
 4916980 | Gyrovirus: chicken anemia virus Ictv              | spec.          | VIRUS    |                |               |                  |                        |                  |            | f              |
 4917282 | Gyrophyllum sp. NTM-C014392                       | spec.          | INFORMAL | Gyrophyllum    | Ntm-          |                  |                        |                  |            | f              |
 4917368 | Gyrtona sp. BOLD:AAI6410                          | spec.          | INFORMAL | Gyrtona        | Bold          |                  |                        |                  |            | f              |
 4917371 | Gyrtona sp. Gyrt                                  | spec.          | INFORMAL | Gyrtona        | Gyrt          |                  |                        |                  |            | t              |
 4917446 | Gyrtothripa sp. BOLD:AAH4619                      | spec.          | INFORMAL | Gyrtothripa    | Bold          |                  |                        |                  |            | f              |
 4917650 | H-1 parvovirus                                    | spec.          | VIRUS    |                |               |                  |                        |                  |            | f              |
 4917660 | H-Pelican lacZ transformation vector              | spec.          | VIRUS    |                |               |                  |                        |                  |            | f              |
 4917662 | H-Stinger GFP transformation vector               | spec.          | VIRUS    |                |               |                  |                        |                  |            | f              |
 4918192 | HCBI8.215 virus                                   | spec.          | VIRUS    |                |               |                  |                        |                  |            | f              |
The full list of all spec. names is attached.
Many of those names are from NCBI. "Genus sp. XYZ" is a very common structure that we should detect and mark.
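To make the "Genus sp. XYZ" structure concrete, here is a minimal Python sketch of how such names could be detected and split. It is not the name parser's actual code: the regular expression, the function name `split_indet_name`, and the returned keys are all hypothetical and only illustrate the idea against the sample rows above. It assumes a capitalised one-word genus followed by "sp." or "spec." and an optional free-text remainder, and it skips hybrid formulas.

```python
import re
from typing import Optional

# Hypothetical pattern (an assumption, not the parser's real rule):
# a capitalised genus, the marker "sp." or "spec.", then an optional
# free-text remainder such as a strain code, locality or collector number.
INDET_SPECIES = re.compile(
    r"^(?P<genus>[A-Z][a-z]+)\s+(?P<marker>sp|spec)\.(?:\s+(?P<remainder>.+))?$"
)

# A hybrid formula has a standalone x, X or × between two names.
HYBRID_SIGN = re.compile(r"\s[xX×]\s")


def split_indet_name(scientific_name: str) -> Optional[dict]:
    """Split a 'Genus sp. XYZ' name into its parts, or return None."""
    name = scientific_name.strip()
    if HYBRID_SIGN.search(name):
        # Hybrid formulas need their own handling; do not treat them as indet.
        return None
    match = INDET_SPECIES.match(name)
    if match is None:
        return None
    return {
        "genus": match.group("genus"),
        "marker": match.group("marker") + ".",
        "remainder": match.group("remainder"),  # None when nothing follows
    }


if __name__ == "__main__":
    for example in ("Gyrocarpus sp. Chase 317", "Gyrovirus GyV3"):
        print(example, "->", split_indet_name(example))
```

With a split like this, the genus and the remainder stay available separately, so a canonical name would not have to collapse to the bare marker "spec.". Virus names and hybrid formulas in the sample return `None` here and would need separate treatment.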