viralemergence / virion

The Global Virome in One Network
https://viralemergence.github.io/virion

One instance of HostFamily is in single quotes #38

Open maxfarrell opened 3 years ago

maxfarrell commented 3 years ago

'percalatidae' is the only family name with quotes around it

library(dplyr)
library(vroom)

# Read the compressed VIRION table
virion <- vroom("Virion/Virion.csv.gz")
head(sort(unique(virion$HostFamily))) # 'percalatidae'
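
The sorted list only surfaces the quoted value because an apostrophe happens to sort ahead of letters. A more direct check is sketched below in base R. The `host_family` vector is a made-up stand-in for `virion$HostFamily` (illustrative values, not real VIRION rows), so treat this as a sketch rather than a tested fix for the dataset.

```r
# Toy stand-in for virion$HostFamily; values are illustrative, not real VIRION rows.
host_family <- c("percichthyidae", "'percalatidae'", "centrarchidae", NA)

# Flag values that contain a single or double quote character anywhere.
has_quote <- !is.na(host_family) & grepl("['\"]", host_family)
quoted <- unique(host_family[has_quote])
print(quoted)

# Strip leading and trailing quote characters to get a cleaned copy.
cleaned <- gsub("^['\"]+|['\"]+$", "", host_family)
print(cleaned)
```

Run against the real column, `quoted` would list every affected value rather than relying on sort order.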

cjcarlson commented 3 years ago

This is an NCBI taxonomy bug and therefore assigned to the Ryanverse:

> classification("percalates novemaculeata", db="ncbi")
==  1 queries  ===============

Retrieving data for taxon 'percalates novemaculeata'

√  Found:  percalates+novemaculeata
==  Results  =================

* Total: 1 
* Found: 1 
* Not Found: 0
$`percalates novemaculeata`
                       name         rank      id
1        cellular organisms      no rank  131567
2                 Eukaryota superkingdom    2759
3              Opisthokonta        clade   33154
4                   Metazoa      kingdom   33208
5                 Eumetazoa        clade    6072
6                 Bilateria        clade   33213
7             Deuterostomia        clade   33511
8                  Chordata       phylum    7711
9                  Craniata    subphylum   89593
10               Vertebrata        clade    7742
11            Gnathostomata        clade    7776
12               Teleostomi        clade  117570
13             Euteleostomi        clade  117571
14           Actinopterygii   superclass    7898
15              Actinopteri        class  186623
16              Neopterygii     subclass   41665
17                Teleostei   infraclass   32443
18      Osteoglossocephalai        clade 1489341
19            Clupeocephala      no rank  186625
20        Euteleosteomorpha       cohort 1489388
21             Neoteleostei        clade  123365
22             Eurypterygia        clade  123366
23            Ctenosquamata        clade  123367
24          Acanthomorphata        clade  123368
25       Euacanthomorphacea        clade  123369
26          Percomorphaceae        clade 1489872
27               Eupercaria        clade 1489922
28         Centrarchiformes        order 1489940
29            Percalatoidei  superfamily 1545904
30           'Percalatidae'       family 1545905
31               Percalates        genus 1545907
32 Percalates novemaculeata      species   45783

attr(,"class")
[1] "classification"
attr(,"db")
[1] "ncbi"

cjcarlson commented 3 years ago

Ryan, we've got a few small things for the taxonomy team, but most of them probably need an actual conversation. This odd one is small and self-explanatory enough, though, if you want to pass it along to them.

cjcarlson commented 3 years ago

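So this is a bit odd: the same query now returns taxon ID 1545905 as Percalates-clade with rank "clade", instead of 'Percalatidae' with rank "family":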

> classification("percalates novemaculeata", db="ncbi")
==  1 queries  ===============

Retrieving data for taxon 'percalates novemaculeata'

√  Found:  percalates+novemaculeata
==  Results  =================

* Total: 1 
* Found: 1 
* Not Found: 0
$`percalates novemaculeata`
                       name         rank      id
1        cellular organisms      no rank  131567
2                 Eukaryota superkingdom    2759
3              Opisthokonta        clade   33154
4                   Metazoa      kingdom   33208
5                 Eumetazoa        clade    6072
6                 Bilateria        clade   33213
7             Deuterostomia        clade   33511
8                  Chordata       phylum    7711
9                  Craniata    subphylum   89593
10               Vertebrata        clade    7742
11            Gnathostomata        clade    7776
12               Teleostomi        clade  117570
13             Euteleostomi        clade  117571
14           Actinopterygii   superclass    7898
15              Actinopteri        class  186623
16              Neopterygii     subclass   41665
17                Teleostei   infraclass   32443
18      Osteoglossocephalai        clade 1489341
19            Clupeocephala      no rank  186625
20        Euteleosteomorpha       cohort 1489388
21             Neoteleostei        clade  123365
22             Eurypterygia        clade  123366
23            Ctenosquamata        clade  123367
24          Acanthomorphata        clade  123368
25       Euacanthomorphacea        clade  123369
26          Percomorphaceae        clade 1489872
27               Eupercaria        clade 1489922
28         Centrarchiformes        order 1489940
29            Percalatoidei  superfamily 1545904
30         Percalates-clade        clade 1545905
31               Percalates        genus 1545907
32 Percalates novemaculeata      species   45783

attr(,"class")
[1] "classification"
attr(,"db")
[1] "ncbi"
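
One consequence of the change is that the lineage no longer has any row ranked "family". The sketch below uses the last four rows of each output above and a hypothetical helper, `name_at_rank()`, to show what a rank-based lookup would return before and after. It assumes the family name is picked by matching on the `rank` column, which may not be how VIRION actually fills in HostFamily.

```r
# Last four rows of the earlier output, copied from the thread.
before <- data.frame(
  name = c("Percalatoidei", "'Percalatidae'", "Percalates", "Percalates novemaculeata"),
  rank = c("superfamily", "family", "genus", "species"),
  id   = c("1545904", "1545905", "1545907", "45783"),
  stringsAsFactors = FALSE
)

# Later output: same IDs, but row 2 has a new name and rank.
after <- before
after$name[2] <- "Percalates-clade"
after$rank[2] <- "clade"

# Hypothetical helper: name at a given rank, or NA if that rank is absent.
name_at_rank <- function(lineage, rank) {
  hit <- lineage$name[lineage$rank == rank]
  if (length(hit) == 0) NA_character_ else hit[1]
}

print(name_at_rank(before, "family"))
print(name_at_rank(after, "family"))
```

Under that assumption, the quoted name would disappear from the family column, but the species would be left with a missing family instead.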