viralemergence / virion

The Global Virome in One Network
https://viralemergence.github.io/virion

Issues with some is.na(Host) rows #35

Closed rorygibb closed 3 years ago

rorygibb commented 3 years ago

virion[ is.na(virion$Host), ]

Looking at the subset above, some of the taxonomic resolution for the PREDICT rows looks off: quite a few rows have a genus in the HostOriginal column but no host information resolved to that level (e.g. Akodon sp. on row 3).

Additionally, quite a few of the PREDICT viruses have VirusNCBIResolved == TRUE (presumably because they're resolved to virus family/genus level) but no virus TaxIDs for that level of resolution. This may be correct, but I'm flagging it in case it's a bug.
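
For anyone wanting to reproduce the two checks described above, here is a minimal sketch on a toy table. The columns `Host`, `HostOriginal` and `VirusNCBIResolved` come from this thread; `HostGenus` and `VirusTaxID` are assumed column names, and every value except "Akodon sp." is made up for illustration.

```r
# Toy stand-in for the virion table. HostGenus and VirusTaxID are ASSUMED
# column names, and the values are invented for illustration only.
virion <- data.frame(
  Host              = c(NA, NA, "Mus musculus"),
  HostOriginal      = c("Akodon sp.", "Unknown bat", "Mus musculus"),
  HostGenus         = c(NA, NA, "Mus"),
  VirusNCBIResolved = c(TRUE, FALSE, TRUE),
  VirusTaxID        = c(NA, NA, "12345"),
  stringsAsFactors  = FALSE
)

# The subset from the top of this issue: rows with no resolved host
missing_host <- virion[is.na(virion$Host), ]

# Check 1: original host string looks like "<Genus> sp." but no genus was resolved
looks_genus_level <- grepl(" sp\\.$", missing_host$HostOriginal)
unresolved_genus  <- missing_host[looks_genus_level & is.na(missing_host$HostGenus), ]

# Check 2: virus flagged as NCBI-resolved but carrying no taxonomy ID
# (%in% TRUE treats an NA flag as FALSE instead of propagating it)
resolved_no_taxid <- virion[virion$VirusNCBIResolved %in% TRUE &
                              is.na(virion$VirusTaxID), ]

print(unresolved_genus$HostOriginal)
print(nrow(resolved_no_taxid))
```

On the real table, the same two filters would be applied directly to `virion` after adjusting the assumed column names to whatever the release actually uses.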

cjcarlson commented 3 years ago

Great catch! I'll dig into this

cjcarlson commented 3 years ago

Found the problem! It's because of the "sp." in those host names. (The virus thing is, I think, correct behavior, but let me dig into that.)
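
The actual patch isn't shown in this thread, so the following is only a sketch of one way a "sp." suffix could be handled: pull the genus out of strings shaped like "<Genus> sp." so it can still be resolved at genus level. `genus_from_sp` is a hypothetical helper, not a function from the virion codebase.

```r
# Hypothetical helper (NOT the actual virion fix): return the genus for host
# strings of the form "<Genus> sp." and NA for everything else.
genus_from_sp <- function(host_original) {
  # TRUE only for a single word followed by " sp" and an optional period
  is_sp <- grepl("^[A-Za-z]+ sp\\.?$", host_original)
  # Strip the suffix where it matched; leave other entries as NA
  ifelse(is_sp, sub(" sp\\.?$", "", host_original), NA_character_)
}

genus <- genus_from_sp(c("Akodon sp.", "Mus musculus", NA))
print(genus)
```

Returning NA for non-matching strings keeps the helper from overwriting hosts that were already resolved to species level.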

cjcarlson commented 3 years ago

Alright, figured out the PREDICT thing too - was missing a line I very clearly meant to write. Patch incoming.