rorygibb closed this issue 3 years ago
Great catch! I'll dig into this
Found the problem! It's because of a "sp." (The virus thing is, I think, correct behavior - but let me dig into that)
Alright, figured out the PREDICT thing too - was missing a line I very clearly meant to write. Patch incoming.
virion[ is.na(virion$Host), ]
Looking at the subset above, some of the taxonomic resolution for the PREDICT rows looks a bit odd: quite a few of them are given a genus in the HostOriginal column but have no host information resolved to that level (e.g. Akodon sp. on row 3).
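In case it helps with debugging, here is a rough sketch of how those rows could be pulled out. It runs on a small made-up table rather than the real `virion` data, and the `HostGenus` column name is an assumption on my part (only `Host` and `HostOriginal` are taken from the table above).

```r
# Toy stand-in for the real table; the rows are invented for illustration
# and HostGenus is an assumed column name.
host_toy <- data.frame(
  HostOriginal = c("Akodon sp.", "Mus musculus", "Rousettus sp."),
  Host         = c(NA, "mus musculus", NA),
  HostGenus    = c(NA, "mus", "rousettus"),
  stringsAsFactors = FALSE
)

# Treat the first word of HostOriginal as the genus that was reported
reported_genus <- sub(" .*$", "", host_toy$HostOriginal)

# Rows that report a genus but have nothing resolved at genus level
genus_lost <- host_toy[nzchar(reported_genus) & is.na(host_toy$HostGenus), ]
genus_lost
```

On the real data the same filter would be applied to `virion` directly, probably restricted to the PREDICT rows first.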
Additionally, quite a few of the viruses from PREDICT appear to have VirusNCBIResolved == TRUE (presumably because they're resolved to virus family/genus level) but no Virus TaxIDs for that level of resolution. This may be correct, but I'm flagging it in case it's a bug.
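A quick way to count these, sketched on a made-up table. The `VirusTaxID` column name is my assumption for where the TaxIDs live, and the values are placeholders, not real NCBI identifiers.

```r
# Toy stand-in for the real table; values are placeholders and
# VirusTaxID is an assumed column name.
virus_toy <- data.frame(
  Virus             = c(NA, "virus a", NA),
  VirusNCBIResolved = c(TRUE, TRUE, FALSE),
  VirusTaxID        = c(NA, 1, NA),
  stringsAsFactors  = FALSE
)

# which() drops any NA in the logical index, so rows with a missing
# VirusNCBIResolved flag are not selected by accident
flagged <- virus_toy[which(virus_toy$VirusNCBIResolved & is.na(virus_toy$VirusTaxID)), ]
nrow(flagged)
```

If the count on the real data is large and concentrated in PREDICT, that would point to a bug rather than intended behavior.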