I would love to get Blast2lca working as it claims to do exactly what I need, and seems fast and nicely put together. However, the results I'm getting from it contain mostly unknown values, or are from the wrong kingdom! Perhaps I'm doing something horribly wrong...
I'm using BLAST+ 2.2.31 and running nucleotide queries with blastn against the NCBI Virus RefSeq nt database. I believe that all of the GIs used in this DB are contained within the master taxonomy; do tell me if I am mistaken.
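One way to test that belief directly is sketched below. It rests on assumptions that are not confirmed anywhere above: that the BLAST hits are in tab-separated tabular form with subject IDs like `gi|12345|ref|...|` in the second column, and that the taxonomy mapping is a two-column GI/taxid text file such as NCBI's `gi_taxid_nucl.dmp`. The function names are made up for illustration, and this is only a rough check, not how Blast2lca itself resolves GIs.

```python
import re

# Assumed subject ID layout: gi|<number>|ref|<accession>|
GI_PATTERN = re.compile(r"gi\|(\d+)\|")


def gis_from_blast_tabular(lines):
    """Collect GI numbers from the subject column of tabular BLAST output."""
    gis = set()
    for line in lines:
        # Skip blank lines and the comment lines some tabular formats emit.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 2:
            continue
        match = GI_PATTERN.search(fields[1])
        if match:
            gis.add(int(match.group(1)))
    return gis


def find_unmapped_gis(gis, mapping_lines):
    """Return the GIs that never appear in a GI-to-taxid mapping.

    Each mapping line is assumed to be "<gi><whitespace><taxid>".
    The mapping is streamed, so a very large file is never held in memory.
    """
    remaining = set(gis)
    for line in mapping_lines:
        parts = line.split()
        if len(parts) < 2:
            continue
        remaining.discard(int(parts[0]))
        if not remaining:
            break  # every GI has been found, no need to read further
    return remaining
```

With hypothetical file names, this would be used by opening the BLAST output and the mapping file, passing the first to `gis_from_blast_tabular` and the result plus the second to `find_unmapped_gis`. Any GI that comes back has no taxid in the mapping, which would be one possible explanation for an unknown LCA.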
Command:
Example BLAST+ 2.2.31 output for one particular sequence (which gives an unknown LCA). I'm including this because I know the NCBI sometimes changes the format by accident.
More interestingly still, a small fraction of the results that are not unknown are completely wrong. Blast2lca output:
...Yet the BLAST results come from a search against a viral subset of RefSeq, and the GI of the sole BLAST hit points to a hepatitis virus!
Any wisdom regarding what might possibly be going on here would be gratefully received!
Thank you.