ablab / plasmidVerify

plasmidVerify: plasmid contig verification tool

Missing predictions when --db is set #1

Closed: wwood closed this issue 4 years ago

wwood commented 4 years ago

Hi,

I was testing out plasmidVerify with this input file, which is one randomly generated sequence and a real plasmid I got from NCBI. https://gist.github.com/wwood/5a0f84eeb4625e7c0c3e7730bb4da2ef

Running plasmidVerify without the --db flag worked great and gave the right answer:

./plasmidverify.py -f plasmid_and_random.fna -o plasmid_and_random.fna.output_nodb --hmm /srv/db/pfam/32/Pfam-A.hmm -t 24

Output (plasmid_and_random.fna.output_nodb/plasmid_and_random_result_table.csv):

random_sequence_length_50000_1,Chromosome,--
MK191844.1,Plasmid,287.93,DUF4942 MTS DDE_Tnp_1_5 ParE_toxin RelB LysR_substrate (snip)

But when I ran plasmidVerify with the --db flag, I got a different (and wrong) result, with the row for the real plasmid missing:

$ ./plasmidverify.py -f plasmid_and_random.fna -o plasmid_and_random.fna.output --hmm /srv/db/pfam/32/Pfam-A.hmm -t 24 --db /srv/db/ncbi/20200116/nt

2020-04-27 15:10:13
Gene prediction...
2020-04-27 15:10:18
HMM domains prediction...
2020-04-27 15:11:55
Parsing...
Classification...
2020-04-27 15:11:55
Running BLAST...
2020-04-27 15:14:41
Parsing BLAST
2020-04-27 15:14:42
Done!

Output:

random_sequence_length_50000_1,Chromosome,--

Perhaps an indexing error of some description? Thanks, ben
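
For anyone who wants to check their own runs for the same symptom, here is a minimal sketch that compares the sequence IDs in the input FASTA against the first column of the result table and lists any contigs that have no row. It is not part of plasmidVerify; the function names are made up for illustration, and it assumes the table has no header row and that the first column is the first word of the FASTA header, as in the output shown above.

```python
import csv


def fasta_ids(fasta_path):
    """Return the sequence IDs (first word of each '>' header) in file order."""
    ids = []
    with open(fasta_path) as handle:
        for line in handle:
            if line.startswith(">"):
                header = line[1:].strip()
                if header:
                    ids.append(header.split()[0])
    return ids


def table_ids(csv_path):
    """Return the first column of every non-empty row in the result table.

    Assumption: the table has no header row, as in the output in this issue.
    """
    with open(csv_path, newline="") as handle:
        return [row[0] for row in csv.reader(handle) if row]


def missing_contigs(fasta_path, csv_path):
    """List input contigs (in FASTA order) that have no row in the result table."""
    reported = set(table_ids(csv_path))
    return [cid for cid in fasta_ids(fasta_path) if cid not in reported]
```

Calling `missing_contigs("plasmid_and_random.fna", "plasmid_and_random.fna.output/plasmid_and_random_result_table.csv")` on the --db run above should report MK191844.1 as missing, while the run without --db should give an empty list.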

mikeraiko commented 4 years ago

Thanks for letting me know! An error slipped through that was triggered by a sequence with zero HMM hits. It is fixed now. Also, plasmidVerify is becoming obsolete, because it produces some false positives if there are viruses in your metagenome. You may want to try viralVerify with the plasmid option (-p) instead.

wwood commented 4 years ago

Thanks, I'll have a look at viralVerify.