jiarong / VirSorter2

customizable pipeline to identify viral sequences from (meta)genomic data
GNU General Public License v2.0

Viral contigs wrong classification? #52

Open ntromas opened 3 years ago

ntromas commented 3 years ago

Hi,

I ran VirSorter v2.1 on contigs from environmental metagenomes and selected potential viral contigs with a score > 0.8. I then verified completeness with CheckV. Several of the potential viral contigs were classified as "not viral" because they had no viral genes. I decided to select only the ones that have a %viral value (VirSorter2 output). I then ran kraken2 on them, and they were mostly assigned to bacteria (cyanobacteria): 96% of contigs were successfully classified, but only 10% were assigned to viruses. Did I do something wrong, or is there a way to make the classification much more conservative?

Thanks for your help,

Nico

jiarong commented 3 years ago

Your sample is bulk metaG, not virome, right? A more stringent score cutoff, 0.9 or even 0.95, is recommended.
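
For anyone wanting to apply a stricter cutoff to an existing run without rerunning it, here is a minimal sketch of filtering the score table after the fact. It assumes the table is tab-separated with a `seqname` column and a `max_score` column (as in VirSorter2's `final-viral-score.tsv`; check the header of your own file). The function name and the example rows are made up for illustration and are not real results.

```python
import csv
import io

# Made-up example rows, for illustration only (not real VirSorter2 output).
EXAMPLE_SCORES = (
    "seqname\tmax_score\tmax_score_group\tlength\thallmark\tviral\tcellular\n"
    "contig_1||full\t0.98\tdsDNAphage\t35000\t4\t60.0\t2.0\n"
    "contig_2||full\t0.85\tdsDNAphage\t8000\t0\t10.0\t30.0\n"
    "contig_3||0_partial\t0.95\tssDNA\t6000\t1\t40.0\t5.0\n"
)


def filter_by_score(tsv_text, cutoff=0.95, score_col="max_score", name_col="seqname"):
    """Return the names of sequences whose score is >= cutoff.

    Column names are assumptions; adjust them to match your table's header.
    Rows with a missing or non-numeric score are skipped.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    kept = []
    for row in reader:
        try:
            score = float(row[score_col])
        except (KeyError, TypeError, ValueError):
            continue
        # A "nan" score parses as float but fails this comparison, so it is dropped.
        if score >= cutoff:
            kept.append(row[name_col])
    return kept


if __name__ == "__main__":
    print(filter_by_score(EXAMPLE_SCORES, cutoff=0.95))
```

With a real file, the same function can be called on the file's contents, e.g. `filter_by_score(open("final-viral-score.tsv").read(), 0.95)`, where the path is whatever your output directory contains.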

ntromas commented 3 years ago

Ok, I'll try that, thanks!

ntromas commented 3 years ago

Hi again, so I just tried that (0.95) and analyzed the output with CheckV. I still have ~30% of contigs with "no viral genes detected".

jiarong commented 3 years ago

CheckV is very conservative in calling viral genes. Manually checking the contigs with 0 CheckV viral genes is recommended.
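
One possible way to shortlist the contigs for that manual check is sketched below. It assumes a tab-separated CheckV summary with `contig_id`, `gene_count`, `viral_genes` and `host_genes` columns (as in CheckV's `quality_summary.tsv`; verify against your own header). The function name, the ordering by host-gene count, and the example rows are illustrative assumptions, not a method prescribed in this thread.

```python
import csv
import io

# Made-up example rows, for illustration only (not real CheckV output).
EXAMPLE_CHECKV = (
    "contig_id\tgene_count\tviral_genes\thost_genes\n"
    "contig_1||full\t40\t3\t0\n"
    "contig_2||full\t4\t0\t1\n"
    "contig_3||0_partial\t9\t0\t5\n"
)


def zero_viral_gene_contigs(tsv_text):
    """List contigs with no CheckV viral genes as (contig_id, host_genes, gene_count).

    Contigs with the most host genes come first, on the assumption that
    these are the most likely to be cellular and worth inspecting first.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    flagged = []
    for row in reader:
        if int(row["viral_genes"]) == 0:
            flagged.append(
                (row["contig_id"], int(row["host_genes"]), int(row["gene_count"]))
            )
    # Sort by host-gene count, highest first.
    flagged.sort(key=lambda item: item[1], reverse=True)
    return flagged


if __name__ == "__main__":
    for contig_id, host_genes, gene_count in zero_viral_gene_contigs(EXAMPLE_CHECKV):
        print(contig_id, host_genes, gene_count)
```

The resulting list is only a starting point for inspection; whether a given contig is viral still has to be judged from its annotation.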