jiarong / VirSorter2

customizable pipeline to identify viral sequences from (meta)genomic data
GNU General Public License v2.0

How to generate viral-affi-contigs-for-dramv.tab for dramv on viral genomes from multiple tools? #171

Open krausfeldtle opened 1 year ago

krausfeldtle commented 1 year ago

Hello! I'd like to use multiple tools for viral genome detection in metagenomes in addition to vs2, and then I want to run the final set of genomes through dramv (after checkv filtering). However, running this compilation of genomes from multiple tools through vs2 again filters out many contigs detected by other tools. Therefore I cannot generate the file viral-affi-contigs-for-dramv.tab for dramv for all the genomes of interest.

Do you have any other suggestions for being able to generate the file viral-affi-contigs-for-dramv.tab from multiple viral tools outputs?

I have tried to make adjustments to the template-config.yaml file to output the same number of contigs I had input, but I have not had any luck.

Thanks in advance for your time!

jiarong commented 1 year ago

Hi, you can try something like below, basically turning off all the filters. Note that some short sequences w/ < 2 genes are still filtered.

virsorter run --seqname-suffix-off --viral-gene-enrich-off --provirus-off --prep-for-dramv -i checkv/combined.fna -w vs2-pass2 --min-length 0 --min-score 0

krausfeldtle commented 1 year ago

Thanks for your response! I did try that, and it does indeed still filter out many contigs (sometimes up to one third). Is there any way to change that particular filter, for example in the template-config file? Thanks very much!

jiarong commented 1 year ago

That minimum gene-count filter cannot be turned off. VirSorter2 is a gene-based tool, so many "features" require at least 2 genes to be calculated. Also, I would say that if a contig has fewer than 2 genes, it does not make sense to use DRAMv to annotate it. For those contigs, you can use any gene annotation tool that is not specific to viruses.
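
To see which contigs were dropped by the second VirSorter2 pass (and so need annotating some other way), one option is to compare the sequence IDs in the input FASTA with those in the FASTA that VirSorter2 writes for DRAMv. The following is only a minimal sketch, not part of VirSorter2: the helper names `fasta_ids` and `dropped_ids` are hypothetical, and it assumes both files are plain (uncompressed) FASTA. VirSorter2 may rewrite sequence headers in its DRAMv output, so the optional `normalize` argument is there to map an output ID back to the original ID; check the actual headers in your output before relying on it.

```python
def fasta_ids(path):
    """Return the set of record IDs (first token after '>') in a FASTA file."""
    ids = set()
    with open(path) as handle:
        for line in handle:
            if line.startswith(">"):
                tokens = line[1:].split()
                if tokens:
                    ids.add(tokens[0])
    return ids


def dropped_ids(input_fasta, output_fasta, normalize=lambda seq_id: seq_id):
    """Return a sorted list of IDs found in input_fasta but not in output_fasta.

    normalize is an assumption-driven hook: it should turn an ID from the
    output file back into the ID used in the input file (identity by default).
    """
    kept = {normalize(seq_id) for seq_id in fasta_ids(output_fasta)}
    return sorted(fasta_ids(input_fasta) - kept)
```

A call might look like `dropped_ids("checkv/combined.fna", "vs2-pass2/for-dramv/final-viral-combined-for-dramv.fa")`. The second path is an assumption about where `--prep-for-dramv` places its FASTA output inside the `-w` directory, so adjust it to match your run. The returned IDs are the contigs that did not pass the minimum gene-count filter or were removed for other reasons.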