jiarong / VirSorter2

customizable pipeline to identify viral sequences from (meta)genomic data
GNU General Public License v2.0

issue of output file #56

Open sjhxy91 opened 3 years ago

sjhxy91 commented 3 years ago

Hi jiarong et al., thanks for your nice code. I'm trying to use VirSorter2 to find RNA viruses in metatranscriptome data. I got three output files (final_viral_score.tsv, final_viral_combined.fa, boundary.tsv). The number of viral contigs in boundary.tsv is larger than in final_viral_score.tsv, while final_viral_score.tsv and final_viral_combined.fa contain the same viral contigs. So which file should be used to define the RNA viruses: boundary.tsv or final_viral_score.tsv? The command I used is "virsorter run --prep-for-dramv -w output.out -i input.fasta --include-groups RNA -j 4 all". Thanks for your help.

jiarong commented 3 years ago

The final_viral_score.tsv and final_combined.fa should be the final results. Some records in the boundary file are removed from the final results; I should mark it as an intermediate file. Also, a few caveats on RNA virus results: 1) a score cutoff of >=0.95 is recommended; 2) only contigs with hallmark genes are high-confidence hits, and the rest should be manually checked.
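
The two caveats above could be applied to the score table with a small script along these lines. This is only a sketch, not part of VirSorter2: the function name is made up for illustration, and the column names seqname, max_score and hallmark are assumptions about the score table header, so check them against your own file before relying on it.

```python
import csv
import io


def split_by_confidence(tsv_text, min_score=0.95,
                        name_col="seqname",       # assumed column name
                        score_col="max_score",    # assumed column name
                        hallmark_col="hallmark"):  # assumed column name
    """Split contigs passing the score cutoff into two lists.

    Returns (high_confidence, needs_manual_check): contigs with at least
    one hallmark gene, and contigs that pass the cutoff without one.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    high_confidence, needs_manual_check = [], []
    for row in reader:
        score = float(row[score_col])
        # Written this way so that a NaN score is also skipped.
        if not score >= min_score:
            continue
        if int(float(row[hallmark_col])) > 0:
            high_confidence.append(row[name_col])
        else:
            needs_manual_check.append(row[name_col])
    return high_confidence, needs_manual_check


if __name__ == "__main__":
    # Made-up rows for illustration only, not real VirSorter2 output.
    rows = [
        ["seqname", "max_score", "hallmark"],
        ["contig_1", "0.99", "2"],
        ["contig_2", "0.97", "0"],
        ["contig_3", "0.80", "1"],
    ]
    example = "\n".join("\t".join(r) for r in rows) + "\n"
    high, review = split_by_confidence(example)
    print("high confidence:", high)
    print("check manually:", review)
```

For a real run, read the score table from the output directory into a string (for example with pathlib.Path(...).read_text()) and pass it to the function. Contigs in the second list are not rejected; they are the ones the caveat says should be checked by hand.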

sjhxy91 commented 3 years ago

Got it, thanks for your reply. Have a nice day.