faylward / viralrecall

Detection of NCLDV signatures in 'omic data

How to get the score of entire bin just like the result in your essay #13

Closed Achuan-2 closed 2 years ago

Achuan-2 commented 2 years ago

I have some bins that I want to screen with viralrecall. The ideal output would be a CSV (or TSV) file with one column for the MAG and one column for its score. How can I get the score of an entire bin, like the result in your paper? [image] I also don't understand the "-c" parameter; it also returns many replicons. [image]

faylward commented 2 years ago

The -c option provides contig-level statistics (not bin). The figure shows contig-level stats too, it's just that some genomes have only 1 contig (1 replicon). If you wanted bin-level stats you would need to post-process the data to average the score for each contig in a bin.
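A minimal sketch of that post-processing, assuming you have already collected (bin, contig score) pairs from the contig-level output. The function names `bin_mean_scores` and `write_bin_table` and the example values are hypothetical; how the pairs are read from viralrecall's output files is not shown here, since it depends on the file layout.

```python
import csv
import io
from collections import defaultdict
from statistics import mean


def bin_mean_scores(rows):
    """Average contig scores per bin.

    rows: iterable of (bin_name, contig_score) pairs (hypothetical input shape).
    Returns a dict mapping bin_name -> unweighted mean of its contig scores.
    """
    scores = defaultdict(list)
    for bin_name, score in rows:
        scores[bin_name].append(float(score))
    return {name: mean(values) for name, values in scores.items()}


def write_bin_table(bin_means, handle):
    """Write a two-column TSV: one row per bin with its mean score."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    writer.writerow(["bin", "mean_score"])
    for name in sorted(bin_means):
        writer.writerow([name, f"{bin_means[name]:.3f}"])


# Made-up scores for illustration only, not real viralrecall output.
example = [("bin1", 2.0), ("bin1", -1.0), ("bin2", 4.5)]
means = bin_mean_scores(example)

buffer = io.StringIO()
write_bin_table(means, buffer)
print(buffer.getvalue(), end="")
```

This uses a plain unweighted mean, so a short contig counts as much as a long one.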

faylward commented 2 years ago

The purpose of the -c option is to identify contigs with low scores that may not be viral (and should therefore be excluded from the bin).

Achuan-2 commented 2 years ago

> The -c option provides contig-level statistics (not bin). The figure shows contig-level stats too, it's just that some genomes have only 1 contig (1 replicon). If you wanted bin-level stats you would need to post-process the data to average the score for each contig in a bin.

OK, thanks a lot. So should the cut-off for the mean score of a bin be set to >1? Or should it be a percentage (maybe one in five contigs is positive)?

faylward commented 2 years ago

I would look at the scores for each contig individually. If a contig has a very low score (<0), you may want to exclude it from your bin.
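As a rough illustration of that per-contig check, the sketch below splits one bin's contigs into those at or above a threshold and those below it. The function name `split_by_score`, the contig names, and the scores are hypothetical. The default threshold of 0 follows the "<0" suggestion above and is a starting point for manual review, not a fixed rule.

```python
def split_by_score(contig_scores, threshold=0.0):
    """Split contigs into (kept, flagged) by score.

    contig_scores: dict mapping contig name -> score (hypothetical input shape).
    Contigs scoring below `threshold` are flagged as candidates for exclusion.
    """
    kept = {c: s for c, s in contig_scores.items() if s >= threshold}
    flagged = {c: s for c, s in contig_scores.items() if s < threshold}
    return kept, flagged


# Made-up scores for illustration only.
bin1 = {"contig_1": 3.2, "contig_2": -4.1, "contig_3": 0.0}
kept, flagged = split_by_score(bin1)
print("flagged for review:", sorted(flagged))
```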