mijiandui / metav


Virus clustering #1

ChaoXianSen closed this issue 10 months ago

ChaoXianSen commented 10 months ago

Hi! How do I perform virus clustering? What tools are available?

Thanks a lot!

mijiandui commented 10 months ago

Dear ChaoXianSen, we used the supporting code from CheckV to perform the virus clustering. Thanks, Jiandui Mi

ChaoXianSen commented 10 months ago


CheckV v1.0.1: assessing the quality of metagenome-assembled viral genomes https://bitbucket.org/berkeleylab/checkv

usage: checkv [options]

programs:
    end_to_end          run full pipeline to estimate completeness, contamination, and identify closed genomes
    contamination       identify and remove host contamination on integrated proviruses
    completeness        estimate completeness for genome fragments
    complete_genomes    identify complete genomes based on terminal repeats and flanking host regions
    quality_summary     summarize results across modules
    download_database   download the latest version of CheckV's database
    update_database     update CheckV's database with your own complete genomes

optional arguments:
    -h, --help          show this help message and exit

Dear @mijiandui, could you indicate which script was used to cluster the viruses?

Thanks a lot!

mijiandui commented 10 months ago

Please see the "Supporting code" section, "Rapid genome clustering based on pairwise ANI", on the CheckV website: https://bitbucket.org/berkeleylab/checkv/src/master/

First, create a blast+ database: makeblastdb -in -dbtype nucl -out

Next, use megablast from the blast+ package to perform an all-vs-all blastn of the sequences: blastn -query -db -outfmt '6 std qlen slen' -max_target_seqs 10000 -out -num_threads 32

Next, calculate pairwise ANI by combining local alignments between sequence pairs: anicalc.py -i -o
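To illustrate what "combining local alignments between sequence pairs" can mean, here is a minimal Python sketch. It is not the code of anicalc.py: it assumes tab-separated input in the '6 std qlen slen' column order, takes ANI as the alignment-length-weighted mean identity, and takes coverage as the merged aligned span divided by sequence length. The function names `merged_length` and `pairwise_ani` are hypothetical, and the real script may filter or prune alignments differently.

```python
from collections import defaultdict


def merged_length(intervals):
    """Total number of positions covered by 1-based inclusive intervals."""
    total = 0
    cur_start, cur_end = None, None
    for start, end in sorted(intervals):
        if cur_end is None or start > cur_end:
            # Close the previous merged interval, then open a new one.
            if cur_end is not None:
                total += cur_end - cur_start + 1
            cur_start, cur_end = start, end
        else:
            # Overlapping interval: extend the current merged interval.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        total += cur_end - cur_start + 1
    return total


def pairwise_ani(blast_lines):
    """Map (query, target) -> (ani, qcov, tcov) from '6 std qlen slen' lines.

    Assumed columns: qseqid sseqid pident length mismatch gapopen
    qstart qend sstart send evalue bitscore qlen slen.
    """
    groups = defaultdict(list)
    for line in blast_lines:
        fields = line.rstrip("\n").split("\t")
        if fields[0] != fields[1]:  # skip self-hits
            groups[(fields[0], fields[1])].append(fields)

    results = {}
    for pair, hits in groups.items():
        aligned = sum(int(h[3]) for h in hits)
        # Identity weighted by the length of each local alignment.
        ani = sum(float(h[2]) * int(h[3]) for h in hits) / aligned
        qlen, slen = int(hits[0][12]), int(hits[0][13])
        # sstart can exceed send for minus-strand hits, so sort each pair.
        q_spans = [tuple(sorted((int(h[6]), int(h[7])))) for h in hits]
        t_spans = [tuple(sorted((int(h[8]), int(h[9])))) for h in hits]
        qcov = 100.0 * merged_length(q_spans) / qlen
        tcov = 100.0 * merged_length(t_spans) / slen
        results[pair] = (ani, qcov, tcov)
    return results
```

Calling `pairwise_ani(open(path))` on a BLAST table would give one entry per ordered sequence pair; `path` here stands for whatever output file the blastn step wrote.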

Finally, perform UCLUST-like clustering using the MIUViG-recommended parameters (95% ANI + 85% AF): aniclust.py --fna --ani --out --min_ani 95 --min_tcov 85 --min_qcov 0
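For readers wondering what "UCLUST-like" clustering with these thresholds does, the following Python sketch shows one common greedy, centroid-based scheme. It is an assumption about the general approach, not a copy of aniclust.py: sequences are visited from longest to shortest, each unassigned sequence becomes a centroid, and any still-unassigned sequence that passes the ANI and coverage thresholds against that centroid joins its cluster. The function name `greedy_cluster` and the input dictionaries are hypothetical.

```python
def greedy_cluster(lengths, ani_edges, min_ani=95.0, min_tcov=85.0, min_qcov=0.0):
    """Greedy centroid clustering (illustrative sketch only).

    lengths:   dict mapping sequence id -> sequence length
    ani_edges: dict mapping (query, target) -> (ani, qcov, tcov)
    Returns a dict mapping centroid id -> list of member ids (centroid first).
    """
    # Keep only the query -> target links that pass all three thresholds.
    neighbours = {seq_id: [] for seq_id in lengths}
    for (qid, tid), (ani, qcov, tcov) in ani_edges.items():
        if qid in neighbours and tid in lengths:
            if ani >= min_ani and tcov >= min_tcov and qcov >= min_qcov:
                neighbours[qid].append(tid)

    clusters = {}
    assigned = set()
    # Longest sequences are considered first, ties broken by id.
    for seq_id in sorted(lengths, key=lambda s: (-lengths[s], s)):
        if seq_id in assigned:
            continue
        clusters[seq_id] = [seq_id]
        assigned.add(seq_id)
        for member in neighbours[seq_id]:
            if member not in assigned:
                clusters[seq_id].append(member)
                assigned.add(member)
    return clusters
```

With `min_tcov` at 85 and `min_qcov` at 0, the coverage requirement in this sketch falls on the target, which is the shorter sequence whenever the centroid is the longer one. A sequence that has already joined a cluster never becomes a centroid itself, so the result depends on visiting order.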

Thanks

ChaoXianSen commented 10 months ago

Dear @mijiandui

Thanks a lot, thank you!