MetaCompass: Reference-guided Assembly of Metagenomes
https://github.com/marbl/MetaCompass/wiki

MetaCompass v2.0-beta

Last updated: January 31st, 2021

Publication

Victoria Cepeda, Bo Liu, Mathieu Almeida, Christopher M. Hill, Sergey Koren, Todd J. Treangen, Mihai Pop. bioRxiv 212506; doi: https://doi.org/10.1101/212506

Required software:

Memory and Disk Space Requirements.

INSTALLATION From Source:

Get the latest release from https://github.com/marbl/MetaCompass/releases:

wget https://github.com/marbl/MetaCompass/archive/1.xx.tar.gz
tar -xzvf 1.xx.tar.gz
cd MetaCompass-1.xx
./install.sh

INSTALLATION Using Git:

git clone https://github.com/marbl/MetaCompass.git
cd MetaCompass
./install.sh

USAGE

-- I have a set of metagenomic reads, and want to perform reference-guided assembly.

python3 go_metacompass.py -1 [read1.fq] -2 [read2.fq] -l [max read length] -o [output_folder] -m [min coverage] -t [ncpu] -y [memory GB]

-- I know the reference genomes, or I want to perform comparative assembly for a particular genome.

python3 go_metacompass.py -r [references.fasta] -1 [read1.fq] -2 [read2.fq] -o [output_folder] -m [min coverage] -t [ncpu] -y [memory GB]
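
When runs are launched from a script, the two command lines above can be assembled programmatically. The following is a minimal sketch, not part of MetaCompass: the helper name build_metacompass_command and its parameter names are hypothetical, and it uses only the flags that appear in this README (-r, -1, -2, -U, -l, -o, -m, -t, -y). It assumes go_metacompass.py is in the current directory.

```python
from typing import List, Optional


def build_metacompass_command(
    read1: str,
    read2: str,
    output_dir: str,
    *,
    threads: int,
    memory_gb: int,
    references: Optional[str] = None,      # -r: skip reference selection when given
    unpaired: Optional[str] = None,        # -U: unpaired (singleton) reads
    max_read_length: Optional[int] = None,  # -l
    min_coverage: Optional[int] = None,     # -m
    script: str = "go_metacompass.py",      # assumed location of the launcher script
) -> List[str]:
    """Return the argument list for one MetaCompass run (hypothetical helper)."""
    cmd = ["python3", script]
    if references is not None:
        cmd += ["-r", references]
    cmd += ["-1", read1, "-2", read2]
    if unpaired is not None:
        cmd += ["-U", unpaired]
    if max_read_length is not None:
        cmd += ["-l", str(max_read_length)]
    cmd += ["-o", output_dir]
    if min_coverage is not None:
        cmd += ["-m", str(min_coverage)]
    cmd += ["-t", str(threads), "-y", str(memory_gb)]
    return cmd
```

The returned list can be passed to subprocess.run(cmd, check=True) from the standard library, which avoids shell quoting problems with file names.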

OUTPUT

-- The metacompass_output folder contains the following files:

File                                      Description
metacompass.final.ctg.fa                  Assembled contigs
metacompass_mapping_stats.tsv             Mapped reads general stats
metacompass_mapping_pergenome_stats.tsv   Mapped reads stats per genome
metacompass.genomes_coverage.txt          Breadth of coverage per genome
metacompass.references.fna                References used to guide assembly
metacompass_assembly_stats.tsv            Assembly general stats
metacompass_assembly_pergenome_stats      Assembly stats per genome
metacompass_summary.tsv                   Metadata
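
A quick way to confirm that a run produced everything listed above is to check the output folder for each file name. This is a minimal sketch under stated assumptions: the function name missing_outputs is hypothetical, the file names are copied verbatim from the list above, and the folder passed in is assumed to be the one that directly contains those files.

```python
from pathlib import Path
from typing import List, Union

# File names copied from the OUTPUT list in this README.
EXPECTED_OUTPUTS = [
    "metacompass.final.ctg.fa",
    "metacompass_mapping_stats.tsv",
    "metacompass_mapping_pergenome_stats.tsv",
    "metacompass.genomes_coverage.txt",
    "metacompass.references.fna",
    "metacompass_assembly_stats.tsv",
    "metacompass_assembly_pergenome_stats",
    "metacompass_summary.tsv",
]


def missing_outputs(folder: Union[str, Path]) -> List[str]:
    """Return the expected file names that are not present in `folder`."""
    folder = Path(folder)
    return [name for name in EXPECTED_OUTPUTS if not (folder / name).is_file()]
```

An empty list means every listed file was found; any names returned point to a step that may not have completed.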

EXAMPLES

Reference-guided assembly with known reference genomes (no reference selection).

-- Input data is available in the tutorial folder:

Reference genome file:  Candidatus_Carsonella_ruddii_HT_Thao2000.fasta
Metagenomic reads:      thao2000.1.fq
                        thao2000.2.fq   

-- Run:

python3 go_metacompass.py -r tutorial/Candidatus_Carsonella_ruddii_HT_Thao2000.fasta -1 tutorial/thao2000.1.fq -2 tutorial/thao2000.2.fq -l 150 -o example1_output -t 4 -y 8
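
Once the run finishes, the assembled contigs in metacompass.final.ctg.fa can be summarized with a few lines of Python. The sketch below is independent of MetaCompass and only assumes the file is standard FASTA (header lines starting with ">"). The function names contig_lengths and summarize are hypothetical, and the N50 definition used is the common one: the length of the shortest contig in the smallest set of longest contigs covering at least half of the total assembly.

```python
from typing import Dict, Iterable, List


def contig_lengths(lines: Iterable[str]) -> List[int]:
    """Return the sequence length of each record in FASTA-formatted lines."""
    lengths: List[int] = []
    current = None  # length of the record being read, None before the first header
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if current is not None:
                lengths.append(current)
            current = 0
        elif current is not None:
            current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


def summarize(lengths: List[int]) -> Dict[str, int]:
    """Return contig count, total bases, longest contig, and N50."""
    total = sum(lengths)
    running = 0
    n50 = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            n50 = length
            break
    return {
        "contigs": len(lengths),
        "total_bp": total,
        "longest": max(lengths, default=0),
        "n50": n50,
    }
```

To use it on a real run, open the contig file and pass the file handle to contig_lengths, then pass the result to summarize. The exact path depends on where the output files were written for the chosen -o folder.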

Reference-guided assembly with reference selection.

-- Download and extract metagenomic sample:

wget http://downloads.hmpdacc.org/dacc/hhs/genome/microbiome/wgs/analysis/hmwgsqc/v2/SRS044742.tar.bz2
tar -xvf SRS044742.tar.bz2

-- The metagenomic sample contains:

SRS044742/
    SRS044742.denovo_duplicates_marked.trimmed.1.fastq
    SRS044742.denovo_duplicates_marked.trimmed.2.fastq
    SRS044742.denovo_duplicates_marked.trimmed.singleton.fastq

-- Run:

python3 go_metacompass.py -1 SRS044742/SRS044742.denovo_duplicates_marked.trimmed.1.fastq -2 SRS044742/SRS044742.denovo_duplicates_marked.trimmed.2.fastq -U SRS044742/SRS044742.denovo_duplicates_marked.trimmed.singleton.fastq -l 100 -o example2_output -t 1 -y 8
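
The -l option takes the maximum read length, which is 100 for this sample. If that value is not known for another dataset, it can be measured directly from the FASTQ file. This is a minimal sketch, not a MetaCompass feature: the function name max_read_length is hypothetical, and it assumes uncompressed FASTQ with exactly four lines per record (the sequence on the second line of each record).

```python
from typing import Iterable


def max_read_length(fastq_lines: Iterable[str]) -> int:
    """Return the length of the longest sequence in 4-line-per-record FASTQ."""
    longest = 0
    for index, line in enumerate(fastq_lines):
        # Lines 1, 5, 9, ... (0-based) hold the sequence of each record.
        if index % 4 == 1:
            longest = max(longest, len(line.rstrip("\r\n")))
    return longest
```

Passing an open file handle works because the function reads one line at a time, so large read files are not loaded into memory at once.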

Contact: vcepeda@umd.edu