ablab / quast

Genome assembly evaluation tool
http://quast.sf.net

Genome assembly evaluation tool

QUAST stands for QUality ASsessment Tool. It evaluates genome/metagenome assemblies by computing various metrics. The current QUAST toolkit includes the general QUAST tool for genome assemblies; MetaQUAST, the extension for metagenomic datasets; QUAST-LG, the extension for large genomes (e.g., mammalian); and Icarus, the interactive visualizer for these tools.

The QUAST package works both with and without reference genomes. However, it is much more informative if at least a close reference genome is provided along with the assemblies. The tool accepts multiple assemblies and is therefore suitable for comparing them.

This README file gives a brief introduction to installation, basic usage, and parsing of QUAST output. A much more detailed description of these and many other topics is available in the online manual. There are also many more installation methods for the latest stable release of the QUAST toolkit; all of them are discussed here. For the cutting-edge version, please clone our GitHub repo.

The Gurevich Lab at the Helmholtz Institute for Pharmaceutical Research Saarland (HIPS) currently maintains and develops the tool. For copyright information and citation instructions, please refer to LICENSE.txt. We warmly welcome external contributions to the QUAST project. If you would like to contribute, please review our Contributor Covenant.

System requirements

Linux 64-bit and macOS are supported.

For the main pipeline:

For the optional submodules:

Most of those tools are usually preinstalled on Linux. macOS, however, requires installing the Command Line Tools for Xcode to make them available.

QUAST draws plots in two formats: HTML and PDF. If you need the PDF versions, make sure that Matplotlib is installed. We recommend using Matplotlib version 1.1 or higher. QUAST is fully tested with Matplotlib v.1.3.1. Installation on Ubuntu (tested on Ubuntu 20.04):

sudo apt-get update && sudo apt-get install -y pkg-config libfreetype6-dev libpng-dev python3-matplotlib
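To confirm that Matplotlib is visible to the Python interpreter you plan to run QUAST with, a quick check along the following lines may help. This is only a sketch: the function name `matplotlib_version` is made up for illustration and is not part of QUAST, and it assumes Python 3.8 or newer (for `importlib.metadata`).

```python
import importlib.util
from importlib import metadata


def matplotlib_version():
    """Return the installed Matplotlib version string, or None if it is missing.

    Hypothetical helper for illustration; not part of QUAST.
    """
    if importlib.util.find_spec("matplotlib") is None:
        return None
    try:
        return metadata.version("matplotlib")
    except metadata.PackageNotFoundError:
        # Importable, but installed without distribution metadata.
        return "unknown"


if __name__ == "__main__":
    version = matplotlib_version()
    if version is None:
        print("Matplotlib not found; PDF plots will probably not be produced.")
    else:
        print(f"Matplotlib {version} found.")
```

Run it with the same interpreter that will execute quast.py, since a system Python and a virtual environment may see different packages.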

Installation

QUAST automatically compiles all its sub-parts when needed (on the first use). Thus, installation is not required. However, if you want to precompile everything and add quast.py to your PATH, you may choose either:

Basic installation (about 120 MB):

./setup.py install

Full installation (about 540 MB; includes (1) tools for SV detection based on read pairs, which are used for more precise misassembly detection, and (2) tools/data for reference genome detection in metagenomic datasets):

./setup.py install_full

The default installation location is /usr/local/bin/ for the executable scripts, and /usr/local/lib/ for the Python modules and auxiliary files. If you get a permission error during the installation, consider running setup.py with sudo, or create a Python virtual environment and install into it. Alternatively, you may use the old-style installation scripts (./install.sh or ./install_full.sh), which build the QUAST package in place.

Usage

./quast.py test_data/contigs_1.fasta \
           test_data/contigs_2.fasta \
        -r test_data/reference.fasta.gz \
        -g test_data/genes.txt \
        -1 test_data/reads1.fastq.gz -2 test_data/reads2.fastq.gz \
        -o quast_test_output
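When QUAST is launched from a script rather than typed by hand, it can be convenient to assemble the argument list programmatically. The sketch below uses only the options shown in the command above (`-r`, `-g`, `-1`, `-2`, `-o`). The helper `build_quast_command` and its parameter names are hypothetical and not part of QUAST.

```python
import shlex


def build_quast_command(assemblies, reference=None, genes=None,
                        reads1=None, reads2=None, output_dir=None,
                        quast_script="./quast.py"):
    """Build an argument list for quast.py (hypothetical helper, not part of QUAST)."""
    cmd = [quast_script, *assemblies]
    options = (
        ("-r", reference),   # reference genome
        ("-g", genes),       # gene annotations
        ("-1", reads1),      # forward reads
        ("-2", reads2),      # reverse reads
        ("-o", output_dir),  # output directory
    )
    for flag, value in options:
        if value is not None:
            cmd += [flag, value]
    return cmd


if __name__ == "__main__":
    cmd = build_quast_command(
        ["test_data/contigs_1.fasta", "test_data/contigs_2.fasta"],
        reference="test_data/reference.fasta.gz",
        genes="test_data/genes.txt",
        reads1="test_data/reads1.fastq.gz",
        reads2="test_data/reads2.fastq.gz",
        output_dir="quast_test_output",
    )
    # Print the command; pass `cmd` to subprocess.run(cmd, check=True) to execute it.
    print(shlex.join(cmd))
```

Returning a list rather than a single string avoids shell quoting problems when file names contain spaces.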

Output

report.txt      summary table
report.tsv      tab-separated version, for parsing, or for spreadsheets (Google Docs, Excel, etc.)
report.tex      LaTeX version
report.pdf      PDF version, includes all tables and plots for some statistics
report.html     everything in an interactive HTML file
icarus.html     Icarus main menu with links to interactive viewers
contigs_reports/        [only if a reference genome is provided]
  misassemblies_report  detailed report on misassemblies
  unaligned_report      detailed report on unaligned and partially unaligned contigs
k_mer_stats/            [only if --k-mer-stats is specified]
  kmers_report          detailed report on k-mer-based metrics
reads_stats/            [only if reads are provided]
  reads_report          detailed report on mapped reads statistics
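Since report.tsv is intended for parsing, here is a minimal sketch of reading it in Python. It assumes that the first row is a header whose first cell is a label and whose remaining cells are assembly names, and that every later row holds a metric name followed by one value per assembly. Check this against your own report before relying on it. The function name and the sample data are made up for illustration.

```python
import csv
import io


def parse_quast_tsv(handle):
    """Map each assembly name to a {metric: value} dict.

    Assumes the first column holds metric names and each remaining column
    corresponds to one assembly. Values are kept as strings because some
    cells may not be numeric.
    """
    rows = [row for row in csv.reader(handle, delimiter="\t") if row]
    if not rows:
        return {}
    assemblies = rows[0][1:]
    report = {name: {} for name in assemblies}
    for row in rows[1:]:
        metric, values = row[0], row[1:]
        for name, value in zip(assemblies, values):
            report[name][metric] = value
    return report


# Made-up sample data for illustration only, not real QUAST output.
SAMPLE = (
    "Assembly\tcontigs_1\tcontigs_2\n"
    "# contigs\t3\t5\n"
    "Total length\t1000\t1200\n"
)

if __name__ == "__main__":
    for name, metrics in parse_quast_tsv(io.StringIO(SAMPLE)).items():
        print(name, metrics)
```

For a real run, open the file instead of the sample string, for example `with open("quast_test_output/report.tsv", newline="") as fh: report = parse_quast_tsv(fh)`.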

Metrics based only on contigs:

When a reference is given:

Contact & Info