Metagenomic Intra-Species Diversity Analysis (MIDAS) is an integrated pipeline for profiling strain-level genomic variation in shotgun metagenomic data. The standard MIDAS workflow harnesses a reference database of 5,926 species extracted from 30,000 genomes (MIDAS DB v1.2). MIDAS2 uses the same analysis workflow as the original MIDAS tool, but is engineered to work with more comprehensive MIDAS Reference Databases (MIDASDBs) and to run on collections of thousands of samples in a fast and scalable manner.
For MIDAS2, we have already built two MIDASDBs from large, public, microbial genome databases: UHGG 1.0 and GTDB r202.
The publication is available in Bioinformatics. The user manual is available at ReadTheDocs.
The performance of a read-mapping-based metagenotyping pipeline depends on, among other factors, (1) how closely related the reference genomes in the database are to the strains in the samples being genotyped, and (2) the post-alignment filter options. Pitfalls of genotyping microbial communities with rapidly growing genome collections can be found here.
Quick Installation:
conda create -n midas2 -c zhaoc1 -c conda-forge -c bioconda -c anaconda -c defaults midas
MIDAS version 3, previously known as MIDAS2, features major updates to its pangenome database. These updates include a refined curation process and a comprehensive functional annotation pipeline. MIDASDB can construct species-level pangenome databases from external reference genome collections, e.g. UHGG or GTDB, by clustering predicted genes into operational gene families (OGFs) at various average nucleotide identity (ANI) thresholds. The representative gene sequence of each OGF is the centroid assigned by vsearch.
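To make the clustering idea concrete, here is a rough dry-run sketch that only prints one vsearch clustering command per identity threshold. It is not the MIDASDB build code. The helper name `print_cluster_cmds`, the file names, and the threshold list are placeholders, and the use of `--cluster_fast` with `--id` as a stand-in for an ANI threshold is an assumption about how such a clustering could be invoked.

```shell
# Hypothetical dry run (not the MIDASDB build code): print one vsearch
# clustering command per identity threshold. File names and thresholds
# are placeholders chosen for illustration.
print_cluster_cmds() {
  genes_fasta="$1"     # predicted gene sequences for one species
  shift
  for pct in "$@"; do  # identity thresholds in percent, e.g. 99 95 90
    # vsearch expects --id as a fraction between 0 and 1
    id=$(awk -v p="$pct" 'BEGIN { printf "%.2f", p / 100 }')
    printf 'vsearch --cluster_fast %s --id %s --centroids centroids.%s.ffn --uc clusters.%s.uc\n' \
      "$genes_fasta" "$id" "$pct" "$pct"
  done
}

print_cluster_cmds genes.ffn 99 95 90
```

Each printed command would write the centroid sequences for one threshold to its own file. Nothing is executed, so the sketch can be inspected without vsearch installed.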
The first step is to generate the pruned centroid sequences for the species of interest:
midas prune_centroids --midasdb_name localdb --midasdb_dir /path/to/midasdb-uhgg-v2 -t 1 --remove_singleton --species 100001 --force --debug
The second step is to pass the corresponding arguments (--prune_centroids and --remove_singleton) to run_genes:
midas run_genes --midasdb_name localdb --midasdb_dir /path/to/midasdb-uhgg-v2 --num_cores 8 --select_threshold=-1 --species_list 100723,104323,100041 --prune_centroids --remove_singleton midas_output
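When several species are passed to run_genes, the pruning step presumably has to be repeated for each of them first. The sketch below prints one prune_centroids command per species ID, reusing only the flags shown in this README. It assumes prune_centroids takes a single species ID per invocation, as in the example above. The helper name `print_prune_cmds` is hypothetical and not part of MIDAS.

```shell
# Hypothetical helper (not part of MIDAS): print one prune_centroids command
# per species in a comma-separated list, using only flags shown above.
# Assumption: prune_centroids is invoked once per species ID.
print_prune_cmds() {
  species_list="$1"   # e.g. "100723,104323,100041"
  midasdb_dir="$2"    # e.g. /path/to/midasdb-uhgg-v2
  # Split the comma-separated list into one ID per line
  printf '%s\n' "$species_list" | tr ',' '\n' | while read -r id; do
    printf 'midas prune_centroids --midasdb_name localdb --midasdb_dir %s -t 1 --remove_singleton --species %s --force --debug\n' \
      "$midasdb_dir" "$id"
  done
}

print_prune_cmds "100723,104323,100041" /path/to/midasdb-uhgg-v2
```

The printed commands can be reviewed and then run one at a time, or piped to `sh`, before calling run_genes with the same species list.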
Details of these updates can be found at the provided link.
Quick Installation:
conda config --set channel_priority flexible
conda create -n midasv3 -c zhaoc1 -c conda-forge -c bioconda -c anaconda -c defaults midasv3=1.0.0
bash tests/test_analysis.sh 8