ctlab / metacherchant

MIT License

MetaCherchant is a tool for analysing the genomic environment of a nucleotide sequence within a metagenome. The implementation is based on the MetaFast source code.

Starting from version 0.1.0, it supports Hi-C read input to facilitate genomic context extraction for extrachromosomal DNA. For detailed instructions, please refer to the wiki page.

It also provides tools for comparing two metagenomes. For more details, please consult the reads classifier description.


Installation

To run MetaCherchant, you need JRE version 1.8 or higher installed and one of these three files: metacherchant.sh for Linux/macOS, metacherchant.bat for Windows, or metacherchant.jar for any OS.
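As a quick pre-flight check, the bash sketch below picks one of the three launchers named above from the output of `uname -s` and tests whether a `java` executable is on the PATH. The function names `pick_launcher` and `have_java` and the OS patterns are illustrative assumptions, not part of MetaCherchant.

```shell
# Hypothetical helper: map an OS name (as printed by `uname -s`) to one of
# the three launcher files named in this README.
pick_launcher() {
    case "$1" in
        Linux|Darwin)         echo "metacherchant.sh"  ;;
        MINGW*|MSYS*|CYGWIN*) echo "metacherchant.bat" ;;
        *)                    echo "metacherchant.jar" ;;
    esac
}

# Succeeds if a `java` executable can be found on the PATH.
# It does not check the version; `java -version` prints that.
have_java() {
    command -v java >/dev/null 2>&1
}
```

Calling `pick_launcher "$(uname -s)"` prints the file to use on the current machine. The .jar fallback would be started with `java -jar metacherchant.jar`, the standard way to run an executable JAR.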

Running MetaCherchant

To run MetaCherchant, pass the tool name via --tool followed by that tool's parameters, as in the usage examples below.

Usage example

Single-metagenome mode

Here is a bash command showing typical usage of MetaCherchant:

./metacherchant.sh --tool environment-finder \
    --k 31 \
    --coverage=5 \
    --reads "$READS_DIR"/*.fasta \
    --seq "$GENE_FILE.fasta" \
    --output "$OUTPUT_DIR/output/" \
    --work-dir "$OUTPUT_DIR/workDir" \
    --maxkmers=100000 \
    --bothdirs=False \
    --chunklength=100
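
The same invocation can also be assembled in a bash array, which keeps paths containing spaces intact and makes it easy to print the command before launching it. This is a sketch under stated assumptions: `build_env_finder_cmd` and its positional arguments are hypothetical, and only flags and values that appear in the example above are used.

```shell
# Hypothetical helper: fill the global array CMD with an environment-finder
# invocation. Arguments: reads directory, target FASTA file, output directory.
build_env_finder_cmd() {
    local reads_dir=$1 gene_fasta=$2 out_dir=$3
    # The quoted directory plus unquoted *.fasta expands to one array
    # element per reads file, even if the directory name contains spaces.
    CMD=(./metacherchant.sh
         --tool environment-finder
         --k 31
         --coverage=5
         --reads "$reads_dir"/*.fasta
         --seq "$gene_fasta"
         --output "$out_dir/output/"
         --work-dir "$out_dir/workDir"
         --maxkmers=100000
         --bothdirs=False
         --chunklength=100)
}
```

After calling the helper, `printf '%q ' "${CMD[@]}"` shows exactly what would be executed, and `"${CMD[@]}"` runs it.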

Once the analysis finishes, the metagenomic environment that was found can be visualised as a de Bruijn graph, as in the figure below. For more information, see the Output description section.

Single-metagenome environment

The adeC gene in the genomic context of E. faecium. The target antibiotic resistance (AR) gene is shown in red.

Differential (multiple-metagenome) mode

In this mode, two or more graphs constructed as described above can be joined into a single graph. An example command:

./metacherchant.sh \
    --tool environment-finder-multi \
    --seq OXA-347.fasta \
    --work-dir "k31/TUE-S2_3_4/workDir" \
    --output "k31/TUE-S2_3_4" \
    --env "k31/TUE-S2_3/output/env.txt" "k31/TUE-S2_4/output/env.txt"
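
When more than two samples are involved, the --env list can be generated rather than typed. The bash sketch below assumes, as in the example paths above, that each single-metagenome run wrote its graph to `<sample>/output/env.txt`. The helper name `build_multi_cmd` and its argument order are hypothetical.

```shell
# Hypothetical helper: fill the global array CMD with an
# environment-finder-multi invocation.
# Arguments: target FASTA file, combined output directory, then one
# directory per sample.
build_multi_cmd() {
    local seq=$1 out_dir=$2
    shift 2
    local envs=() sample
    for sample in "$@"; do
        # Path layout assumed from the example command above.
        envs+=("$sample/output/env.txt")
    done
    CMD=(./metacherchant.sh
         --tool environment-finder-multi
         --seq "$seq"
         --work-dir "$out_dir/workDir"
         --output "$out_dir"
         --env "${envs[@]}")
}
```

For instance, `build_multi_cmd OXA-347.fasta k31/TUE-S2_3_4 k31/TUE-S2_3 k31/TUE-S2_4` reproduces the arguments of the example command, and `"${CMD[@]}"` then runs it.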

As in single-metagenome mode, the combined environment can be visualised as a de Bruijn graph once the analysis finishes, as in the figure below. For more information, see the Output description section.

Multiple-metagenome environment

Combined graph of the AR gene context produced from two metagenomes of the same subject. Red denotes the part of the graph present only at time point 2, blue the part present only at time point 3, and black the part present at both time points; green denotes the graph nodes corresponding to the target AR gene.

Output description

Once the analysis finishes, the results can be found in the folder specified by the --output parameter. If the file passed to --seq contains multiple sequences, there will be a separate folder for each one.
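
To see which per-sequence folders a run produced, the output directory can be listed one level deep. This is a minimal sketch: the function name `list_result_folders` is illustrative, and the folder names depend on the sequences in the --seq file.

```shell
# Hypothetical helper: print the immediate subfolders of an output
# directory, one per line, sorted by name. Regular files are skipped.
list_result_folders() {
    local out_dir=$1
    find "$out_dir" -mindepth 1 -maxdepth 1 -type d | sort
}
```

Running `list_result_folders "$OUTPUT_DIR/output"` after the single-metagenome example prints one line per result folder.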

Citation

If you use MetaCherchant in your research, please cite the following publication:

Olekhnovich, E. I., Vasilyev, A. T., Ulyantsev, V. I., Kostryukova, E. S., & Tyakht, A. V. (2018). MetaCherchant: analyzing genomic context of antibiotic resistance genes in gut microbiota. Bioinformatics, 34(3), 434-444. https://doi.org/10.1093/bioinformatics/btx681