smithlabcode / dnmtools

Tools for analyzing DNA methylation data
https://dnmtools.readthedocs.io
GNU General Public License v3.0
25 stars 8 forks source link

GitHub Downloads Install with Conda Install with Conda Install with Conda Documentation Status License: GPL v3

DNMTools is a set of tools for analyzing DNA methylation data from high-throughput sequencing experiments, especially whole genome bisulfite sequencing (WGBS), but also reduced representation bisulfite sequencing (RRBS). These tools focus on overcoming the computing challenges imposed by the scale of genome-wide DNA methylation data, which is usually the early parts of data analysis.

Installing release 1.4.3

The documentation for DNMTools can be found here. But if you want to install from source and you are reading this on GitHub or in a source tree you unpacked, then keep reading. And if you are in a terminal, sorry for all the formatting.

Required libraries

All the above can also be installed using conda. If you use conda for these dependencies, even if you are building dnmtools from the source repo, it is easiest if all dependencies are available through conda.

Configuration

Building and installing the tools

If you are still in the build directory, run make to compile the tools, and then make install to install them:

make && make install

If your HTSlib (or some other library) is not installed system-wide, then you might need to udpate your library path:

export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:/path/to/htslib/lib

Testing the program

To test if everything was successful, simply run dnmtools without any arguments and you should see the list of available commands:

dnmtools

There is a test suite for dnmtools and these test can be performed as follows:

make check

This must be done from the build directory. Note that the tests performed with make check are mostly regression tests that cover prior issues rather than coverage tests to test all the functionality of dnmtools.

Using a clone of the repo

We strongly recommend using DNMTools through the latest stable release under the releases section on GitHub or through a package as with conda/mamba. Developers who wish to work on the latest commits, which are unstable, can compile the source using autogen.sh which just wraps autoreconf.

Usage

Read the documentation for usage of individual tools within DNMTools.

Installing and running dnmtools docker images

The docker images of dnmtools are accessible through GitHub Container registry. These are light-weight (~30 MB) images that let you run dnmtools without worrying about the dependencies.

Installation

To pull the image for the latest version, run:

docker pull ghcr.io/smithlabcode/dnmtools

To test the image installation, run:

docker run ghcr.io/smithlabcode/dnmtools

You should see the help page of dnmtools.

For simpler reference, you can re-tag the installed image as follows, but note that you would have to re-tag the image whenever you pull an image for a new version.

docker tag ghcr.io/smithlabcode/dnmtools:latest dnmtools:latest

You can also install the image for a particular vertion by running

docker pull ghcr.io/smithlabcode/dnmtools:v[VERSION NUMBER] #(e.g. v1.4.3)

Not all versions have corresponding images; you can find available images here.

Running the docker image

To run the image, you can run (assuming you tagged the image as above)

docker run -v /path/to/data:/data -w /data \
  dnmtools [DNMTOOLS COMMAND] [OPTIONS] [ARGUMENTS]

In the above command, replace /path/to/data with the path to the directory you want to mount, and it will be mounted as the /data directory in the container. For example, if your genome data genome.fa is located in ./genome_data, you can execute abismalidx by running:

docker run -v ./genome_data:/data -w /data \
  dnmtools abismalidx -v -t 4 genome.fa genome.idx

In the above command, -w /data specifies the working directory in the container, so the output genome.idx is saved in the /data directory, which corresponds to the ./genome_data directory in the host machine. If you want to specify the output directory, use a command like below.

docker run -v ./genome_data:/data -w /data \
  -v ./genome_index:/output \
  dnmtools abismalidx -v -t 4 genome.fa /output/genome.idx

When you need to access multiple directories, it might be useful to use the option -v ./:/app -w /app, which mounts the current directory to the /app directory in the container, which is alo set as the working directory. You can specify the paths in the same way you would from the working directory in the host machine. For example:

docker run -v ./:/app -w /app \
  dnmtools abismal -i genome_index/genome.idx -v -t 4 \
  -o mapped_reads/output.sam \
  reads/reads_1.fq reads/reads_1.fq

Testing the install and use of docker image

Run the following commands to test the installation and usage of the docker image of dnmtools.

docker pull ghcr.io/smithlabcode/dnmtools:latest
docker tag ghcr.io/smithlabcode/dnmtools:latest dnmtools:latest

# Clone the repo to access test data
git clone git@github.com:smithlabcode/dnmtools.git
cd dnmtools

# Run containers and save outputs in artifacts directory

mkdir artifacts

docker run -v ./:/app -w /app \
  dnmtools abismalidx -v -t 1 data/tRex1.fa artifacts/tRex1.idx

docker run -v ./:/app -w /app \
  dnmtools simreads -seed 1 -o artifacts/simreads -n 10000 \
  -m 0.01 -b 0.98 data/tRex1.fa

docker run -v ./:/app -w /app \
  dnmtools abismal -v -t 1 -i artifacts/tRex1.idx artifacts/simreads_{1,2}.fq

Contacts and bug reports

Andrew D. Smith andrewds@usc.edu

Copyright and License Information

Copyright (C) 2022-2024 Andrew D. Smith and Guilherme de Sena Brandine

Authors of DNMTools: Andrew D. Smith and Guilherme de Sena Brandine

Essential contributors: Ben Decato, Meng Zhou, Liz Ji, Terence Li, Jenny Qu, Qiang Song, Fang Fang and Masaru Nakajima

This is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

This software is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.