madagiurgiu25 / decoil-pre

Reconstruct ecDNA from long-read data using Decoil tool
BSD 3-Clause "New" or "Revised" License

Decoil

Decoil (deconvolve extrachromosomal circular DNA isoforms from long-read data) is a software package for reconstructing circular DNA.

Getting started using conda and pip

Assumes you have conda installed.

conda create -n envdecoil -c bioconda -c conda-forge python==3.10 survivor==1.0.7 sniffles==1.0.12 deeptools==3.5.5 ngmlr==0.2.7 samtools==1.15.1 python-dateutil==2.8.0
conda activate envdecoil
python -m pip install decoil==1.1.3

decoil --version
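
If `decoil --version` fails, a quick check of which executables are visible on PATH can narrow down the problem. The sketch below is only an illustration: the helper `missing_tools` is hypothetical, and the executable names (in particular `SURVIVOR` and `bamCoverage`, which are assumed to be provided by the survivor and deeptools packages) are assumptions rather than something this README specifies.

```shell
#!/usr/bin/env bash
# Print the name of every command in "$@" that is not found on PATH.
# (Hypothetical helper, not part of Decoil.)
missing_tools() {
    local tool
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Executable names are assumptions based on the packages installed above.
missing=$(missing_tools decoil sniffles samtools ngmlr SURVIVOR bamCoverage)
if [ -n "$missing" ]; then
    printf 'Not found on PATH:\n%s\n' "$missing" >&2
else
    echo "All expected tools were found."
fi
```

Run it inside the activated `envdecoil` environment; an empty list suggests the environment itself is complete.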

Getting started using docker or singularity

As a prerequisite, you need to have docker or singularity installed (either from the official website or using conda).

1.1 Download as docker image

Download the Decoil docker image from Docker Hub. The image contains the full environment, including all packages and dependencies needed to run the software, so no additional installation is required.

# docker
docker pull madagiurgiu25/decoil:1.1.2-slim

1.2 Download as singularity image

# singularity
singularity pull decoil.sif  docker://madagiurgiu25/decoil:1.1.2-slim


2. Run example using docker or singularity

To test your installation check the Example.


3. Run Decoil reconstruction using docker or singularity

To run Decoil on your data you need to configure the following parameters:

# run decoil with your input with standard parameters
BAM_INPUT="<absolute path to your BAM file>"
OUTPUT_FOLDER="<absolute path to your output folder>"
NAME="<sample name>"
GENOME="<absolute path to your reference genome file>"
ANNO="<absolute path to your gtf annotation file>"

and then run the following command:

# docker
docker run -it --platform=linux/amd64 \
    -v ${BAM_INPUT}:/data/input.bam \
    -v ${BAM_INPUT}.bai:/data/input.bam.bai \
    -v ${GENOME}:/annotation/reference.fa \
    -v ${ANNO}:/annotation/anno.gtf \
    -v ${OUTPUT_FOLDER}:/mnt \
    -t madagiurgiu25/decoil:1.1.2-slim \
    decoil-pipeline sv-reconstruct \
            -b /data/input.bam \
            -r /annotation/reference.fa \
            -g /annotation/anno.gtf \
            -o /mnt --name ${NAME}
# singularity
mkdir -p ${OUTPUT_FOLDER}
mkdir -p ${OUTPUT_FOLDER}/logs
mkdir -p ${OUTPUT_FOLDER}/tmp
singularity run \
    --bind ${OUTPUT_FOLDER}/logs:/mnt/logs \
    --bind ${OUTPUT_FOLDER}/tmp:/tmp \
    --bind ${BAM_INPUT}:/data/input.bam \
    --bind ${BAM_INPUT}.bai:/data/input.bam.bai \
    --bind ${GENOME}:/annotation/reference.fa \
    --bind ${ANNO}:/annotation/anno.gtf \
    --bind ${OUTPUT_FOLDER}:/mnt \
    decoil.sif \
    decoil-pipeline sv-reconstruct \
            -b /data/input.bam \
            -r /annotation/reference.fa \
            -g /annotation/anno.gtf \
            -o /mnt --name ${NAME}
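
Both commands above mount `${BAM_INPUT}.bai` next to the BAM file and expect absolute paths, so it can help to verify the inputs before starting the container. The following is a minimal sketch under those assumptions; `check_decoil_inputs` is a hypothetical helper, not part of Decoil.

```shell
#!/usr/bin/env bash
# Return 0 only if the BAM file, its .bai index, the reference and the
# annotation are all given as absolute paths to existing, non-empty files.
# Problems are reported on stderr. (Hypothetical helper, not part of Decoil.)
check_decoil_inputs() {
    local f status=0
    for f in "$1" "$1.bai" "$2" "$3"; do
        case "$f" in
            /*) ;;
            *) printf 'not an absolute path: %s\n' "$f" >&2; status=1 ;;
        esac
        if [ ! -s "$f" ]; then
            printf 'missing or empty: %s\n' "$f" >&2
            status=1
        fi
    done
    return "$status"
}

# Run the check only when the variables from the step above are set.
if [ -n "${BAM_INPUT:-}" ]; then
    check_decoil_inputs "$BAM_INPUT" "${GENOME:-}" "${ANNO:-}" && echo "inputs look OK"
fi
```

If only the index is reported as missing, `samtools index "$BAM_INPUT"` creates `${BAM_INPUT}.bai` for a coordinate-sorted BAM file.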


Install Decoil from source

You can install the latest version of Decoil from the repository (git and conda/mamba are required):

git clone https://github.com/madagiurgiu25/decoil-pre.git
cd  decoil-pre

# create conda environment
# for linux
mamba env create -f environment.yml
# for macos
mamba env create -f environment.yml --platform osx-64

conda activate envdecoil
python -m pip install -r requirements.txt
python setup.py install

And check if the installation worked:

# might take a while
decoil-pipeline --version
decoil --version


Decoil configurations

An overview of the available functionality:

feature           decoil-pipeline   decoil             decoil-viz
                  (recommended)     (advanced users)   (recommended)
SV calling        x
coverage track    x
reconstruction    x                 x
visualization                                          x
docker            x                 x                  x
singularity       x                 x                  x


1. Reconstruct ecDNA using decoil-pipeline (recommended)

To reconstruct ecDNA, we recommend using decoil-pipeline in sv-reconstruct mode.
This requires only a .bam file as input and generates internally all the files required for the reconstruction.

# call help
docker run -it --platform=linux/amd64 -t madagiurgiu25/decoil:1.1.2-slim decoil-pipeline --help

usage: decoil-pipeline <workflow> <parameters> [<target>]
Example: 
    # run decoil including the processing and visualization steps
    decoil-pipeline -f sv-recontruct --bam <input> --outputdir <outputdir> --name <sample> --sv-caller <sniffles> -r <reference-genome> -g <annotation-gtf>

Decoil 1.1.2: reconstruct ecDNA from long-read data

positional arguments:
  {sv-only,sv-reconstruct,reconstruct-only}
                        sub-command help
    sv-only             Perform preprocessing
    sv-reconstruct      Perform preprocessing and reconstruction

optional arguments:
  -h, --help            show this help message and exit
  --version             show program's version number and exit
  -n, --dry-run
  -f, --force
  -c, --use-conda

The pipeline has the following running modes: sv-only, sv-reconstruct, and reconstruct-only.


2. Visualization of ecDNA threads using decoil-viz (recommended)

To interpret and visualize the results of the ecDNA reconstruction threads, use decoil-viz.


3. Reconstruct ecDNA using decoil (advanced users only)

This configuration is the most flexible and allows users to use their own SV calls. For details go here.


FAQ

Check recommendations for filtering or debugging in the FAQ section.


File formats

The relevant output files for the users are:

cat reconstruct.bed

#chr    start   end     circ_id fragment_id     strand  coverage        estimated_proportions
chr2    15585356        15633376        0       5       +       149     75
chr3    11150000        11160001        0       41      -       103     75
chr3    11049997        11060001        0       33      +       117     75
chr2    15585356        15633376        3       5       +       149     36
chr3    11150000        11160001        3       41      -       103     36
chr3    11049997        11060001        3       33      +       117     36
chr2    15585356        15633376        3       5       +       149     36
chr2    16521052        16628305        3       13      +       37      36
chr3    10981202        11028470        3       25      -       31      36
chr12   68807722        68970910        2       53      +       252     252

cat summary.txt

circ_id chr_origin      size(MB)        label   topology_idx    topology_name   estimated_proportions
0       chr3,chr2       0.068025                4       multi_region_inter_chr  75
3       chr3,chr2       0.270566        ecDNA   5       simple_duplications     36
2       chr12           0.163188        ecDNA   0       simple_circle           252
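
The two files can be cross-checked against each other. As a sketch, the snippet below sums the fragment lengths (end minus start) per `circ_id` in `reconstruct.bed`. On the sample rows shown above, the totals agree with the `size(MB)` column of `summary.txt` (for example 68025 bp for circle 0, listed as 0.068025). Whether Decoil computes the size exactly this way is an assumption. The helper `circle_sizes` is hypothetical, and the file is assumed to be whitespace-separated with a `#`-prefixed header.

```shell
#!/usr/bin/env bash
# Sum fragment lengths (end - start) per circ_id in a reconstruct.bed-style
# file and print "circ_id<TAB>total_bp", sorted numerically by circ_id.
# (Hypothetical helper, not part of Decoil.)
circle_sizes() {
    awk '!/^#/ && NF >= 4 { size[$4] += $3 - $2 }
         END { for (id in size) printf "%s\t%d\n", id, size[id] }' "$1" |
        sort -k1,1n
}

# Demo on the sample rows shown above.
sample=$(mktemp)
cat > "$sample" <<'EOF'
#chr start end circ_id fragment_id strand coverage estimated_proportions
chr2 15585356 15633376 0 5 + 149 75
chr3 11150000 11160001 0 41 - 103 75
chr3 11049997 11060001 0 33 + 117 75
chr2 15585356 15633376 3 5 + 149 36
chr3 11150000 11160001 3 41 - 103 36
chr3 11049997 11060001 3 33 + 117 36
chr2 15585356 15633376 3 5 + 149 36
chr2 16521052 16628305 3 13 + 37 36
chr3 10981202 11028470 3 25 - 31 36
chr12 68807722 68970910 2 53 + 252 252
EOF

circle_sizes "$sample"
# prints:
# 0	68025
# 2	163188
# 3	270566
```

For real output, call `circle_sizes` on the `reconstruct.bed` file in your output folder.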


Citation

If you use Decoil for your work please cite our paper:

Madalina Giurgiu, Nadine Wittstruck, Elias Rodriguez-Fos, Rocio Chamorro Gonzalez, Lotte Bruckner, Annabell Krienelke-Szymansky, Konstantin Helmsauer, Anne Hartebrodt, Philipp Euskirchen, Richard P. Koche, Kerstin Haase, Knut Reinert, Anton G. Henssen*. Reconstructing extrachromosomal DNA structural heterogeneity from long-read sequencing data using Decoil. Genome Research 2024, DOI: https://doi.org/10.1101/gr.279123.124

@article{Giurgiu2024ReconstructingDecoil,
    title = {{Reconstructing extrachromosomal DNA structural heterogeneity from long-read sequencing data using Decoil}},
    year = {2024},
    journal = {Genome Research},
    author = {Giurgiu, Madalina and Wittstruck, Nadine and Rodriguez-Fos, Elias and Chamorro Gonzalez, Rocio and Brueckner, Lotte and Krienelke-Szymansky, Annabell and Helmsauer, Konstantin and Hartebrodt, Anne and Euskirchen, Philipp and Koche, Richard P. and Haase, Kerstin and Reinert, Knut and Henssen, Anton G.},
    month = {8},
    pages = {gr.279123.124},
    doi = {10.1101/gr.279123.124},
    issn = {1088-9051}
}

Paper repository: https://github.com/henssen-lab/decoil-paper

License

Decoil is distributed under the BSD 3-Clause license. Consult the accompanying LICENSE file for more details.

Disclaimer

Decoil and the content of this research-repository (i) is not suitable for a medical device; and (ii) is not intended for clinical use of any kind, including but not limited to diagnosis or prognosis.