CSuperlei / DeepSV

Calling deletions using deep convolutional neural

DeepSV

Introduction

DeepSV is a deep-learning-based approach for calling long deletions from sequence reads. It is built on a novel method of visualizing sequence reads, designed to capture multiple sources of information in the data that are relevant to long deletions. DeepSV also implements techniques for working with noisy training data. It trains a model from the visualized sequence reads and calls deletions based on this model. We demonstrate that DeepSV outperforms existing methods in terms of accuracy and efficiency of deletion calling on data from the 1000 Genomes Project. Our work shows that deep learning can potentially lead to effective calling of different types of genetic variation that are more complex than SNPs.

Workflow

Requirements

Installation

Tools

bash Anaconda3-4.3.1-Linux-x86_64.sh

Jupyter Notebook

Cuda & cudnn

The installation tutorial can be downloaded from the official website.

TensorFlow

Digits

cd ~
git clone https://github.com/NVIDIA/DIGITS.git digits
cd digits
sudo apt-get install graphviz gunicorn
for req in $(cat requirements.txt); do sudo pip install $req; done 
pip install -r ~/digits/requirements.txt 
./digits-devserver

pysam

Usage

Data

BAM file & VCF file
First, provide the BAM files and VCF files that the program will read.
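As an illustration of what the VCF input contributes, the sketch below pulls deletion intervals out of VCF records. It is not taken from the DeepSV scripts: the function name `parse_deletions` is hypothetical, and it assumes deletions are annotated with `SVTYPE=DEL` and an `END` tag in the INFO column, as in the 1000 Genomes structural variant call sets. pysam, listed in the requirements, can also read VCF files; this version uses only the standard library to show which fields matter.

```python
def parse_deletions(vcf_lines):
    """Yield (chrom, start, end) for records tagged SVTYPE=DEL.

    Hypothetical helper, not part of DeepSV. Assumes tab-separated VCF
    records whose INFO column carries SVTYPE and END.
    """
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, meta lines and the column header
        fields = line.rstrip("\n").split("\t")
        chrom, pos, info = fields[0], int(fields[1]), fields[7]
        # INFO is a ';'-separated list of KEY=VALUE pairs or bare flags
        tags = dict(
            item.split("=", 1) if "=" in item else (item, "")
            for item in info.split(";")
        )
        if tags.get("SVTYPE") == "DEL" and "END" in tags:
            yield chrom, pos, int(tags["END"])
```

The function accepts any iterable of lines, so an open file handle can be passed directly, for example `list(parse_deletions(open(path)))` for a path of your choosing.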

Generating Candidates

Run Generate_Deletion_Image.py and Generate_Non_Deletion_Image.py in your custom path.

Generating Image Paths

Generate the paths of all images used to train the network.
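One way to build such a path list is sketched here. It assumes the list is a text file with one `path label` pair per line, a layout DIGITS can accept for classification datasets. The function name `list_images`, the directory names and the label indices are all hypothetical and should be adapted to wherever the images were written.

```python
import os


def list_images(root, label_by_dir, exts=(".png", ".jpg", ".jpeg")):
    """Return 'absolute_path label' lines for every image under root.

    Hypothetical helper, not part of DeepSV. label_by_dir maps a
    sub-directory name to an integer class label.
    """
    lines = []
    for dirname, label in sorted(label_by_dir.items()):
        folder = os.path.join(root, dirname)
        for name in sorted(os.listdir(folder)):
            if name.lower().endswith(exts):  # ignore non-image files
                path = os.path.abspath(os.path.join(folder, name))
                lines.append("%s %d" % (path, label))
    return lines
```

For example, `list_images("images", {"deletion": 0, "non_deletion": 1})` returns the lines for two assumed sub-directories; joining them with newlines and writing them to a file gives the path list.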

Training the CNN with DIGITS

Feed all the generated images to DIGITS to train the network.

Using a trained network for calling deletions

Generating whole genome pictures

Extracting deletion information from test results
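The README does not describe how this step is carried out. As a rough sketch only, assume the test results are per-window predictions of the form `(chrom, start, end, probability)`. Deletion intervals could then be obtained by keeping windows above a threshold and merging those that touch or overlap. The function name `merge_windows`, the input layout and the 0.5 threshold are assumptions, not DeepSV's actual procedure.

```python
def merge_windows(windows, threshold=0.5):
    """Merge adjacent or overlapping high-scoring windows into intervals.

    Hypothetical helper, not part of DeepSV. windows is an iterable of
    (chrom, start, end, probability) tuples.
    """
    calls = []
    for chrom, start, end, prob in sorted(windows):
        if prob < threshold:
            continue  # window not predicted as a deletion
        last = calls[-1] if calls else None
        if last and last[0] == chrom and start <= last[2]:
            last[2] = max(last[2], end)  # extend the current interval
        else:
            calls.append([chrom, start, end])  # start a new interval
    return [tuple(call) for call in calls]
```

Sorting first means the input does not need to be ordered, at the cost of ordering chromosome names lexicographically.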

Generating VCF File
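A minimal sketch of writing deletion calls as VCF text follows, assuming each call is a `(chrom, start, end)` tuple. The function name `format_vcf` is hypothetical, and the header is a bare-bones one using the symbolic `<DEL>` allele with the `SVTYPE`, `END` and `SVLEN` INFO keys defined in the VCF specification. The real DeepSV output may carry different or additional fields.

```python
def format_vcf(calls):
    """Return VCF text for an iterable of (chrom, start, end) deletions.

    Hypothetical helper, not part of DeepSV.
    """
    lines = [
        "##fileformat=VCFv4.2",
        '##ALT=<ID=DEL,Description="Deletion">',
        '##INFO=<ID=SVTYPE,Number=1,Type=String,'
        'Description="Type of structural variant">',
        '##INFO=<ID=END,Number=1,Type=Integer,'
        'Description="End position of the variant">',
        '##INFO=<ID=SVLEN,Number=.,Type=Integer,'
        'Description="Difference in length between REF and ALT alleles">',
        "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    ]
    for chrom, start, end in calls:
        # SVLEN is negative for deletions
        info = "SVTYPE=DEL;END=%d;SVLEN=%d" % (end, start - end)
        lines.append(
            "\t".join([chrom, str(start), ".", "N", "<DEL>", ".", ".", info])
        )
    return "\n".join(lines) + "\n"
```

The returned string can be written to a file of your choosing with an ordinary `open(path, "w")` call.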