A TensorFlow deep learning framework for predicting Hi-C chromatin interactions from megabase-scale DNA sequence.
This repository contains the core deepC Python code, R scripts and functions for downstream analysis, as well as tutorials and links to example data.
The core code is implemented in Python (v3.5+) and TensorFlow (v1). For downstream analysis and visualization we use R and custom functions for handling Hi-C data and deepC predictions.
Python 3.5+
TensorFlow (tensorflow-gpu)
additional Python modules: pysam, pybedtools
R version 3.4.4+
Some processing helper scripts require Perl (v5.26.0 or above)
Make sure a Python version supported by TensorFlow (3.5-3.7) is installed.
Install TensorFlow, preferably with GPU support.
Install the additional Python libraries (pysam and pybedtools), e.g. using pip or bioconda:
pip install pybedtools
pip install pysam
Clone the deepC GitHub repository.
Check which version of TensorFlow you have installed and choose the appropriate compatibility version of deepC:
tensorflow version | CUDA version | deepC version |
---|---|---|
2.1+ | 10.1 | tensorflow2.1plus_compatibility_version |
2.0 | 10 | tensorflow2.0_compatibility_version* |
1.14 | 10 | tensorflow1_version |
1.8 | 9 | legacy_version_tf1.8 |
*Compatibility with v2.0 not yet tested.
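
The lookup in the table above can be sketched in Python. This is an illustrative helper only: the function name `deepc_version_for` is hypothetical and not part of deepC, and versions not listed in the table (for example 1.9 to 1.13) deliberately return `None` rather than guessing a match.

```python
import re

def deepc_version_for(tf_version):
    """Return the deepC version name listed in the table for a TensorFlow version string.

    Hypothetical helper for illustration; returns None for versions the table does not list.
    """
    match = re.match(r"(\d+)\.(\d+)", tf_version)
    if match is None:
        return None
    major, minor = int(match.group(1)), int(match.group(2))
    if (major, minor) >= (2, 1):
        return "tensorflow2.1plus_compatibility_version"
    if (major, minor) == (2, 0):
        # Marked as not yet tested in the table footnote.
        return "tensorflow2.0_compatibility_version"
    if (major, minor) == (1, 14):
        return "tensorflow1_version"
    if (major, minor) == (1, 8):
        return "legacy_version_tf1.8"
    return None

# Usage, once TensorFlow is installed:
#   import tensorflow as tf
#   print(deepc_version_for(tf.__version__))
```

The installed version string is available as `tf.__version__` after `import tensorflow as tf`.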
Clone the repository. Make sure all dependencies are available.
To use deepC from within a Python script, import it with `import deepCregr`.
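
One way to confirm that the Python dependencies named above are available before importing deepC is a small standard-library check. This is a minimal sketch, not part of deepC: the helper name `missing_modules` is hypothetical, and the module list covers only the packages mentioned in this README.

```python
import importlib.util

# Python packages named in this README; extend the tuple as needed.
REQUIRED_MODULES = ("tensorflow", "pysam", "pybedtools")

def missing_modules(names=REQUIRED_MODULES):
    """Return the module names that cannot be located on the current Python path."""
    return [name for name in names if importlib.util.find_spec(name) is None]

missing = missing_modules()
if missing:
    print("Missing modules: " + ", ".join(missing))
else:
    print("All listed modules were found.")
```

If nothing is reported missing, `import deepCregr` should work, assuming the directory of the chosen deepC version is on the Python path.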
Find tutorials here.
Download links to trained models are provided under `./models`. See the README file there for details.
Please refer to the Nature Methods article here
Implementation of dilated convolutions was adapted from wavenet.