liwenran / DeepTACT

DeepTACT: predicting three-dimensional chromatin contacts via bootstrapping deep learning
19 stars 6 forks source link

DeepTACT

DeepTACT is a bootstrapping deep learning model, which integrates genome sequences and chromatin accessibility data for the prediction of chromatin contacts among regulatory elements.

Data preprocessing

Promoter capture Hi-C data of total B cells (tB), monocytes (Mon), foetal thymus (FoeT), total CD4+ T cells (tCD4), naive CD4+ T cells (nCD4), and total CD8+ T cells (tCD8) are derived from https://osf.io/u8tzp/ (Javierre et al. 2016). Labeled training pairs of these cell lines are available in the DeepTACT/ directory, where each cell line has its own subdirectory. For each cell line, we train and test models for promoter-promoter interactions (P-P) and promoter-enhancer interactions (P-E) separately. The sequence information of each interaction pair is extracted from hg19.fa. The openness signals are calculated from DNase-seq data.

Training and evaluation

We implemented the DeepTACT model using Keras 1.2.0 on a Linux server. All experiments were carried out with 4 Nvidia K80 GPUs which significantly accelerated the training process than CPUs. We provide examples of sequences and DNase inputs in the DeepTACT/demo directory. If you want to train your own model with DeepTACT, you can simply substitute your data to DeepTACT/demo with the same format. To train a model, you can run

python DeepTACT.py demo P-E 3

We evaluate the ensemble model with a voting strategy. Given the information of a sample as an input, its final prediction score is the average of the prediction scores derived from all classifiers.

Application: predict promoter-level interactions from PCHi-C data

We apply the trained DeepTACT model to infer contacts between regulatory elements in situations where one or both interaction regions contain multiple regulatory elements. In this way, we predict promoter-level interactions from bin-level interactions. For each cell line, the promoter-level interactions are predicted and saved in 'predictions.csv' files (e.g. DeepTACT/Mon/P-E/predictions.csv).

Requirements

Installation

Download DeepTACT by

git clone https://github.com/liwenran/DeepTACT

License

This project is licensed under the MIT License - see the LICENSE.md file for details