JinLabBioinfo / DeepLoop

DeepLoop robustly identifies enhancer-promoter interactions from low-depth and single-cell Hi-C data

DeepLoop

The conceptual innovation of DeepLoop is to handle systematic biases and random noise separately: we used HiCorr to improve the rigor of bias correction, and then applied deep-learning techniques for noise reduction and loop signal enhancement. DeepLoop significantly improves the sensitivity, robustness, and quantitation of Hi-C loop analyses, and can be used to reanalyze most published low-depth Hi-C datasets.

Citation: Zhang, S., Plummer, D., Lu, L. et al. DeepLoop robustly maps chromatin interactions from sparse allele-resolved or single-cell Hi-C data at kilobase resolution. Nat Genet 54, 1013–1025 (2022). https://doi.org/10.1038/s41588-022-01116-w

DeepLoop contains two parts:

Processed data availability

Installation

DeepLoop was developed and tested using Python 3.5 and the following Python packages:

The packages can be installed by running the following command:

pip3 install -r requirements.txt

This will also install optional visualization and analysis tools we use, such as:

If you plan on training your own model, you will want to use a GPU-enabled version of TensorFlow to avoid intractably long training times. We used tensorflow-gpu==2.3.1, but any TF2 version should work. For prediction, a GPU is not necessary, but it will be faster than using a CPU.
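
Before starting a training run, it can help to confirm that TensorFlow actually sees a GPU. The sketch below is not part of DeepLoop: the helper name `visible_gpus` is hypothetical, and it assumes a TensorFlow version that provides `tf.config.list_physical_devices` (2.1 or later).

```python
def visible_gpus():
    """Return the names of GPUs visible to TensorFlow (empty list if none).

    Hypothetical helper, not part of DeepLoop. It returns an empty list when
    TensorFlow is not installed, so the check itself never crashes.
    """
    try:
        import tensorflow as tf
    except ImportError:
        return []
    return [device.name for device in tf.config.list_physical_devices("GPU")]


if __name__ == "__main__":
    gpus = visible_gpus()
    if gpus:
        print("TensorFlow can use:", ", ".join(gpus))
    else:
        print("No GPU visible: prediction will run on CPU and training will be slow.")
```

An empty result does not prevent prediction; it only indicates that everything will run on the CPU.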

Download DeepLoop trained models and reference files

cd DeepLoop/
wget --no-check-certificate https://hiview.case.edu/ssz20/tmp.HiCorr.ref/DeepLoop_models.tar.gz
tar -xvf DeepLoop_models.tar.gz

After decompressing, the "DeepLoop_models/" directory includes the "CPGZ_trained" and "H9_trained" models and "ref", which contains anchor bed files for HiCorr output.
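
A quick way to confirm that the archive was extracted completely is to check for the three entries named above. This is a minimal sketch, not a DeepLoop utility; the function name `missing_model_entries` is hypothetical, and it assumes the script is run from the directory that contains "DeepLoop_models/".

```python
from pathlib import Path

# Entries the README says should exist after extracting DeepLoop_models.tar.gz.
EXPECTED_ENTRIES = ("CPGZ_trained", "H9_trained", "ref")


def missing_model_entries(models_dir):
    """Return the expected entries that are absent from models_dir."""
    root = Path(models_dir)
    return [name for name in EXPECTED_ENTRIES if not (root / name).exists()]


if __name__ == "__main__":
    missing = missing_model_entries("DeepLoop_models")
    if missing:
        print("Missing from DeepLoop_models/:", ", ".join(missing))
    else:
        print("DeepLoop_models/ looks complete.")
```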

Run DeepLoop

There are three steps to process Hi-C data from fastq files:

chr=chr11
start=130000000
end=130800000
outplot="./test"
./DeepLoop/lib/generate.matrix.from_HiCorr.pl DeepLoop/DeepLoop_models/ref/hg19_HindIII_anchor_bed/$chr.bed $HiCorr_path/anchor_2_anchor.loop.$chr $chr $start $end ./${chr}_${start}_${end}
./DeepLoop/lib/generate.matrix.from_DeepLoop.pl DeepLoop/DeepLoop_models/ref/hg19_HindIII_anchor_bed/$chr.bed $DeepLoop_outPath/$chr.denoised.anchor.to.anchor $chr $start $end ./${chr}_${start}_${end}
./DeepLoop/lib/plot.multiple.r $outplot 1 3 ${chr}_${start}_${end}.raw.matrix ${chr}_${start}_${end}.ratio.matrix ${chr}_${start}_${end}.denoise.matrix
https://github.com/JinLabBioinfo/DeepLoop/blob/master/images/test.plot.png

Check "test.plot.png" for the "raw", "HiCorr", and "DeepLoop" heatmaps (sample heatmaps are at the link above).

Note:

Heatmap Visualization for HiCorr and DeepLoop output

The heatmap visualization in Step 3 above can also be done with the script "plot.sh" in "lib/".
It takes eight parameters:

For example, if DeepLoop is installed in the home directory "$myhome" and outPath is the current directory ("./") from which you plan to run the script:

bash $myhome/DeepLoop/lib/plot.sh $myhome \
                                  $myhome/DeepLoop/DeepLoop_models/ref/hg19_HindIII_anchor_bed/ \
                                  $HiCorr_path/ \
                                  $DeepLoop_outPath/ \
                                  chr11 130000000 130800000 ./ 

The heatmap png file named "chr11_130000000_130800000.plot.png" will be in the current directory.
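
To plot several regions in one go, the same call can be assembled programmatically. The sketch below only builds the argument list in the order shown in the command above and the output file name described in the text; the function names `plot_sh_command` and `expected_plot_name` are hypothetical, not part of DeepLoop.

```python
def plot_sh_command(deeploop_home, anchor_bed_dir, hicorr_path, deeploop_out,
                    chrom, start, end, out_dir="./"):
    """Build the argument list for lib/plot.sh (eight positional parameters).

    The parameter order mirrors the example invocation in this README.
    """
    script = f"{deeploop_home}/DeepLoop/lib/plot.sh"
    args = [deeploop_home, anchor_bed_dir, hicorr_path, deeploop_out,
            chrom, str(start), str(end), out_dir]
    return ["bash", script] + args


def expected_plot_name(chrom, start, end):
    """File name that plot.sh is described as writing into the output directory."""
    return f"{chrom}_{start}_{end}.plot.png"
```

The returned list can be passed to `subprocess.run(cmd, check=True)` from the standard library, once per region of interest.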

Compatible with HiC-Pro, Cooler and HiGlass

Training new models

If you wish to train a new model, ensure you have access to a machine with a GPU and refer to the training walkthrough notebook.

About loop calling

DeepLoop generates clean loop signals. We merge the DeepLoop output from all chromosomes and rank anchor pairs by "LoopStrength" (3rd column), then take confident loops from the top-ranked contact pairs.
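
The merge-and-rank step described above can be sketched as follows. This is an illustration, not the authors' loop caller. It assumes the per-chromosome files follow the "$chr.denoised.anchor.to.anchor" naming used earlier and contain whitespace-delimited lines with two anchor IDs followed by LoopStrength in the 3rd column. The function name `top_loops` and the default cutoff of 1000 pairs are hypothetical choices.

```python
import glob
import heapq
import os


def top_loops(deeploop_out, n_top=1000):
    """Merge per-chromosome DeepLoop output and return the n_top strongest pairs.

    Assumes each line holds: anchor1 anchor2 LoopStrength (whitespace-delimited).
    Returns a list of (anchor1, anchor2, strength), strongest first.
    """
    pattern = os.path.join(deeploop_out, "*.denoised.anchor.to.anchor")

    def rows():
        for path in sorted(glob.glob(pattern)):
            with open(path) as handle:
                for line in handle:
                    fields = line.split()
                    if len(fields) < 3:
                        continue  # skip blank or malformed lines
                    try:
                        strength = float(fields[2])
                    except ValueError:
                        continue  # skip lines whose 3rd column is not numeric
                    yield strength, fields[0], fields[1]

    # nlargest streams over all chromosomes without loading every pair at once.
    best = heapq.nlargest(n_top, rows())
    return [(anchor1, anchor2, strength) for strength, anchor1, anchor2 in best]
```

How many top-ranked pairs count as confident loops depends on the dataset, so `n_top` should be chosen per experiment rather than left at the placeholder default.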