MWSchmid / HiCdat

Hi-C data analysis tool
GNU General Public License v3.0

HiCdat: Hi-C data analysis tool

About HiCdat

User guide including a tutorial for data pre-processing:

user guide

Binaries (note that the MacOSX binary was built on 10.10.1 and in addition tested on 10.8.5):

windows 7 64bit binary

MacOSX 64bit binary

Linux 64bit binary. IMPORTANT: install the Qt4 libraries: sudo apt-get install qt4-default

Data for pre-processing tutorial:

pre-processing data (full data set, ~20 GB)

pre-processing data (reduced data set, ~5 GB)

IMPORTANT: The function for the correlated Hi-C interaction matrix differs slightly from the one we used in the original article on the KNOT in Arabidopsis. The color gradient is less bright and it is reversed: yellow/orange means low/negative correlation and red means high/positive correlation. My thanks to Dr. Syed Islamuddin Shah for reporting this difference. You can check the "new" color gradient with:

colorSet <- colorRampPalette(c("#ffeda0", "#feb24c", "#f03b20"), interpolate = "spline")(64)
plot(c(1:64),rep(1,64),col = colorSet, type = 'h', lwd = 10)

For the tutorial in R, download the package and the two archives below, unpack the archives, open "HiCdat-tutorial-arabidopsis.R" in a text editor, and follow the instructions. The HiCdat R package, HiCdatR, can be installed with install.packages("/path/to/HiCdatR_0.99.0.tar.gz", repos=NULL, type = "source"). Note that HiCdatR requires the R packages "gplots", "randomizeBE", "MASS", and "HiCseg". You can install them with:

install.packages(c("gplots", "randomizeBE", "MASS"))
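# On current R versions, biocLite() has been replaced by BiocManager:
# install.packages("BiocManager"); BiocManager::install("HiCseg")
source("http://bioconductor.org/biocLite.R")
biocLite("HiCseg")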

install.packages("/path/to/HiCdatR_0.99.0.tar.gz", repos=NULL, type = "source")

R-package

R-Scripts (including the R-tutorial script)

files required for the tutorial in R

If you encounter problems, please contact me.

NOTE: if you encounter problems installing one of the R packages (other than HiCdatR), try to install it via Bioconductor:

source("http://bioconductor.org/biocLite.R")
biocLite("insertNameOfPackageHere")

NOTE: On Linux, GLIBC needs to be at least version 2.14. Biolinux6 has a lower version.
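
To find out whether a given Linux system meets the 2.14 requirement, something along the following lines may help. This is only a sketch: the helper name version_ge is made up for illustration, and it assumes a glibc-based system with GNU sort (for the -V option) and getconf available.

```shell
#!/bin/sh
# Hypothetical helper (not part of HiCdat): succeeds if version $1 >= version $2.
version_ge() {
    # sort -V orders version strings; if the smaller of the two is $2, then $1 >= $2
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

required="2.14"
# On glibc systems, getconf prints something like "glibc 2.31"; keep the number only
installed="$(getconf GNU_LIBC_VERSION 2>/dev/null | awk '{print $2}')"

if [ -z "$installed" ]; then
    echo "Could not determine the glibc version (not a glibc system?)"
elif version_ge "$installed" "$required"; then
    echo "glibc $installed is recent enough (>= $required)"
else
    echo "glibc $installed is older than $required"
fi
```

Running ldd --version is a quicker manual check; it prints the glibc version on its first line.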

NOTE: Ay and Noble (2015) list Subread as the aligner for HiCdat in their table. While we do use Subread in the tutorial, we also mention that any other aligner should in principle work with HiCdat. The only requirement is that the aligned reads are in BAM format. For Bowtie2, this can, for example, be done directly with:

bowtie2 --no-unal -x myIndex -U sample.fastq.gz | samtools view -q 10 -h -b - > sample_minQ_10.bam
# -q 10 means that only alignments with a mapping quality (MAPQ) of at least 10 are kept
# this should be sufficient to remove all "non-unique" alignments
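
If there are several FASTQ files, the same command could be wrapped in a loop. The sketch below is an illustration rather than part of HiCdat: bam_name is a hypothetical helper, the index name myIndex is taken from the command above, and the files are assumed to sit in the current directory with a .fastq.gz extension.

```shell
#!/bin/sh
# Hypothetical helper: derive the BAM name from a FASTQ name,
# e.g. reads/sample.fastq.gz -> sample_minQ_10.bam
bam_name() {
    base="$(basename "$1")"
    base="${base%.gz}"
    base="${base%.fastq}"
    base="${base%.fq}"
    printf '%s_minQ_10.bam\n' "$base"
}

# Loop over all compressed FASTQ files in the current directory
for fq in *.fastq.gz; do
    [ -e "$fq" ] || continue   # no matching files: skip the unexpanded pattern
    out="$(bam_name "$fq")"
    # Same alignment and filtering step as above, one file at a time
    bowtie2 --no-unal -x myIndex -U "$fq" | samtools view -q 10 -h -b - > "$out"
done
```

Each input file then yields one filtered BAM file named after the sample, matching the naming used in the single-file command.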