dozmorovlab / TADCompare

Package for analysis and characterization of differential TADs
https://dozmorovlab.github.io/TADCompare/

Errors when working with .cool files and HiC-Pro files #19

Open elinefurz opened 11 months ago

elinefurz commented 11 months ago

Hello.

I am trying to use TADCompare on my two Hi-C matrices, but have run into some errors when following sections 3.6 and 3.7 of your guide (https://www.bioconductor.org/packages/release/bioc/vignettes/TADCompare/inst/doc/Input_Data.html).

First, I tried using my .cool files, but when this did not work I tried the output files from HiC-Pro, as this is the program used to process my Hi-C data.

I am very new to this, and might be overlooking something. Do you have any advice for me?

This is my code:

Working with .cool files

Read in data

cool_mat_CC <- read.table("CC03.40000_balanced.txt")
cool_mat_TT <- read.table("TT20.40000_balanced.txt")

Convert to sparse 3-column matrix using cooler2sparse from HiCcompare

sparse_mat_CC <- HiCcompare::cooler2sparse(cool_mat_CC)
sparse_mat_TT <- HiCcompare::cooler2sparse(cool_mat_TT)

Run TADCompare

diff_tads = lapply(names(sparse_mat_CC), function(x) {
  TADCompare(sparse_mat_CC[[x]], sparse_mat_TT[[x]], resolution = 40000)
})

Error in colSums(cont_mat1) : 'x' must be an array of at least two dimensions
In addition: Warning messages:
1: In mean.default(point_dists1, na.rm = TRUE) : argument is not numeric or logical: returning NA
2: In mean.default(point_dists2, na.rm = TRUE) : argument is not numeric or logical: returning NA
3: In mean.default(point_dists1, na.rm = TRUE) : argument is not numeric or logical: returning NA
4: In mean.default(point_dists2, na.rm = TRUE) : argument is not numeric or logical: returning NA
5: In mean.default(point_dists1, na.rm = TRUE) : argument is not numeric or logical: returning NA
6: In mean.default(point_dists2, na.rm = TRUE) : argument is not numeric or logical: returning NA
7: In mean.default(point_dists1, na.rm = TRUE) : argument is not numeric or logical: returning NA
8: In mean.default(point_dists2, na.rm = TRUE) : argument is not numeric or logical: returning NA

Working with HiC-Pro files

Read in both files

mat_CC <- read.table("CC03.40000_iced.matrix")
bed_CC <- read.table("CC03.40000_abs.bed")

Matrix 2

mat_TT <- read.table("TT20.40000_iced.matrix")
bed_TT <- read.table("TT20.40000_abs.bed")

Convert to modified bed format

sparse_mats_CC <- HiCcompare::hicpro2bedpe(mat_CC, bed_CC)
sparse_mats_TT <- HiCcompare::hicpro2bedpe(mat_TT, bed_TT)

Remove empty matrices if necessary

sparse_mats$cis = sparse_mats$cis[sapply(sparse_mats, nrow) != 0]

Go through all pairwise chromosomes and run TADCompare

sparse_tads = lapply(1:length(sparse_mats_CC$cis), function(z) {
  x <- sparse_mats_CC$cis[[z]]
  y <- sparse_mats_TT$cis[[z]]

  # Pull out chromosome
  chr <- x[, 1][1]

  # Subset to make three column matrix
  x <- x[, c(2, 5, 7)]
  y <- y[, c(2, 5, 7)]

  # Run SpectralTAD
  comp <- TADCompare(x, y, resolution = 40000)
  return(list(comp, chr))
})

Error in TADCompare(x, y, resolution = 40000) : Matrix 1 is too small to convert to full

mdozmorov commented 11 months ago

Hi @elinefurz, thanks for using TADCompare. The cool format has changed, so issues are expected. It is on our list to investigate.
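In the meantime, the colSums error above says that the object reaching TADCompare did not have two dimensions, and the warnings say that some values were not numeric. A minimal sketch of how one might inspect each converted object before running the comparison is given here. `describe_input` is a hypothetical helper written only for illustration (it is not part of TADCompare or HiCcompare), and the toy data frame stands in for one element of the converted list.

```r
# Hypothetical helper (not part of TADCompare or HiCcompare): summarize one
# candidate input so that format problems show up before the comparison.
describe_input <- function(x) {
  list(
    class       = class(x)[1],
    dims        = dim(x),  # NULL for vectors, plain lists, and NULL itself
    all_numeric = (is.data.frame(x) && all(vapply(x, is.numeric, logical(1)))) ||
                  (is.matrix(x) && is.numeric(x))
  )
}

# Toy stand-in for a sparse 3-column matrix (bin start 1, bin start 2, count)
toy <- data.frame(V1 = c(0, 0, 40000),
                  V2 = c(0, 40000, 40000),
                  V3 = c(10, 4, 12))
info <- describe_input(toy)
str(info)

# An object with no dimensions, such as NULL, is flagged by the same check
bad <- describe_input(NULL)
str(bad)
```

Applying the same helper to every element of the converted lists, for example with `lapply(sparse_mat_CC, describe_input)`, would show whether any element is missing, non-numeric, or not three columns wide.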

As for HiC-Pro: can you share subsets of the two files that I can use to reproduce the error?
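One possible way to cut such a subset is sketched below. It assumes the usual HiC-Pro layout: a bed file with four columns (chromosome, start, end, bin id) and a matrix file with three columns (bin id 1, bin id 2, count). `subset_hicpro` is a hypothetical helper, and the toy data frames stand in for the objects read from the `_abs.bed` and `_iced.matrix` files. A single-chromosome subset is not guaranteed to trigger the same error, so it is worth re-running the failing code on the subset before sharing it.

```r
# Hypothetical helper: keep the bins of one chromosome from a HiC-Pro bed
# table and the matrix rows whose two bin ids both fall in that chromosome.
subset_hicpro <- function(mat, bed, chrom) {
  bed_sub <- bed[bed[[1]] == chrom, , drop = FALSE]
  keep    <- mat[[1]] %in% bed_sub[[4]] & mat[[2]] %in% bed_sub[[4]]
  list(mat = mat[keep, , drop = FALSE], bed = bed_sub)
}

# Toy data standing in for read.table() output of the bed and matrix files
bed <- data.frame(V1 = c("chr1", "chr1", "chr2"),
                  V2 = c(0, 40000, 0),
                  V3 = c(40000, 80000, 40000),
                  V4 = 1:3)
mat <- data.frame(V1 = c(1, 1, 2, 3),
                  V2 = c(1, 2, 3, 3),
                  V3 = c(10.5, 3.2, 1.1, 8.0))

small <- subset_hicpro(mat, bed, "chr1")
print(small$bed)
print(small$mat)

# The subset could then be written out for sharing (file names are placeholders):
# write.table(small$mat, "subset.matrix", sep = "\t", quote = FALSE,
#             row.names = FALSE, col.names = FALSE)
# write.table(small$bed, "subset.bed", sep = "\t", quote = FALSE,
#             row.names = FALSE, col.names = FALSE)
```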