Closed Phlya closed 6 years ago
You can save an R object by using the save()
function like this: save(hic.table, file = 'path/to/file.RDA')
And then load the object with load(file = 'path/to/file.RDA)
. Or you should be able to write it as a text file using write.table(hic.table, file = "path/to/file.txt")
. If write.table
is not working can you print the top few rows of the object you are trying to save and paste them here so I can see?
The following should work: write.csv(do.call(rbind, hic.table), "hic_table.csv")
Hi, thanks for your comments! I feel like I am really not getting something... This is my script:
#!/usr/bin/env Rscript
library(HiCcompare)
args = commandArgs(trailingOnly=TRUE)
dat1 <- read.table(args[1], header=FALSE, col.names=c("chr1", "start1", "end1", "chr2", "start2", "end2", "IF"))
dat2 <- read.table(args[2], header=FALSE, col.names=c("chr1", "start1", "end1", "chr2", "start2", "end2", "IF"))
hic.list <- create.hic.table(dat1, dat2)
hic.list <- total_sum(hic.list)
hic.list <- hic_loess(hic.list, Plot=TRUE, Plot.smooth=FALSE, parallel=TRUE)
hic.list <- hic_compare(hic.list, A.min = NA, adjust.dist = TRUE, p.method = 'fdr', Plot = TRUE)
head(hic.list)
write.csv(do.call(rbind, hic.list), args[3])
I run it on a cluster like this:
#!/bin/sh
# Grid Engine options (lines prefixed with #$)
#$ -cwd
#$ -N HiCCompare
#$ -l h_rt=06:00:00
#$ -l h_vmem=32G
#$ -pe sharedmem 4
#$ -V
Rscript 03_HiCcompare.R $1 $2 $3
qsub 03_launch_HiCcompare.sh ../pixels/serum-KM.1000000.pixels.tsv ../pixels/2i.1000000.pixels.tsv ../pixels/serum_vs_2i_hiccompare_1Mb.tsv
This is the output I get from stderr:
Loading required package: dplyr
Attaching package: ‘dplyr’
The following objects are masked from ‘package:stats’:
filter, lag
The following objects are masked from ‘package:base’:
intersect, setdiff, setequal, union
Warning messages:
1: In sum(hic.table$IF2) : integer overflow - use sum(as.numeric(.))
2: In sum(hic.table$IF1) : integer overflow - use sum(as.numeric(.))
Error in data.table::rbindlist(hic.list) :
Item 1 of list input is not a data.frame, data.table or list
Calls: total_sum -> <Anonymous>
Execution halted
What am I doing wrong?..
Are the sparse matrices that you are reading in for a single chromosome or did you combine all the chromosomes into a single file? HiCcompare needs the input to be a separate matrix for each chromosome, i.e if you are trying to compare the entire human genome you shave have 23 separate objects for each group. I think from your code you are only comparing a single pair of matrices so you should not be using the total_sum
function. Additionally there is no need to rbindlist
if you're only using data for a single chromosome. If you do want to look at all the chromosomes your script should look more like this:
# read your data, assuming the paths to the files are stored in group1_filenames and group2_filenames
group1 <- list()
group2 <- list()
for (i in 1:23) {
group1[[i]] <- read.table(group1_filenames[i], header=FALSE, col.names=c("chr1", "start1", "end1", "chr2", "start2", "end2", "IF"))
group2[[i]] <- read.table(group2_filenames[i], header=FALSE, col.names=c("chr1", "start1", "end1", "chr2", "start2", "end2", "IF"))
}
# now create the hic.table objects
hic.list <- mapply(create.hic.table, group1, group2, SIMPLIFY = FALSE)
# set up parallelization
library(BiocParallel)
# Check how many cores you have
numCores <- parallel::detectCores()
numCores
# Set the number of cores at least one less than the total number
register(MulticoreParam(workers = numCores - 1), default = TRUE)
# normalize
hic.list <- total_sum(hic.list)
hic.list <- hic_loess(hic.list, parallel=TRUE)
# compare
hic.list <- hic_compare(hic.list, A.min = NA, adjust.dist = TRUE, p.method = 'fdr', Plot = TRUE, parallel = TRUE)
# save
write.csv(rbindlist(hic.list), file = "path/to/file.csv")
There are more examples in the vignette which you can access using browseVignettes("HiCcompare")
.
It is whole genome data. I thought I read somewhere that create.hic.table
automatically creates a list of tables if input has multiple chromosomes?..
That is the problem then. Currently create.hic.table
doesn't split it up into a list if you input multiple chromosomes. You can work around this by just splitting your data up by chromosome before inputting it into create.hic.table
.
# split the data by chromosome
dat1 <- split(dat1, dat1$chr1)
dat2 <- split(dat2, dat2$chr1)
# now use mapply to create a list of hic.tables
hic.list <- mapply(create.hic.table, dat1, dat2, SIMPLIFY = FALSE)
Brilliant, thanks! And is it OK if the data contain inter-chromosomal interactions?
No, HiCcompare
only works on the intra-chromosomal interactions. You should remove any rows of your data where chr1 != chr2
before inputting it into HiCcompare
.
Thanks a lot, it's working now!
For future reference, this is my script:
library(HiCcompare)
library(BiocParallel)
args = commandArgs(trailingOnly=TRUE)
dat1 <- read.table(args[1], header=FALSE, col.names=c("chr1", "start1", "end1", "chr2", "start2", "end2", "IF"))
dat2 <- read.table(args[2], header=FALSE, col.names=c("chr1", "start1", "end1", "chr2", "start2", "end2", "IF"))
dat1 <- dat1[dat1$chr1==dat1$chr2, ]
dat2 <- dat2[dat2$chr1==dat2$chr2, ]
dat1 <- split(dat1, dat1$chr1)
dat2 <- split(dat2, dat2$chr1)
hic.list <- mapply(create.hic.table, dat1, dat2, SIMPLIFY = FALSE, scale=FALSE)
hic.list <- total_sum(hic.list)
register(MulticoreParam(workers = 3), default = TRUE)
hic.list <- hic_loess(hic.list, Plot=TRUE, Plot.smooth=FALSE, parallel=TRUE)
hic.list <- hic_compare(hic.list, A.min = NA, adjust.dist = TRUE, p.method = 'fdr', Plot = TRUE, parallel=TRUE)
hic.list <- do.call(rbind, hic.list)
hic.list <- hic.list[hic.list$p.adj<0.05,]
write.table(hic.list, args[3])
Thanks, @Phlya, glad it worked. @jstansfield0, please, add the script to the vignette as an example how to run HiCcompare on a cluster, acknowledge Ilya. Then, close the issue. Thanks!
Hi, how can I actually save the results in the end? I never use R, so this might be a stupid question...
I am trying to apply HiCcompare genome-wide and therefore have a list of tables. I thought perhaps I can just
rbindlist
it all together andwrite.table
, but this doesn't work:Item 1 of list input is not a data.frame, data.table or list