RegulatoryGenomicsGroup / chicdiff

A differential caller for capture Hi-C data
4 stars 3 forks source link

Internal error: Failed to allocate counts or TMP when assigning g in gforce #5

Closed maharshi14 closed 5 years ago

maharshi14 commented 5 years ago

Hi!

My Chicdiff script is as follows:

library("Chicdiff")

setwd("scratch/Chicdiff/")

designDir <- file.path("/home/s1786487/scratch/Chicdiff/DesignDir")

countData <- list( growing = c(Growing.Rep1 = file.path("/home/s1786487/scratch/Chicdiff/Growing.Rep1/chicagoTools.downsampled.chinput"), Growing.Rep2 = file.path("/home/s1786487/scratch/Chicdiff/Growing.Rep2/chicagoTools.downsampled.chinput") ),

senesence = c(Senescent.Rep1 = file.path("/home/s1786487/scratch/Chicdiff/Senescent.Rep1/chicagoTools.downsampled.chinput"), Senescent.Rep2 = file.path("/home/s1786487/scratch/Chicdiff/Senescent.Rep2/chicagoTools.downsampled.chinput") ) )

chicagoData <- list( growing = c(Growing.Rep1 = file.path("/home/s1786487/scratch/Chicdiff/Growing.Rep1/Growing.Rep1.Rds"), Growing.Rep2 = file.path("/home/s1786487/scratch/Chicdiff/Growing.Rep2/Growing.Rep2.Rds") ),

senesence = c(Senescent.Rep1 = file.path("/home/s1786487/scratch/Chicdiff/Senescent.Rep1/Sen.Rep1.Rds"), Senescent.Rep2 = file.path("/home/s1786487/scratch/Chicdiff/Senescent.Rep2/Sen.Rep2.Rds") ) )

peakFiles_growing <- "/home/s1786487/scratch/Chicdiff/PeakFiles/Growing_merged-matrix.txt" peakFiles_senescent <- "/home/s1786487/scratch/Chicdiff/PeakFiles/Senescent_merged-matrix.txt"

chicdiff.settings <- setChicdiffExperiment(designDir = designDir, chicagoData = chicagoData, countData = countData, peakfiles = c(peakFiles_growing,peakFiles_senescent), #not sure if listing multiple peak files in a vector is the problem here or not outprefix="Chicdiff_Growing-vs-Senescent", settings = list(parallel=F))

output <- chicdiffPipeline(chicdiff.settings)

After the following standard outputs:

*** Running getRegionUniverse

*** Running getControlRegionUniverse

*** Running getFullRegionData

Reading data for significant interactions

Reading Chicago dataset 1 of 4 : growing.Growing.Rep1 Collecting Bmean and Tmean Recalculating distSign Collecting bait parameters (s_j and tblb) Collecting other end parameters (s_i and tlb) Collecting Tmean Imputing missing Tmeans Reconstructing the distance function Reconstructing Bmean

Reading Chicago dataset 2 of 4 : growing.Growing.Rep2 Collecting Bmean and Tmean Recalculating distSign Collecting bait parameters (s_j and tblb) Collecting other end parameters (s_i and tlb) Collecting Tmean Imputing missing Tmeans Reconstructing the distance function Reconstructing Bmean

Reading Chicago dataset 3 of 4 : senesence.Senescent.Rep1 Collecting Bmean and Tmean Recalculating distSign Collecting bait parameters (s_j and tblb) Collecting other end parameters (s_i and tlb) Collecting Tmean Imputing missing Tmeans Reconstructing the distance function Reconstructing Bmean

Reading Chicago dataset 4 of 4 : senesence.Senescent.Rep2 Collecting Bmean and Tmean Recalculating distSign Collecting bait parameters (s_j and tblb) Collecting other end parameters (s_i and tlb) Collecting Tmean Imputing missing Tmeans Reconstructing the distance function Reconstructing Bmean

Saving counts

Reading count data for growing.Growing.Rep1 -------------------------------------------------- ================================================== Reading count data for growing.Growing.Rep2
==================================================
Reading count data for senesence.Senescent.Rep1 -------------------------------------------------- ================================================== Reading count data for senesence.Senescent.Rep2
==================================================

Collecting distances for CountOut Processing count data

Reading data for control interactions

Reading Chicago dataset 1 of 4 : growing.Growing.Rep1 Collecting Bmean and Tmean Recalculating distSign Collecting bait parameters (s_j and tblb) Collecting other end parameters (s_i and tlb)

I get the following error:

Error in gforce(thisEnv, jsub, o, f, len__, irows) : Internal error: Failed to allocate counts or TMP when assigning g in gforce

Can't seem to troubleshoot it so far, any ideas on the possible causes?

Thanks!

mspivakov commented 5 years ago

This looks like a data.table issue. Could it be that you've run out of memory?

maharshi14 commented 5 years ago

Thank you so much for the error, indeed it was a memory issue.

It ran perfectly until the IHWcorrection, where it fails to save the plots made for the unweighted p-values decision boundaries, with the error:

Error: 'ggsave' is not an exported object from 'namespace:cowplot'

I have tried using both ggplot2 and cowplot libraries attached and then not attached, but the error is still the same.

Any help on this would be much appreciated, thanks!

maharshi14 commented 5 years ago

Okay so I realised this was an issue already, I used trace to edit the IHWcorrection code, let's see if it works now!

Thanks!

mspivakov commented 5 years ago

Thanks for the heads up, we will have a look why this issue pops up again. Just to double check - are you using the latest versions of cowplot and ggplot2?

mspivakov commented 5 years ago

We couldn't reproduce this error, so can you please post your sessionInfo()?

maharshi14 commented 5 years ago

So as I mentioned I used trace to add the if loop from the last thread

if (exists("ggsave", where="package:cowplot")){ cowplot::ggsave(paste0(outprefix, "_diffbaitPlot",".",device), device = device, path = "./") } else{ }

But it gave me the same error again. My cowplot version is 1.0.0 and ggplot2 version is 3.2.0.

Here's the sesssionInfo():

R version 3.6.0 (2019-04-26) Platform: x86_64-pc-linux-gnu (64-bit) Running under: Scientific Linux 7.2 (Nitrogen)

Matrix products: default BLAS: /gpfs/igmmfs01/software/pkg/el7/apps/R/3.6.0/lib64/R/lib/libRblas.so LAPACK: /gpfs/igmmfs01/software/pkg/el7/apps/R/3.6.0/lib64/R/lib/libRlapack.so

Random number generation: RNG: Mersenne-Twister Normal: Inversion Sample: Rounding

locale: [1] LC_CTYPE=en_GB.UTF-8 LC_NUMERIC=C LC_TIME=en_GB.UTF-8 LC_COLLATE=en_GB.UTF-8
[5] LC_MONETARY=en_GB.UTF-8 LC_MESSAGES=en_GB.UTF-8 LC_PAPER=en_GB.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C

attached base packages: [1] parallel stats4 stats graphics grDevices utils datasets methods base

other attached packages: [1] ggplot2_3.2.0 cowplot_1.0.0 Chicdiff_0.4 S4Vectors_0.22.1 BiocGenerics_0.30.0 Chicago_1.12.0
[7] data.table_1.12.2

loaded via a namespace (and not attached): [1] Biobase_2.45.0 IHW_1.12.0 bit64_0.9-7 splines_3.6.0
[5] Formula_1.2-3 assertthat_0.2.1 latticeExtra_0.6-28 blob_1.2.0
[9] GenomeInfoDbData_1.2.1 slam_0.1-45 yaml_2.2.0 pillar_1.4.2
[13] RSQLite_2.1.2 backports_1.1.4 lattice_0.20-38 glue_1.3.1
[17] digest_0.6.20 GenomicRanges_1.36.0 RColorBrewer_1.1-2 XVector_0.24.0
[21] checkmate_1.9.4 colorspace_1.4-1 htmltools_0.3.6 Matrix_1.2-17
[25] DESeq2_1.24.0 XML_3.98-1.20 pkgconfig_2.0.2 genefilter_1.66.0
[29] zlibbioc_1.30.0 purrr_0.3.2 xtable_1.8-4 scales_1.0.0
[33] fdrtool_1.2.15 BiocParallel_1.18.0 htmlTable_1.13.1 tibble_2.1.3
[37] annotate_1.62.0 IRanges_2.18.1 withr_2.1.2 SummarizedExperiment_1.14.0 [41] hexbin_1.27.3 nnet_7.3-12 lazyeval_0.2.2 survival_2.44-1.1
[45] magrittr_1.5 crayon_1.3.4 memoise_1.1.0 MASS_7.3-51.4
[49] foreign_0.8-71 tools_3.6.0 matrixStats_0.54.0 stringr_1.4.0
[53] locfit_1.5-9.1 munsell_0.5.0 cluster_2.1.0 DelayedArray_0.10.0
[57] AnnotationDbi_1.46.0 compiler_3.6.0 GenomeInfoDb_1.20.0 rlang_0.4.0
[61] grid_3.6.0 RCurl_1.95-4.12 rstudioapi_0.10 Delaporte_7.0.2
[65] htmlwidgets_1.3 bitops_1.0-6 base64enc_0.1-3 FMStable_0.1-2
[69] gtable_0.3.0 DBI_1.0.0 R6_2.4.0 gridExtra_2.3
[73] knitr_1.23 dplyr_0.8.3 zeallot_0.1.0 bit_1.1-14
[77] Hmisc_4.2-0 lpsymphony_1.12.0 stringi_1.4.3 Rcpp_1.0.2
[81] harmonicmeanp_2.0 geneplotter_1.62.0 vctrs_0.2.0 rpart_4.1-15
[85] acepack_1.4.1 tidyselect_0.2.5 xfun_0.8

vmalysheva commented 5 years ago

Hi,

You have got an older version of Chicdiff. The current is 0.5 :)

maharshi14 commented 5 years ago

Hi,

Thanks for letting me know, but shouldn't it work on version 0.4 as well? I corrected the IHWcorrection function as follows:

body(IHWcorrection)

{ baitmapfile = chicdiff.settings[["baitmapfile"]] device = chicdiff.settings[["device"]] outprefix = chicdiff.settings[["outprefix"]] out <- DESeqOut RU.recast <- FullRegionData RU.distances <- RU.recast[, list(avDist = mean(distSign)), by = "regionID"] out$avDist <- RU.distances$avDist out$uniform <- runif(nrow(out)) out$shuff <- sample(out$pvalue) message("Comparison against p-vals for out") setDT(out) out.control <- DESeqOutControl RU.recastControl <- FullControlRegionData RU.distancesControl <- RU.recastControl[, list(avDist = mean(distSign)), by = "regionID"] out.control$avDist <- RU.distancesControl$avDist out.control$uniform <- runif(nrow(out.control)) out.control$shuff <- sample(out.control$pvalue) message("Comparison against p-vals for outcontrol") as.data.frame(out.control) ihwRes <- ihw(pvalue ~ abs(avDist), data = out.control, alpha = 0.05) message("Trained weights on the control sample") if (DiagPlot == TRUE) { plot(ihwRes) cowplot::ggsave(paste0(outprefix, "_IHWweightPlot.png"), device = "png", path = "./") dev.off() plot(ihwRes, what = "decisionboundary") cowplot::ggsave(paste0(outprefix, "_IHWdecisionBoundaryPlot.png"), device = "png", path = "./") dev.off() } test <- ihwRes@df setDT(test) test[, log(covariate), by = "group"] distLookup <- test[, list(avgLogDist = mean(log(covariate)), minLogDist = min(log(covariate)), maxLogDist = max(log(covariate))), by = "group"] distLookup <- distLookup[!is.na(group), ] setkey(distLookup, group) if (distLookup[, !identical(as.integer(group), seq_along(group))]) { stop("Assumption violated") } w <- ihwRes@weights distLookup[, :=(group, as.integer(group))] distLookup$avWeights <- rowSums(w)/ncol(w) distLookup$minLogDist[1] = 0 distLookup$maxLogDist[nrow(distLookup)] = Inf message("Learned distance dependency") out[, :=(avgLogDist, log(abs(avDist)))] breaks <- (c(distLookup$minLogDist, Inf) + c(0, distLookup$maxLogDist))/2 out$group <- as.integer(cut(log(abs(out$avDist)), breaks)) setkey(distLookup, group) setkey(out, group) out <- merge(out, distLookup[, c("group", "avWeights"), with = FALSE], all.x = TRUE, by = "group") out$weight <- out$avWeights/mean(out$avWeights) out[, :=(weighted_pvalue, pvalue/weight)] out[, :=(weighted_padj, p.adjust(weighted_pvalue, method = "BH"))] message("applied to test data") if (diffbaitPlot == TRUE) { sel <- order(out$weighted_padj) baits <- sample(head(unique(out[sel]$baitID), 100), 4) plotDiffBaits(output = out, countput = countput, baitmapfile = baitmapfile, baits = baits) if (exists("ggsave", where = "package:cowplot")) { cowplot::ggsave(paste0(outprefix, "_diffbaitPlot", ".", device), device = device, path = "./") } else { cowplot::ggsave2(paste0(outprefix, "_diffbaitPlot", ".", device), device = device, path = "./") } dev.off() } saveRDS(out, paste0(outprefix, "_results", suffix, ".Rds")) return(out) }

Do you mind telling me where do I need to make changes here or anywhere else to make it work? I will wait until the new version can be installed by our administrator.

Thanks!

vmalysheva commented 5 years ago

Thanks for proposing this correction. Sadly, it doesn't work like that, because for where="package:cowplot" to work, you need cowplot already 'requested for use' before. However, at that point in Chicdiff, the cowplot was not called yet and thus where="package:cowplot" will not work.

One way to mitigate this issue is to straightaway go for: cowplot::ggsave2() without the if else.

The version 0.5 is essentially doing that.

Also there are two other instances prior the place where you inserted the 'ifelse', where the cowplot::ggsave was used. For control ihw plots - we simply replaced it for ggsave (without cowplot), which is coming form ggplot2.

However for the plotDiffbaits we need the cowplot's ggsave2. So, as I mentioned before you can just do cowplot::ggsave2() and this should work :)