r3fang / SnapATAC

Analysis Pipeline for Single Cell ATAC-seq
GNU General Public License v3.0

Unable to run addBmatToSnap - aborts R session #89

Closed ccruizm closed 4 years ago

ccruizm commented 5 years ago

Good day!

I have not had issues running SnapATAC before, but after a recent update addBmatToSnap fails. Once I have the x.sp object and call the function, the following error appears:

 *** caught segfault ***
address 0x7ffff304bbf8, cause 'memory not mapped'

Traceback:
 1: H5Dread(h5dataset = h5dataset, h5spaceFile = h5spaceFile, h5spaceMem = h5spaceMem,     compoundAsDataFrame = compoundAsDataFrame, drop = drop, ...)
 2: doTryCatch(return(expr), name, parentenv, handler)
 3: tryCatchOne(expr, names, parentenv, handlers[[1L]])
 4: tryCatchList(expr, classes, parentenv, handlers)
 5: tryCatch({    obj <- H5Dread(h5dataset = h5dataset, h5spaceFile = h5spaceFile,         h5spaceMem = h5spaceMem, compoundAsDataFrame = compoundAsDataFrame,         drop = drop, ...)}, error = function(e) {    err <- h5checkFilters(h5dataset)    if (nchar(err) > 0)         stop(err, call. = FALSE)    else stop(e)})
 6: h5readDataset(h5dataset, index = index, start = start, stride = stride,     block = block, count = count, compoundAsDataFrame = compoundAsDataFrame,     drop = drop, ...)
 7: h5read(file, paste("AM", bin.size, "binChrom", sep = "/"))
 8: doTryCatch(return(expr), name, parentenv, handler)
 9: tryCatchOne(expr, names, parentenv, handlers[[1L]])
10: tryCatchList(expr, classes, parentenv, handlers)
11: tryCatch(binChrom <- h5read(file, paste("AM", bin.size, "binChrom",     sep = "/")), error = function(e) {    stop(paste("Warning @readaddBmatSnap: 'AM/bin.size/binChrom' not found in ",         file))})
12: readBins(x, bin.size = bin.size)
13: FUN(X[[i]], ...)
14: lapply(fileList, function(x) {    readBins(x, bin.size = bin.size)})
15: addBmatToSnap.default(x.sp, bin.size = 1000)
16: addBmatToSnap(x.sp, bin.size = 1000)

Possible actions:
1: abort (with core dump, if enabled)
2: normal R exit
3: exit R without saving workspace
4: exit R saving workspace

I am running R inside a conda environment, and the issue occurs with both R 3.5.1 and 3.6.1.

I made a fresh installation of R in an independent environment and installed SnapATAC from scratch, but the problem persists. I am running a single dataset with 5000 cells on an HPC, so memory should not be a problem. I have used the package before and, as mentioned, did not have this issue. What do you think the problem might be?

Thanks in advance

r3fang commented 5 years ago

What's the error message? How many cells/snap files are you analyzing?

ccruizm commented 5 years ago

By the way, when I run R from a Jupyter notebook, the traceback is longer and the kernel is killed immediately:

*** caught segfault ***
address 0x7fff32d25a08, cause 'memory not mapped'

Traceback:
 1: H5Dread(h5dataset = h5dataset, h5spaceFile = h5spaceFile, h5spaceMem = h5spaceMem,     compoundAsDataFrame = compoundAsDataFrame, drop = drop, ...)
 2: doTryCatch(return(expr), name, parentenv, handler)
 3: tryCatchOne(expr, names, parentenv, handlers[[1L]])
 4: tryCatchList(expr, classes, parentenv, handlers)
 5: tryCatch({    obj <- H5Dread(h5dataset = h5dataset, h5spaceFile = h5spaceFile,         h5spaceMem = h5spaceMem, compoundAsDataFrame = compoundAsDataFrame,         drop = drop, ...)}, error = function(e) {    err <- h5checkFilters(h5dataset)    if (nchar(err) > 0)         stop(err, call. = FALSE)    else stop(e)})
 6: h5readDataset(h5dataset, index = index, start = start, stride = stride,     block = block, count = count, compoundAsDataFrame = compoundAsDataFrame,     drop = drop, ...)
 7: h5read(file, paste("AM", bin.size, "binChrom", sep = "/"))
 8: doTryCatch(return(expr), name, parentenv, handler)
 9: tryCatchOne(expr, names, parentenv, handlers[[1L]])
10: tryCatchList(expr, classes, parentenv, handlers)
11: tryCatch(binChrom <- h5read(file, paste("AM", bin.size, "binChrom",     sep = "/")), error = function(e) {    stop(paste("Warning @readaddBmatSnap: 'AM/bin.size/binChrom' not found in ",         file))})
12: readBins(x, bin.size = bin.size)
13: FUN(X[[i]], ...)
14: lapply(fileList, function(x) {    readBins(x, bin.size = bin.size)})
15: addBmatToSnap.default(x.sp, bin.size = 1000)
16: addBmatToSnap(x.sp, bin.size = 1000)
17: eval(expr, envir, enclos)
18: eval(expr, envir, enclos)
19: withVisible(eval(expr, envir, enclos))
20: withCallingHandlers(withVisible(eval(expr, envir, enclos)), warning = wHandler,     error = eHandler, message = mHandler)
21: doTryCatch(return(expr), name, parentenv, handler)
22: tryCatchOne(expr, names, parentenv, handlers[[1L]])
23: tryCatchList(expr, classes, parentenv, handlers)
24: tryCatch(expr, error = function(e) {    call <- conditionCall(e)    if (!is.null(call)) {        if (identical(call[[1L]], quote(doTryCatch)))             call <- sys.call(-4L)        dcall <- deparse(call)[1L]        prefix <- paste("Error in", dcall, ": ")        LONG <- 75L        sm <- strsplit(conditionMessage(e), "\n")[[1L]]        w <- 14L + nchar(dcall, type = "w") + nchar(sm[1L], type = "w")        if (is.na(w))             w <- 14L + nchar(dcall, type = "b") + nchar(sm[1L],                 type = "b")        if (w > LONG)             prefix <- paste0(prefix, "\n  ")    }    else prefix <- "Error : "    msg <- paste0(prefix, conditionMessage(e), "\n")    .Internal(seterrmessage(msg[1L]))    if (!silent && isTRUE(getOption("show.error.messages"))) {        cat(msg, file = outFile)        .Internal(printDeferredWarnings())    }    invisible(structure(msg, class = "try-error", condition = e))})
25: try(f, silent = TRUE)
26: handle(ev <- withCallingHandlers(withVisible(eval(expr, envir,     enclos)), warning = wHandler, error = eHandler, message = mHandler))
27: timing_fn(handle(ev <- withCallingHandlers(withVisible(eval(expr,     envir, enclos)), warning = wHandler, error = eHandler, message = mHandler)))
28: evaluate_call(expr, parsed$src[[i]], envir = envir, enclos = enclos,     debug = debug, last = i == length(out), use_try = stop_on_error !=         2L, keep_warning = keep_warning, keep_message = keep_message,     output_handler = output_handler, include_timing = include_timing)
29: evaluate(request$content$code, envir = .GlobalEnv, output_handler = oh,     stop_on_error = 1L)
30: doTryCatch(return(expr), name, parentenv, handler)
31: tryCatchOne(expr, names, parentenv, handlers[[1L]])
32: tryCatchList(expr, names[-nh], parentenv, handlers[-nh])
33: doTryCatch(return(expr), name, parentenv, handler)
34: tryCatchOne(tryCatchList(expr, names[-nh], parentenv, handlers[-nh]),     names[nh], parentenv, handlers[[nh]])
35: tryCatchList(expr, classes, parentenv, handlers)
36: tryCatch(evaluate(request$content$code, envir = .GlobalEnv, output_handler = oh,     stop_on_error = 1L), interrupt = function(cond) interrupted <<- TRUE,     error = .self$handle_error)
37: executor$execute(msg)
38: handle_shell()
39: kernel$run()
40: IRkernel::main()
An irrecoverable exception occurred. R is aborting now ...

ccruizm commented 5 years ago

I just tested the pipeline on my own MacBook and hit the same problem. I do not yet know where the problem is or why this is happening.

[Screenshot 2019-09-08 at 09:26:44]

This is the info of my session:

R version 3.6.0 (2019-04-26)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS Mojave 10.14.6

Matrix products: default
BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] ggplot2_3.2.1     viridisLite_0.3.0 SnapATAC_1.0.0    rhdf5_2.28.0      Matrix_1.2-17    

loaded via a namespace (and not attached):
 [1] locfit_1.5-9.1         tidyselect_0.2.5       purrr_0.3.2            lattice_0.20-38        colorspace_1.4-1       doSNOW_1.0.18         
 [7] snow_0.4-3             stats4_3.6.0           rlang_0.4.0            pillar_1.4.2           glue_1.3.1             withr_2.1.2           
[13] BiocGenerics_0.30.0    plot3D_1.1.1           RColorBrewer_1.1-2     GenomeInfoDbData_1.2.1 foreach_1.4.7          plyr_1.8.4            
[19] zlibbioc_1.30.0        munsell_0.5.0          gtable_0.3.0           codetools_0.2-16       labeling_0.3           misc3d_0.8-4          
[25] IRanges_2.18.2         doParallel_1.0.15      GenomeInfoDb_1.20.0    irlba_2.3.3            parallel_3.6.0         Rcpp_1.0.2            
[31] edgeR_3.26.8           scales_1.0.0           limma_3.40.6           S4Vectors_0.22.0       XVector_0.24.0         gridExtra_2.3         
[37] RANN_2.6.1             Rtsne_0.15             dplyr_0.8.3            GenomicRanges_1.36.0   grid_3.6.0             tools_3.6.0           
[43] bitops_1.0-6           magrittr_1.5           RCurl_1.95-4.12        lazyeval_0.2.2         tibble_2.1.3           crayon_1.3.4          
[49] bigmemory.sri_0.1.3    bigmemory_4.5.33       pkgconfig_2.0.2        assertthat_0.2.1       rstudioapi_0.10        iterators_1.0.12      
[55] viridis_0.5.1          Rhdf5lib_1.6.0         R6_2.4.0               igraph_1.2.4.1         compiler_3.6.0  

r3fang commented 5 years ago

Can you set num.cores=1 and try again?

-- Rongxin Fang Ph.D. Student, Ren Lab Ludwig Institute for Cancer Research University of California, San Diego

r3fang commented 5 years ago

If you set num.cores=1 and bin.size=5000, does it work?
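
For readers following along, the retry suggested here could look like the sketch below. It assumes SnapATAC is installed and that `x.sp` is the snap object from the original report. The wrapper name `retry_bmat` is hypothetical, and only the `bin.size` and `num.cores` arguments mentioned in this thread are passed.

```r
# Hypothetical helper (not part of SnapATAC): retry addBmatToSnap with the
# settings suggested in this thread (a larger bin size and a single core).
retry_bmat <- function(x.sp, bin.size = 5000, num.cores = 1) {
  # Fail with a clear message if SnapATAC is not available.
  if (!requireNamespace("SnapATAC", quietly = TRUE)) {
    stop("SnapATAC is not installed")
  }
  SnapATAC::addBmatToSnap(x.sp, bin.size = bin.size, num.cores = num.cores)
}

# Usage, assuming x.sp already exists in the session:
# x.sp <- retry_bmat(x.sp)
```

Note that a segfault is a signal to the R process rather than an R error, so wrapping the call in `tryCatch` would not prevent the session from aborting; the only option is to change the inputs.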

ccruizm commented 5 years ago

Hello @r3fang,

Indeed it works! I had already tried setting num.cores=1 while keeping bin.size=1000, and that always killed the session. What do you think the problem might be?

r3fang commented 5 years ago

Do you know how many cells are in the snap file?

ccruizm commented 5 years ago

5000 cells

r3fang commented 5 years ago

I think this is a memory issue. Loading the 1 kb matrix exceeds R's maximum memory. My suggestion: either increase the R memory threshold or use a larger bin size. Our in-house benchmarking shows there is not much advantage to using 1 kb bins.
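
To give a rough sense of scale, the sketch below compares the cell-by-bin matrix at 1 kb and 5 kb bins. It is back-of-the-envelope arithmetic only: the genome size of about 3.1 Gb is an assumption (the thread does not say which genome was used), the function name `bin_matrix_size` is made up for illustration, and the dense size is an upper bound, since the real matrix is stored sparsely.

```r
# Back-of-the-envelope size of a cell-by-bin matrix at two bin sizes.
# ASSUMPTION: a ~3.1 Gb genome (roughly human-sized); not stated in the thread.
# n_cells = 5000 is the cell count reported in the thread.
genome_bp <- 3.1e9
n_cells   <- 5000

# Hypothetical helper: number of bins and a dense upper bound on memory.
bin_matrix_size <- function(bin_size, n_cells, genome_bp) {
  n_bins  <- ceiling(genome_bp / bin_size)
  entries <- n_bins * n_cells  # kept as double to avoid integer overflow
  data.frame(
    bin_size = bin_size,
    n_bins   = n_bins,
    entries  = entries,
    dense_gb = entries * 8 / 1e9  # 8 bytes per entry if stored densely
  )
}

sizes <- do.call(rbind, lapply(c(1000, 5000), bin_matrix_size,
                               n_cells = n_cells, genome_bp = genome_bp))
print(sizes)
```

Under these assumptions the 1 kb matrix has 3.1 million columns against 620,000 at 5 kb, so it has five times as many potential entries. Actual memory use depends on how sparse the data are.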

ccruizm commented 4 years ago

Thank you very much for your help!