pbreheny / biglasso

biglasso: Extending Lasso Model Fitting to Big Data in R
http://pbreheny.github.io/biglasso/

Memory not mapped #18

Open fanzheng10 opened 6 years ago

fanzheng10 commented 6 years ago

This package has been quite useful for my research, and I thank the author for writing it. However, I sometimes run into the following problem:

In short, I have a function that calls biglasso, and I use the mclapply function from the parallel package to run the lasso on multiple cores with the same X (created by setupX) and a different y for each job.

Here is my function; I think the steps other than the biglasso call are unimportant:

bottleneck <- function(k) {
  if (k > 1) {
    y <-sample(realy)
  } else {
    y <-realy
  }
  tryCatch( {
    time <- system.time(  
      fit <- biglasso(X, y, family='gaussian', penalty.factor = penalty, 
                    lambda.min = 0.0001, nlambda = 250)
    )
    print(time)
    # do the "maxfrac" operation
    coef = as.matrix(fit$beta)
    coef[coef < 0] <- 0.0
    coef = round(coef, digits = 6)
    colnames(coef) = fit$lambda
    if (k == 1) {
      write.table(coef, file=paste(outfname, ".coef", sep=""), sep="\t")
    }
    norm_beta = scale(coef, center=FALSE, scale=colSums(coef))
    beta_max = apply(norm_beta, 1, max)
  }, error = function(e) {
    return(rep(-1, dim(X)[2]+1)) } 
  )
  return(beta_max)
}

Then I use beta_max_all <- mclapply(1:(batch_size+1), bottleneck, mc.cores=n_cores, mc.cleanup=TRUE) for multi-core jobs.
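For readers trying to reproduce this setup, here is a minimal sketch of the pattern described above: one shared design matrix, a permuted response for every job after the first, and a fit wrapped in tryCatch and dispatched with mclapply. It is not the poster's code. The small dense matrix and the `fit_once` helper (ordinary least squares via `lm.fit`) are hypothetical stand-ins for the big.matrix from setupX and for the biglasso call, so the sketch runs with base R only. Note that it assigns the value of tryCatch. In the function as posted, the handler's return value is discarded, so an ordinary R error would leave beta_max undefined. A segfault is not an R condition, so tryCatch cannot intercept it in either version.

```r
library(parallel)

set.seed(1)
n <- 50
p <- 5
X <- matrix(rnorm(n * p), n, p)  # stand-in for the big.matrix created by setupX
realy <- rnorm(n)                # stand-in for the real response

# Hypothetical stand-in for the biglasso() call: ordinary least squares
# with an intercept, returning p + 1 coefficients.
fit_once <- function(X, y) lm.fit(cbind(1, X), y)$coefficients

bottleneck <- function(k) {
  # k == 1 uses the real response; k > 1 uses a permuted copy
  y <- if (k > 1) sample(realy) else realy
  # The value of tryCatch() is returned: either the fit or the handler's fallback
  tryCatch(
    fit_once(X, y),
    error = function(e) rep(-1, ncol(X) + 1)
  )
}

# mclapply() relies on forking, which is unavailable on Windows
n_cores <- if (.Platform$OS.type == "windows") 1L else 2L
beta_all <- mclapply(1:4, bottleneck, mc.cores = n_cores)
```

Swapping `fit_once` for the real biglasso call turns this into a starting point for a reproducible example; whether it triggers the segfault is not something the sketch can show.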

I'd appreciate it if the author could help me figure out what causes the problem. I cannot see any pattern in when the issue happens; it seems to occur randomly, and sometimes the script runs without any problem. I run my computations on a Sun Grid Engine cluster.

Here is the full error message:

*** caught segfault ***
address 0x30, cause 'memory not mapped'

Traceback:
 1: biglasso(X, y, family = "gaussian", penalty.factor = penalty,     lambda.min = 1e-04, nlambda = 250)
 2: system.time(fit <- biglasso(X, y, family = "gaussian", penalty.factor = penalty,     lambda.min = 1e-04, nlambda = 250))
 3: doTryCatch(return(expr), name, parentenv, handler)
 4: tryCatchOne(expr, names, parentenv, handlers[[1L]])
 5: tryCatchList(expr, classes, parentenv, handlers)
 6: tryCatch({    time <- system.time(fit <- biglasso(X, y, family = "gaussian",         penalty.factor = penalty, lambda.min = 1e-04, nlambda = 250))    print(time)    coef = as.matrix(fit$beta)    coef[coef < 0] <- 0    coef = round(coef, digits = 6)    colnames(coef) = fit$lambda    if (k == 1) {        write.table(coef, file = paste(outfname, ".coef", sep = ""),             sep = "\t")    }    norm_beta = scale(coef, center = FALSE, scale = colSums(coef))    beta_max = apply(norm_beta, 1, max)}, error = function(e) {    return(rep(-1, dim(X)[2] + 1))})
 7: FUN(X[[i]], ...)
 8: lapply(X = S, FUN = FUN, ...)
 9: doTryCatch(return(expr), name, parentenv, handler)
10: tryCatchOne(expr, names, parentenv, handlers[[1L]])
11: tryCatchList(expr, classes, parentenv, handlers)
12: tryCatch(expr, error = function(e) {    call <- conditionCall(e)    if (!is.null(call)) {        if (identical(call[[1L]], quote(doTryCatch)))             call <- sys.call(-4L)        dcall <- deparse(call)[1L]        prefix <- paste("Error in", dcall, ": ")        LONG <- 75L        msg <- conditionMessage(e)        sm <- strsplit(msg, "\n")[[1L]]        w <- 14L + nchar(dcall, type = "w") + nchar(sm[1L], type = "w")        if (is.na(w))             w <- 14L + nchar(dcall, type = "b") + nchar(sm[1L],                 type = "b")        if (w > LONG)             prefix <- paste0(prefix, "\n  ")    }    else prefix <- "Error : "    msg <- paste0(prefix, conditionMessage(e), "\n")    .Internal(seterrmessage(msg[1L]))    if (!silent && identical(getOption("show.error.messages"),         TRUE)) {        cat(msg, file = outFile)        .Internal(printDeferredWarnings())    }    invisible(structure(msg, class = "try-error", condition = e))})
13: try(lapply(X = S, FUN = FUN, ...), silent = TRUE)
14: sendMaster(try(lapply(X = S, FUN = FUN, ...), silent = TRUE))
15: FUN(X[[i]], ...)
16: lapply(seq_len(cores), inner.do)
17: mclapply(1:(batch_size + 1), bottleneck, mc.cores = n_cores,     mc.cleanup = TRUE)
18: system.time(beta_max_all <- mclapply(1:(batch_size + 1), bottleneck,     mc.cores = n_cores, mc.cleanup = TRUE))
An irrecoverable exception occurred. R is aborting now ...

YaohuiZeng commented 6 years ago

@fanzheng10 thanks for reporting this issue, and sincere apologies for the late response. Unfortunately, I can't identify the root cause from what you've provided.

Could you: 1) provide a complete MWE that I can run on my end; 2) list your R environment, OS info, etc.?

I'd be more than happy to debug. Thanks!
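As a rough sketch of how the environment details requested above could be gathered, the following uses only base R. The `env_report` name and the choice of packages to report are assumptions made for illustration, not something the maintainer specified.

```r
# Collect basic environment details to paste into a bug report.
env_report <- list(
  r_version = R.version.string,
  platform  = R.version$platform,
  os        = unname(Sys.info()[c("sysname", "release", "machine")]),
  cores     = parallel::detectCores(),
  # Installed package versions, or NA if a package is not installed
  packages  = vapply(
    c("biglasso", "bigmemory", "parallel"),
    function(pkg) {
      if (requireNamespace(pkg, quietly = TRUE)) {
        as.character(utils::packageVersion(pkg))
      } else {
        NA_character_
      }
    },
    character(1)
  )
)
str(env_report)
```

Calling `sessionInfo()` in the same session gives a fuller listing, including the BLAS/LAPACK libraries in use, and is usually worth pasting alongside.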