lhe17 / nebula

GNU General Public License v2.0
28 stars 6 forks source link

Error in { : task 15 failed - "objective in x0 returns NA" #24

Closed Raghav1881 closed 1 year ago

Raghav1881 commented 1 year ago

Hello!

Thank you for developing Nebula, it had been working well until I recently split my Seurat object as I wanted to run nebula across clusters. I have split my seurat object across the ident 'subclass_labelDE', and have been getting this error only when I run nebula on each cluster, but works when the object is merged. I was wondering if you're familiar with any reasons this error could occur, as I am not sure where I should look. For context, here is the code I am currently using

suppressWarnings({
  library(Seurat)
  library(goseq)
  library(dplyr)
  library(org.Mm.eg.db)
  library(nebula)
  library(SingleCellExperiment)
  library(future)
})

plan(strategy = multicore(workers = 48))
setwd('/media/raghav1881/data_8tb/hippo_soupx/')
hip.combined <- readRDS('May24_with_peaks.rds')
DefaultAssay(hip.combined) <- 'RNA'

# Make df for corresponding ident (i.e. get all columns for ca3)
#dfgen <- function(nebdf, ident) {
#  cls <- grepl(ident, names(nebdf), ignore.case = T)
#  data <- nebdf[, ..cls, drop = FALSE]
#  data$gene <- nebdf$gene
#  return(data)
#}

# Split Seurat object based on clusters, then run nebula on each cluster
runNeb <- function(seu_obj, ident) {
  seu_list <- SplitObject(seu_obj, split.by = ident)
  neb_list <- list()
  ident_list <- levels(seu_obj)
  n <- 0
  for (k in seu_list) {
    n <- n + 1
    ei <- data.frame(genotype = k$genotype)
    nbd <- list(counts = GetAssayData(k, slot = 'counts'), pred = ei,
                gid = k$genotype)
    df <- model.matrix(~genotype, data = nbd$pred)
    neb <- nebula(count = nbd$counts, id = nbd$gid, pred = df, ncore = 24, 
                  method = 'HL', cpc = 0.05)
    neb_list[[ident_list[[n]]]] <- neb$summary
  }
  return(neb_list)
}

res <- runNeb(hip.combined, 'subclass_labelDE')

Thank you in advance, and I would be more than happy to provide more information. Raghav

lhe17 commented 1 year ago

Hi Raghav,

Thank you for your question.

I'm not quite sure, but it might be because you set random effects id=nbd$gid to be the same as the fixed-effects variable in nbd$pred. They all equal to k$genotype. Here, id should be individual ID, rather than genotype.

Best regards,

Liang

On 6/6/2023 8:57 PM, Raghav Sharma wrote:

Hello!

Thank you for developing Nebula, it had been working well until I recently split my Seurat object as I wanted to run nebula across clusters. I have split my seurat object across the ident 'subclass_labelDE', and have been getting this error only when I run nebula on each cluster, but works when the object is merged. I was wondering if you're familiar with any reasons this error could occur, as I am not sure where I should look. For context, here is the code I am currently using

suppressWarnings({ library(Seurat) library(goseq) library(dplyr) library(org.Mm.eg.db) library(nebula) library(SingleCellExperiment) library(future) })

plan(strategy = multicore(workers = 48)) setwd('/media/raghav1881/data_8tb/hippo_soupx/') hip.combined <- readRDS('May24_with_peaks.rds') DefaultAssay(hip.combined)<- 'RNA'

Make df for corresponding ident (i.e. get all columns for ca3)

dfgen <- function(nebdf, ident) {

cls <- grepl(ident, names(nebdf), ignore.case = T)

data <- nebdf[, ..cls, drop = FALSE]

data$gene <- nebdf$gene

return(data)

}

Split Seurat object based on clusters, then run nebula on each cluster

runNeb <- function(seu_obj,ident) { seu_list <- SplitObject(seu_obj,split.by = ident) neb_list <- list() ident_list <- levels(seu_obj) n <- 0 for (k in seu_list) { n <- n + 1 ei <- data.frame(genotype = k$genotype) nbd <- list(counts = GetAssayData(k,slot = 'counts'),pred = ei, gid = k$genotype) df <- model.matrix(~genotype,data = nbd$pred) neb <- nebula(count = nbd$counts,id = nbd$gid,pred = df,ncore = 24, method = 'HL',cpc = 0.05) neb_list[[ident_list[[n]]]]<- neb$summary } return(neb_list) }

res <- runNeb(hip.combined,'subclass_labelDE')

Thank you in advance, and I would be more than happy to provide more information. Raghav

— Reply to this email directly, view it on GitHub https://github.com/lhe17/nebula/issues/24, or unsubscribe https://github.com/notifications/unsubscribe-auth/AGDISURJGHCINKDCD7TB7NLXJ54TPANCNFSM6AAAAAAY42P7BM. You are receiving this because you are subscribed to this thread.Message ID: @.***>

Raghav1881 commented 1 year ago

Hi Liang,

Thanks for the prompt reply and for catching that mistake. I fixed my code as follows:

suppressWarnings({
  library(Seurat)
  library(goseq)
  library(dplyr)
  library(org.Mm.eg.db)
  library(nebula)
  library(SingleCellExperiment)
  library(future)
})

plan(strategy = multicore(workers = 48))
setwd('/media/raghav1881/data_8tb/hippo_soupx/')
hip.combined <- readRDS('May24_with_peaks.rds')
DefaultAssay(hip.combined) <- 'RNA'

# Split Seurat object based on clusters, then run nebula on each cluster
runNeb <- function(seu_obj, ident) {
  seu_list <- SplitObject(seu_obj, split.by = ident)
  neb_list <- list()
  ident_list <- unique(seu_obj$subclass_labelDE)
  n <- 0
  for (k in seu_list) {
    n <- n + 1
    nbd <- list(counts = GetAssayData(k, slot = 'counts'),
                pred = data.frame(genotype = k$genotype),
                sid = k$sample)
    df <- model.matrix(~genotype, data = nbd$pred)
    neb <- nebula(count = nbd$counts, id = nbd$sid, pred = df, ncore = 24,
                  method = 'HL', cpc = 0.05)
    neb_list[[ident_list[[n]]]] <- neb$summary
  }
  return(neb_list)
}

res <- runNeb(hip.combined, 'subclass_labelDE')

Unfortunately, I am still getting the same error, except that it is task 4 that is failing. I have also provided traceback if this gives any better information.

Error in { : task 4 failed - "objective in x0 returns NA" 7. stop(simpleError(msg, call = expr)) 6. e$fun(obj, substitute(ex), parent.frame(), e$data) 5. (function (obj, ex) { e <- getDoPar() e$fun(obj, substitute(ex), parent.frame(), e$data) ... 4. do.call(`%dopar%`, list(obj, ex), envir = parent.frame()) 3. foreach(i = gid, .combine = "cbind") %dorng% { posv = call_posindy(count, i - 1, nind) if ((posv$mct * mfs) < 3) { ord = 3 ... 2. nebula(count = nbd$counts, id = nbd$sid, pred = df, ncore = 24, method = "HL", cpc = 5e-04) 1. runNeb(hip.combined, "subclass_labelDE")

Do you think this could be an error in regards to the Seurat object itself, I'm thinking some error is caused upon splitting my object based on clusters, as nebula works fine when I extract all data from the original merged object. I'm assuming this is an error being raised by some other dependant package, but I am not too familiar with tracing errors in R.

Thank you,

Raghav

lhe17 commented 1 year ago

Hi Raghav,

I do not think this is an error related to the Seurat object.

I would suggest using tryCatch to pinpoint the cluster that causes the error. I guess some "irregular" cluster probably leads to the problem. By irregular, I mean, for example, a cluster has cells from only one person or all cells in this cluster share the same genotype, etc.

You might also try setting ncore = 1 to see whether this solves your problem.

Best regards,

Liang

On 6/7/2023 7:38 PM, Raghav Sharma wrote:

Hi Liang,

Thanks for the prompt reply and for catching that mistake. I fixed my code as follows:

suppressWarnings({ library(Seurat) library(goseq) library(dplyr) library(org.Mm.eg.db) library(nebula) library(SingleCellExperiment) library(future) })

plan(strategy = multicore(workers = 48)) setwd('/media/raghav1881/data_8tb/hippo_soupx/') hip.combined <- readRDS('May24_with_peaks.rds') DefaultAssay(hip.combined)<- 'RNA'

Split Seurat object based on clusters, then run nebula on each cluster

runNeb <- function(seu_obj,ident) { seu_list <- SplitObject(seu_obj,split.by = ident) neb_list <- list() ident_list <- unique(seu_obj$subclass_labelDE) n <- 0 for (k in seu_list) { n <- n + 1 nbd <- list(counts = GetAssayData(k,slot = 'counts'), pred = data.frame(genotype = k$genotype), sid = k$sample) df <- model.matrix(~genotype,data = nbd$pred) neb <- nebula(count = nbd$counts,id = nbd$sid,pred = df,ncore = 24, method = 'HL',cpc = 0.05) neb_list[[ident_list[[n]]]]<- neb$summary } return(neb_list) }

res <- runNeb(hip.combined,'subclass_labelDE')

Unfortunately, I am still getting the same error, except that it is task 4 that is failing. I have also provided traceback if this gives any better information.

Error in { : task 4 failed - "objective in x0 returns NA" |7. stop(simpleError(msg, call = expr)) 6. e$fun(obj, substitute(ex), parent.frame(), e$data) 5. (function (obj, ex) { e <- getDoPar() e$fun(obj, substitute(ex), parent.frame(), e$data) ... 4. do.call(%dopar%, list(obj, ex), envir = parent.frame()) 3. foreach(i = gid, .combine = "cbind") %dorng% { posv = call_posindy(count, i - 1, nind) if ((posv$mct * mfs) < 3) { ord = 3 ... 2. nebula(count = nbd$counts, id = nbd$sid, pred = df, ncore = 24, method = "HL", cpc = 5e-04) 1. runNeb(hip.combined, "subclass_labelDE") | Do you think this could be an error in regards to the Seurat object itself, I'm thinking some error is caused upon splitting my object based on clusters, as nebula works fine when I extract all data from the original merged object. I'm assuming this is an error being raised by some other dependant package, but I am not too familiar with tracing errors in R.

Thank you,

Raghav

— Reply to this email directly, view it on GitHub https://github.com/lhe17/nebula/issues/24#issuecomment-1581249011, or unsubscribe https://github.com/notifications/unsubscribe-auth/AGDISUQ2VPPO4I2DGLPIC3TXKC4BFANCNFSM6AAAAAAY42P7BM. You are receiving this because you commented.Message ID: @.***>

Raghav1881 commented 1 year ago

Hi Liang,

I ran tryCatch and majority of my clusters still failed with the exception of five clusters which each had over 5000 cells. I had also removed some clusters that don't have cells from all samples. Is the error in objective referring to error = function(e) { ref = nlminb( start = c(para, 1, cell_init), objective = ptmg_ll ... ?

Here is the output from tryCatch:

Output ```R Error occurred for ident:In_Lamp5 < simpleError in { doFuture::registerDoFuture() lapply( seq_along(...future.x_ii), FUN = function(jj) { ...future.x_jj <- ...future.x_ii[[jj]] { { NULL i <- NULL } .doRNG.stream <- NULL } ...future.env <- environment() local({ for (name in names(...future.x_jj)) { assign(name, ...future.x_jj[[name]], envir = ...future.env, inherits = FALSE) } }) tryCatch({ { rngtools::RNGseed(.doRNG.stream) } { posv = call_posindy(count, i - 1, nind) if ((posv$mct * mfs) < 3) { ord = 3 } else { ord = 1 } para = c(log(posv$mct) - moffset, rep(0, nb - 1)) re_t = tryCatch({ ref = switch( opt, lbfgs = lbfgs( c(para, 1, cell_init), ptmg_ll_der, posindy = posv$posindy, X = pred, offset = offset, Y = posv$Y, n_one = posv$n_onetwo, ytwo = posv$ytwo, fid = fid, cumsumy = cumsumy[i,], posind = posind[[i]], nb = nb, k = k, nind = nind, lower = lower, upper = upper, control = list(ftol_abs = eps) ), trust = trust( objfun = ptmg_ll_der_hes3, c(para, 0, 0), 2, 100, posindy = posv$posindy, X = pred, offset = offset, Y = posv$Y, n_one = posv$n_onetwo, ytwo = posv$ytwo, fid = fid, cumsumy = cumsumy[i,], posind = posind[[i]], nb = nb, k = k, nind = nind ), stop("The argument opt should be lbfgs, or trust.") ) if (opt == "lbfgs") { ref = ref$par } else { ref = ref$argument ref[nb + 1] = median(c(exp(ref[nb + 1]), min[1], max[1])) ref[nb + 2] = median(c(exp(ref[nb + 2]), min[2], max[2])) } c(ref, 1) }, error = function(e) { ref = nlminb( start = c(para, 1, cell_init), objective = ptmg_ll, gradient = ptmg_der, posindy = posv$posindy, X = pred, offset = offset, Y = posv$Y, n_one = posv$n_onetwo, ytwo = posv$ytwo, fam = id, fid = fid, cumsumy = cumsumy[i,], posind = posind[[i]], nb = nb, k = k, nind = nind, lower = lower, upper = upper ) c(ref$par, 0) }) conv = re_t[length(re_t)] fit = 1 lmct = log(posv$mct) betae = c(lmct - moffset, rep(0, nb - 1)) vare = re_t[(nb + 1):(nb + 2)] para = vare cv2p <- ifelse(ncell > 0, get_cv(offset, pred, re_t[1:nb], cell_ind, ncell, nind), cv2) gni = mfs * vare[2] if (method == "LN") { if (model == "NBGMM") { if (gni < cutoff_cell) { re_t = tryCatch({ bobyqa( para, pql_ll, reml = reml, eps = eps, ord = ord, betas = betae, intcol = intcol, posindy = posv$posindy, X = pred, offset = offset, Y = posv$Y, n_one = posv$n_onetwo, ytwo = posv$ytwo, fid = fid, cumsumy = cumsumy[i,], posind = posind[[i]], nb = nb, k = k, nind = nind, lower = min, upper = max ) }, error = function(e) { bobyqa( para, pql_ll, reml = reml, eps = eps, ord = 1, betas = betae, intcol = intcol, posindy = posv$posindy, X = pred, offset = offset, Y = posv$Y, n_one = posv$n_onetwo, ytwo = posv$ytwo, fid = fid, cumsumy = cumsumy[i,], posind = posind[[i]], nb = nb, k = k, nind = nind, lower = min, upper = max ) }) vare = re_t$par[1:2] fit = 2 } else { kappa_obs = gni / (1 + cv2p) if ((kappa_obs < 20) | ((kappa_obs < kappa) & (vare[1] < (8 / kappa_obs)))) { varsig = tryCatch({ nlminb( para[1], pql_gamma_ll, gamma = para[2], betas = betae, intcol = intcol, reml = reml, eps = eps, ord = ord, posindy = posv$posindy, X = pred, offset = offset, Y = posv$Y, n_one = posv$n_onetwo, ytwo = posv$ytwo, fid = fid, cumsumy = cumsumy[i,], posind = posind[[i]], nb = nb, k = k, nind = nind, lower = min[1], upper = max[1] ) }, error = function(e) { nlminb( para[1], pql_gamma_ll, gamma = para[2], betas = betae, intcol = intcol, reml = reml, eps = eps, ord = 1, posindy = posv$posindy, X = pred, offset = offset, Y = posv$Y, n_one = posv$n_onetwo, ytwo = posv$ytwo, fid = fid, cumsumy = cumsumy[i,], posind = posind[[i]], nb = nb, k = k, nind = nind, lower = min[1], upper = max[1] ) }) vare[1] = varsig$par[1] fit = 3 } } } else { if (gni < cutoff_cell) { re_t = bobyqa( para, pql_nbm_ll, reml = reml, eps = eps, ord = 1, betas = betae, intcol = intcol, posindy = posv$posindy, X = pred, offset = offset, Y = posv$Y, n_one = posv$n_onetwo, ytwo = posv$ytwo, fid = fid, cumsumy = cumsumy[i,], posind = posind[[i]], nb = nb, k = k, nind = nind, lower = min, upper = max ) vare = re_t$par[1:2] fit = 4 } else { varsig = nlminb( para[1], pql_nbm_gamma_ll, gamma = para[2], betas = betae, intcol = intcol, reml = reml, eps = eps, ord = 1, posindy = posv$posindy, X = pred, offset = offset, Y = posv$Y, n_one = posv$n_onetwo, ytwo = posv$ytwo, fid = fid, cumsumy = cumsumy[i,], posind = posind[[i]], nb = nb, k = k, nind = nind, lower = min[1], upper = max[1] ) vare[1] = varsig$par[1] fit = 6 } } } else { if (model == "NBGMM") { re_t = tryCatch({ bobyqa( para, pql_ll, reml = reml, eps = eps, ord = ord, betas = betae, intcol = intcol, posindy = posv$posindy, X = pred, offset = offset, Y = posv$Y, n_one = posv$n_onetwo, ytwo = posv$ytwo, fid = fid, cumsumy = cumsumy[i,], posind = posind[[i]], nb = nb, k = k, nind = nind, lower = min, upper = max ) }, error = function(e) { bobyqa( para, pql_ll, reml = reml, eps = eps, ord = 1, betas = betae, intcol = intcol, posindy = posv$posindy, X = pred, offset = offset, Y = posv$Y, n_one = posv$n_onetwo, ytwo = posv$ytwo, fid = fid, cumsumy = cumsumy[i,], posind = posind[[i]], nb = nb, k = k, nind = nind, lower = min, upper = max ) }) fit = 2 } else { re_t = bobyqa( para, pql_nbm_ll, reml = reml, eps = eps, ord = 1, betas = betae, intcol = intcol, posindy = posv$posindy, X = pred, offset = offset, Y = posv$Y, n_one = posv$n_onetwo, ytwo = posv$ytwo, fid = fid, cumsumy = cumsumy[i,], posind = posind[[i]], nb = nb, k = k, nind = nind, lower = min, upper = max ) fit = 4 } vare = re_t$par[1:2] } betae[intcol] = betae[intcol] - vare[1] / 2 if (model == "NBGMM") { repml = opt_pml( pred, offset, posv$Y, fid - 1, cumsumy[i,], posind[[i]] - 1, posv$posindy, nb, nind, k, betae, vare, reml, 1e-06, 1 ) } else { repml = opt_pml_nbm( pred, offset, posv$Y, fid - 1, cumsumy[i,], posind[[i]] - 1, posv$posindy, nb, nind, k, betae, vare, reml, 1e-06, 1 ) } if (is.nan(repml$loglik)) { conv = -30 } else { if (repml$iter == 50) { conv = -20 } else { if (repml$damp == 11) { conv = -10 } else { if (repml$damp == 12) { conv = -40 } } } } restemp = c(repml$beta, vare[1], 1 / vare[2], diag(repml$var), conv, fit) if (covariance == TRUE) { restemp = c(restemp, repml$var[lower.tri(repml$var, diag = TRUE)]) } if (output_re == TRUE) { restemp = c(restemp, repml$logw) } restemp } }, error = identity) } ) }:task 15 failed - "objective in x0 returns NA" > ```
lhe17 commented 1 year ago

Hi Raghav, I've no idea why my previous reply two weeks ago did not show up on this thread.

I can try to replicate and figure out this issue if you could share the count matrix of any cluster that failed along with the design matrix and the individual IDs.

Best regards, Liang

On Fri, Jun 9, 2023 at 10:07 PM Raghav Sharma @.***> wrote:

Hi Liang,

I ran tryCatch and majority of my clusters still failed with the exception of five clusters which each had over 5000 cells. I had also removed some clusters that don't have cells from all samples. Here is the output from tryCatch: Output

Error occurred for ident:In_Lamp5< simpleError in { doFuture::registerDoFuture() lapply( seq_along(...future.x_ii), FUN = function(jj) { ...future.x_jj <- ...future.x_ii[[jj]] { { NULL i <- NULL } .doRNG.stream <- NULL } ...future.env <- environment() local({ for (name in names(...future.x_jj)) { assign(name, ...future.x_jj[[name]], envir = ...future.env, inherits = FALSE) } }) tryCatch({ { rngtools::RNGseed(.doRNG.stream) } { posv = call_posindy(count, i - 1, nind) if ((posv$mct mfs) < 3) { ord = 3 } else { ord = 1 } para = c(log(posv$mct) - moffset, rep(0, nb - 1)) re_t = tryCatch({ ref = switch( opt, lbfgs = lbfgs( c(para, 1, cell_init), ptmg_ll_der, posindy = posv$posindy, X = pred, offset = offset, Y = posv$Y, n_one = posv$n_onetwo, ytwo = posv$ytwo, fid = fid, cumsumy = cumsumy[i,], posind = posind[[i]], nb = nb, k = k, nind = nind, lower = lower, upper = upper, control = list(ftol_abs = eps) ), trust = trust( objfun = ptmg_ll_der_hes3, c(para, 0, 0), 2, 100, posindy = posv$posindy, X = pred, offset = offset, Y = posv$Y, n_one = posv$n_onetwo, ytwo = posv$ytwo, fid = fid, cumsumy = cumsumy[i,], posind = posind[[i]], nb = nb, k = k, nind = nind ), stop("The argument opt should be lbfgs, or trust.") ) if (opt == "lbfgs") { ref = ref$par } else { ref = ref$argument ref[nb + 1] = median(c(exp(ref[nb + 1]), min[1], max[1])) ref[nb + 2] = median(c(exp(ref[nb + 2]), min[2], max[2])) } c(ref, 1) }, error = function(e) { ref = nlminb( start = c(para, 1, cell_init), objective = ptmg_ll, gradient = ptmg_der, posindy = posv$posindy, X = pred, offset = offset, Y = posv$Y, n_one = posv$n_onetwo, ytwo = posv$ytwo, fam = id, fid = fid, cumsumy = cumsumy[i,], posind = posind[[i]], nb = nb, k = k, nind = nind, lower = lower, upper = upper ) c(ref$par, 0) }) conv = re_t[length(re_t)] fit = 1 lmct = log(posv$mct) betae = c(lmct - moffset, rep(0, nb - 1)) vare = re_t[(nb + 1):(nb + 2)] para = vare cv2p <- ifelse(ncell > 0, get_cv(offset, pred, re_t[1:nb], cell_ind, ncell, nind), cv2) gni = mfs vare[2] if (method == "LN") { if (model == "NBGMM") { if (gni < cutoff_cell) { re_t = tryCatch({ bobyqa( para, pql_ll, reml = reml, eps = eps, ord = ord, betas = betae, intcol = intcol, posindy = posv$posindy, X = pred, offset = offset, Y = posv$Y, n_one = posv$n_onetwo, ytwo = posv$ytwo, fid = fid, cumsumy = cumsumy[i,], posind = posind[[i]], nb = nb, k = k, nind = nind, lower = min, upper = max ) }, error = function(e) { bobyqa( para, pql_ll, reml = reml, eps = eps, ord = 1, betas = betae, intcol = intcol, posindy = posv$posindy, X = pred, offset = offset, Y = posv$Y, n_one = posv$n_onetwo, ytwo = posv$ytwo, fid = fid, cumsumy = cumsumy[i,], posind = posind[[i]], nb = nb, k = k, nind = nind, lower = min, upper = max ) }) vare = re_t$par[1:2] fit = 2 } else { kappa_obs = gni / (1 + cv2p) if ((kappa_obs < 20) | ((kappa_obs < kappa) & (vare[1] < (8 / kappa_obs)))) { varsig = tryCatch({ nlminb( para[1], pql_gamma_ll, gamma = para[2], betas = betae, intcol = intcol, reml = reml, eps = eps, ord = ord, posindy = posv$posindy, X = pred, offset = offset, Y = posv$Y, n_one = posv$n_onetwo, ytwo = posv$ytwo, fid = fid, cumsumy = cumsumy[i,], posind = posind[[i]], nb = nb, k = k, nind = nind, lower = min[1], upper = max[1] ) }, error = function(e) { nlminb( para[1], pql_gamma_ll, gamma = para[2], betas = betae, intcol = intcol, reml = reml, eps = eps, ord = 1, posindy = posv$posindy, X = pred, offset = offset, Y = posv$Y, n_one = posv$n_onetwo, ytwo = posv$ytwo, fid = fid, cumsumy = cumsumy[i,], posind = posind[[i]], nb = nb, k = k, nind = nind, lower = min[1], upper = max[1] ) }) vare[1] = varsig$par[1] fit = 3 } } } else { if (gni < cutoff_cell) { re_t = bobyqa( para, pql_nbm_ll, reml = reml, eps = eps, ord = 1, betas = betae, intcol = intcol, posindy = posv$posindy, X = pred, offset = offset, Y = posv$Y, n_one = posv$n_onetwo, ytwo = posv$ytwo, fid = fid, cumsumy = cumsumy[i,], posind = posind[[i]], nb = nb, k = k, nind = nind, lower = min, upper = max ) vare = re_t$par[1:2] fit = 4 } else { varsig = nlminb( para[1], pql_nbm_gamma_ll, gamma = para[2], betas = betae, intcol = intcol, reml = reml, eps = eps, ord = 1, posindy = posv$posindy, X = pred, offset = offset, Y = posv$Y, n_one = posv$n_onetwo, ytwo = posv$ytwo, fid = fid, cumsumy = cumsumy[i,], posind = posind[[i]], nb = nb, k = k, nind = nind, lower = min[1], upper = max[1] ) vare[1] = varsig$par[1] fit = 6 } } } else { if (model == "NBGMM") { re_t = tryCatch({ bobyqa( para, pql_ll, reml = reml, eps = eps, ord = ord, betas = betae, intcol = intcol, posindy = posv$posindy, X = pred, offset = offset, Y = posv$Y, n_one = posv$n_onetwo, ytwo = posv$ytwo, fid = fid, cumsumy = cumsumy[i,], posind = posind[[i]], nb = nb, k = k, nind = nind, lower = min, upper = max ) }, error = function(e) { bobyqa( para, pql_ll, reml = reml, eps = eps, ord = 1, betas = betae, intcol = intcol, posindy = posv$posindy, X = pred, offset = offset, Y = posv$Y, n_one = posv$n_onetwo, ytwo = posv$ytwo, fid = fid, cumsumy = cumsumy[i,], posind = posind[[i]], nb = nb, k = k, nind = nind, lower = min, upper = max ) }) fit = 2 } else { re_t = bobyqa( para, pql_nbm_ll, reml = reml, eps = eps, ord = 1, betas = betae, intcol = intcol, posindy = posv$posindy, X = pred, offset = offset, Y = posv$Y, n_one = posv$n_onetwo, ytwo = posv$ytwo, fid = fid, cumsumy = cumsumy[i,], posind = posind[[i]], nb = nb, k = k, nind = nind, lower = min, upper = max ) fit = 4 } vare = re_t$par[1:2] } betae[intcol] = betae[intcol] - vare[1] / 2 if (model == "NBGMM") { repml = opt_pml( pred, offset, posv$Y, fid - 1, cumsumy[i,], posind[[i]] - 1, posv$posindy, nb, nind, k, betae, vare, reml, 1e-06, 1 ) } else { repml = opt_pml_nbm( pred, offset, posv$Y, fid - 1, cumsumy[i,], posind[[i]] - 1, posv$posindy, nb, nind, k, betae, vare, reml, 1e-06, 1 ) } if (is.nan(repml$loglik)) { conv = -30 } else { if (repml$iter == 50) { conv = -20 } else { if (repml$damp == 11) { conv = -10 } else { if (repml$damp == 12) { conv = -40 } } } } restemp = c(repml$beta, vare[1], 1 / vare[2], diag(repml$var), conv, fit) if (covariance == TRUE) { restemp = c(restemp, repml$var[lower.tri(repml$var, diag = TRUE)]) } if (output_re == TRUE) { restemp = c(restemp, repml$logw) } restemp } }, error = identity) } ) }:task 15 failed - "objective in x0 returns NA" >

— Reply to this email directly, view it on GitHub https://github.com/lhe17/nebula/issues/24#issuecomment-1585074249, or unsubscribe https://github.com/notifications/unsubscribe-auth/AGDISUWGWPVKLYDALQWCZ33XKN67TANCNFSM6AAAAAAY42P7BM . You are receiving this because you commented.Message ID: @.***>

Raghav1881 commented 1 year ago

Hi Liang, No worries, I'm attaching a sample rds that has data organized the way you presented within the vignette. The model matrix was generated as: df <- model.matrix(~genotype, data = neb_lamp5$pred) and my function for nebula was: neb <- nebula(count = neb_lamp5$counts, id = neb_lamp5$sid, pred = df, ncore = 40, method = 'HL', cpc = 0.05, covariance = T)

RDS

Thanks again, and let me know if you require any other information. Raghav

lhe17 commented 1 year ago

Hi Raghav,

Thank you for sharing your data. I replicated the error you reported. I find that the error arises because the elements in your count matrix are not counts, but real numbers. Rounding to the integers solves the problem. neb <- nebula(count = round(neb_lamp5$counts), id = neb_lamp5$sid, pred = df, ncore = 40, method = 'HL', cpc = 0.05, covariance = T)

I'm just curious. Do you really need 40 cores for this data set? I got the results within around one minute with only two cores.

Best regards, Liang

On Mon, Jun 26, 2023 at 4:03 PM Raghav Sharma @.***> wrote:

Hi Liang, No worries, I'm attaching a sample rds that has data organized the way you presented within the vignette. The model matrix was generated as: df <- model.matrix(~genotype, data = neb_lamp5$pred and my function for nebula was: neb <- nebula(count = neb_lamp5$counts, id = neb_lamp5$sid, pred = df, ncore = 40, method = 'HL', cpc = 0.05, covariance = T)

RDS https://github.com/lhe17/nebula/files/11870128/nebtest.tar.gz

Thanks again, and let me know if you require any other information. Raghav

— Reply to this email directly, view it on GitHub https://github.com/lhe17/nebula/issues/24#issuecomment-1607562055, or unsubscribe https://github.com/notifications/unsubscribe-auth/AGDISUTB754X64NXEGIBO5DXNGJDLANCNFSM6AAAAAAY42P7BM . You are receiving this because you commented.Message ID: @.***>

Raghav1881 commented 1 year ago

Hi Liang,

I'm surprised my original counts matrix has decimal values, I would've never even consider checking this, thank you.

The ncore = 40 was from the initial analysis across genotypes, and I just quickly copy and pasted my code to use in this small subset.

Also I have made a function in R for Seurat and SingleCellExperiment objects to convert them into the list format you use for nebula. Do you have a development branch for which I could make a pull request (I would be happy to add a section in the vignette as well).

Thank you again for all of your help, Raghav

matanhofree commented 8 months ago

I am getting a similar error when running on count data. I tried rounding which does not help.

I am running the following (v1.5.1): res1 <- nebula(round(data_g$count),data_g$id,pred=data_g$pred,offset=round(data_g$offset),ncore = 20,reml=1,model='NBLMM')

I also tried without offset.

This is the error: Error in { : task 6976 failed - "objective in x0 returns NA" In addition: Warning message: In nlminb(para[1], pql_nbm_gamma_ll, gamma = para[2], betas = betae, : NA/NaN function evaluation

lhe17 commented 8 months ago

Hi Matan,

Thank you for reporting this error. I think this is a different error, probably related to 'reml=1'. You might try 'reml=0' and see whether the error vanishes. I will appreciate it very much if you could share your data_g object with me so that I can look into this issue.

Best regards, Liang

On Tue, Jan 9, 2024 at 5:02 AM Matan Hofree @.***> wrote:

I am getting a similar error when running on count data. I tried rounding which does not help.

I am running the following (v1.5.1): res1 <- nebula(round(data_g$count),data_g$id,pred=data_g$pred,offset=round(data_g$offset),ncore = 20,reml=1,model='NBLMM')

I also tried without offset.

This is the error: Error in { : task 6976 failed - "objective in x0 returns NA" In addition: Warning message: In nlminb(para[1], pql_nbm_gamma_ll, gamma = para[2], betas = betae, : NA/NaN function evaluation

— Reply to this email directly, view it on GitHub https://github.com/lhe17/nebula/issues/24#issuecomment-1882762487, or unsubscribe https://github.com/notifications/unsubscribe-auth/AGDISUWHIL4OMKYCAORHGQTYNUIUDAVCNFSM6AAAAAAY42P7BOVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTQOBSG43DENBYG4 . You are receiving this because you commented.Message ID: @.***>

matanhofree commented 8 months ago

Hi Liang,

Thank you for the prompt reply. The issue persists with reml=0. I am attaching an rds of my data_g object. error_data_g_t.rds.gz

I only have 3 samples in this test, hence I thought setting reml=1 was recommended.

Best, Matan

lhe17 commented 8 months ago

Thank you very much, Matan, for sharing your data. Yes, I replicated the error. The problem seems related to the input sparse count matrix, in which not every zero element is omitted as usual. I will try to upload a new version in a few days to resolve it.

Nevertheless, you should not rely on the results from NEBULA with only three samples and it is a good idea to resort to those methods dedicated to small samples as mentioned in the manual. You are right that using reml=1 helps when you have many variables compared to the sample size, but not when the sample size itself is very small.

Best regards, Liang

On Wed, Jan 10, 2024 at 3:39 AM Matan Hofree @.***> wrote:

Hi Liang,

Thank you for the prompt reply. The issue persists with reml=0. I am attaching an rds of my data_g object. error_data_g_t.rds.gz https://github.com/lhe17/nebula/files/13885128/error_data_g_t.rds.gz

I only have 3 samples in this test, hence I thought setting reml=0 was recommended.

Best, Matan

— Reply to this email directly, view it on GitHub https://github.com/lhe17/nebula/issues/24#issuecomment-1884407018, or unsubscribe https://github.com/notifications/unsubscribe-auth/AGDISUT6TSLHITSYY5FQKITYNZHTZAVCNFSM6AAAAAAY42P7BOVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTQOBUGQYDOMBRHA . You are receiving this because you commented.Message ID: @.***>