welch-lab / liger

R package for integrating and analyzing multiple single-cell datasets
GNU General Public License v3.0

online_iNMF code check #245

Closed niehu2018 closed 2 years ago

niehu2018 commented 2 years ago

Source: https://github.com/welch-lab/liger/blob/57b9ffa5a720f7c028850dea493c1da6657a3a8c/R/rliger.R line 1629:

processed = !is.null(X_new[[i]]@scale.data)

Should this line instead be:

processed = !is.null(object@scale.data[[i]])

When I pass a liger object containing multiple datasets, online_iNMF fails with a "subscript out of bounds" error.

cgao90 commented 2 years ago

Hi Niehu,

To help further investigate, could you share the code snippet associated with this error message?

Best, Chao

niehu2018 commented 2 years ago

Here are the code and log

code:
library(rliger)
stim = readRDS("pbmcs_stim.RDS")
ctrl = readRDS("pbmcs_ctrl.RDS")
ctrl = createLiger(list(a = ctrl, b = stim), remove.missing = F) # this may not make any sense, I just want to simulate my situation
stim = createLiger(list(stim = stim), remove.missing = F)

stim = normalize(stim)
stim = selectGenes(stim, var.thresh = 0.1, do.plot = F)
stim = scaleNotCenter(stim)
stim = online_iNMF(stim, k = 20, miniBatch_size = 5000, max.epochs = 5)
stim = quantile_norm(stim)
stim = runUMAP(stim)

ctrl = normalize(ctrl)
ctrl@var.genes = stim@var.genes
ctrl = scaleNotCenter(ctrl)
comb = online_iNMF(stim, X_new = list(ctrl = ctrl, stim = stim), k = 40, max.epochs = 1)

log:
New dataset 1 already preprocessed.
New dataset 2 already preprocessed.
Error in X_new[[i]] : subscript out of bounds

The pbmcs_stim.RDS and pbmcs_ctrl.RDS were downloaded from https://github.com/welch-lab/liger
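For readers unfamiliar with the message in the log, here is a minimal base-R sketch of how "subscript out of bounds" arises. It does not call rliger. The two-element list and the loop over three indices are assumptions chosen to mirror the log (two datasets reported, then a failure), not a confirmed trace of what online_iNMF does internally.

```r
# Illustrative stand-in for X_new: a list with exactly two entries.
x_new <- list(ctrl = "ctrl object", stim = "stim object")

# Assumption for illustration: the loop expects three datasets.
n_datasets <- 3

msgs <- character(0)
for (i in seq_len(n_datasets)) {
  res <- tryCatch({
    x_new[[i]]  # fails once i exceeds length(x_new)
    sprintf("New dataset %d found.", i)
  }, error = function(e) conditionMessage(e))
  msgs <- c(msgs, res)
}

print(msgs)
```

The first two iterations succeed and the third captures the error message, because `[[` on a list raises an error for an index past the end rather than returning NULL.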

cgao90 commented 2 years ago

Hi, the error here is caused by the stim object being passed as both the 1st and the 3rd input at the same time. This confuses the algorithm and makes it think stim contains twice as many cells as it actually does. The following should work:

ctrl = createLiger(list(b = ctrl_data), remove.missing = F) 
stim2 = createLiger(list(stim2 = stim_data), remove.missing = F) 
#normalization, assign variable genes, scaling, ...
comb = online_iNMF(stim, X_new = list(ctrl = ctrl, stim2 = stim2), k = 20, max.epochs = 1)
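To catch this kind of reuse before calling online_iNMF, one option is a small pre-flight check on cell barcodes. The sketch below is plain base R and is not part of rliger: the helper name find_overlapping_datasets is hypothetical, and it assumes you can obtain the cell barcodes of each dataset as character vectors (for example, the column names of the raw count matrices).

```r
# Hypothetical helper (not an rliger function): returns the names of new
# datasets that share at least one cell barcode with the existing data.
find_overlapping_datasets <- function(existing_cells, new_cells) {
  overlaps <- vapply(
    new_cells,
    function(cells) any(cells %in% existing_cells),
    logical(1)
  )
  names(new_cells)[overlaps]
}

# Toy barcodes standing in for real datasets (assumed values).
existing <- c("stim_cell1", "stim_cell2")
new_sets <- list(
  ctrl = c("ctrl_cell1", "ctrl_cell2"),
  stim = c("stim_cell1", "stim_cell2")  # same cells as the existing object
)

overlapping <- find_overlapping_datasets(existing, new_sets)
print(overlapping)
```

If the result is non-empty, the listed datasets are already part of the existing object and should be dropped from X_new or replaced with a separately created object, as in the stim2 example above.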