meichendong / SCDC

SCDC
42 stars 9 forks source link

Error when using SCDC_prop #37

Open doliv071 opened 1 year ago

doliv071 commented 1 year ago

Hi, I'm getting an error when attempting to perform SCDC with batches using SCDC_prop. The exact error I'm getting is:

Error in sc.basis$sigma[commongenes, ct.sub] : subscript out of bounds

I've tracked the error to SCDC_basis failing to assign row names to the sigma matrix. so that commongenes cannot be used to subset the matrix.

I've managed to work around the error by reassigning rownames from the mean.mat from which sigma is derived via

rownames(sigma) <- rownames(mean.mat)

(I checked on my local dataset that rownames(mean.mat) are identical to the names of the returned vector from the sapply function that builds the sigma object)

doliv071 commented 1 year ago

Clarification, I think that this issue only effects datasets where the celltype is only available in one subject.

In the code that generates the sigma matrix, if dims of y is null then res = rep(0, length(y)) which does not have names, so when sapply, attempts to simplify the list to a matrix, if the very first id does not have replicated subjects it gets no names and the resulting matrix also has no names...

    sigma <- sapply(unique(mean.id[, 1]), function(id) {
        y = mean.mat[, mean.id[, 1] %in% id]
        if (is.null(dim(y))) {
            res = rep(0, length(y))
            message("Warning: the cell type [", id, "] is only available in at most 1 subject!")
        } else {
            res = apply(y, 1, var, na.rm = TRUE)
        }
        return(res)
    })

It is an interesting edge case and quite reproducible. the follow code will generate a matrix with rownames:

sapply(1:3, \(x){
    if(x %% 2 == 0){
        res = rep(0, 3)
    } else {
        res = setNames(rep(1, 3), paste0("r", 1:3))
    }
    return(res)
})
# returns:
   [,1] [,2] [,3]
r1    1    0    1
r2    1    0    1
r3    1    0    1

while this code snippet will not:

sapply(1:3, \(x){
    if(x %% 2 != 0){
        res = rep(0, 3)
    } else {
        res = setNames(rep(1, 3), paste0("r", 1:3))
    }
    return(res)
})
# returns:
     [,1] [,2] [,3]
[1,]    0    1    0
[2,]    0    1    0
[3,]    0    1    0