Open julvi opened 3 years ago
@julvi I am encountering the exact issue, how did you solve it?
I have also encountered this problem. Have you solved your problem? @julvi
I am currently facing this problem as well. The expression matrix for my scRNAseq reference is 33694 x 92385. Is there any workaround for creating the ExpressionSet object required to run MuSiC?
Same problem here. The SingleCellExperiment package handles sparse matrices, but I am not sure it is supported by MuSiC.
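For anyone who wants to try the SingleCellExperiment route, here is a minimal sketch of storing a sparse matrix in one without densifying it. The toy matrix and the cellType column are made up for illustration, and nothing in this thread confirms which MuSiC versions (if any) accept this object as input.

```r
library(Matrix)                # provides dgCMatrix
library(SingleCellExperiment)  # Bioconductor package

# Toy stand-in for a real reference: 4 genes x 6 cells, stored sparsely.
toy_dgc <- sparseMatrix(
  i = c(1, 2, 4, 3, 1, 2),
  j = c(1, 2, 3, 4, 5, 6),
  x = c(5, 1, 2, 7, 3, 4),
  dims = c(4, 6),
  dimnames = list(paste0("gene", 1:4), paste0("cell", 1:6))
)

# Hypothetical per-cell metadata; row names match the matrix column names.
cell_info <- S4Vectors::DataFrame(
  cellType = c("A", "A", "B", "B", "C", "C"),
  row.names = colnames(toy_dgc)
)

sce <- SingleCellExperiment(
  assays = list(counts = toy_dgc),
  colData = cell_info
)

# The counts assay is kept in its sparse representation.
is(counts(sce), "dgCMatrix")
```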
Same problem. Any solutions?
Hi All,
Just saw this while passing by. I deal with it this way (by converting part by part and then stitching the parts together). It's not the most optimized piece of code, but it does the job.
## x is the large sparse matrix (dgCMatrix)
## ncol_break is the number of columns in each small dense matrix you make
## before combining, so that no single conversion errors out due to the size of the original matrix
dGC_to_matrix <- function(x, ncol_break = 49999) {
  total_cols <- ncol(x) ## Total columns in the dgCMatrix
  ## First column of each part: 1, 1 + ncol_break, 1 + 2 * ncol_break, ...
  starts <- seq(1, total_cols, by = ncol_break)
  ## Last column of each part, capped at the total number of columns.
  ## A matrix with fewer than ncol_break columns simply yields one part.
  ends <- pmin(starts + ncol_break - 1, total_cols)
  matrix_list <- vector("list", length(starts)) ## one slot per part matrix
  for (i in seq_along(starts)) {
    message("part_number:", i, ";cols-", starts[i], ":", ends[i])
    matrix_list[[i]] <- as.matrix(x[, starts[i]:ends[i], drop = FALSE])
  }
  return(do.call(cbind, matrix_list)) ## cbind the part matrices column-wise
}
Example:
full_mtx <- dGC_to_matrix(full_dgc, 49999)
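Once a dense matrix is available, wrapping it in an ExpressionSet could look roughly like the sketch below. It uses a toy matrix small enough to densify directly, and the metadata columns (cellType, sampleID) are hypothetical placeholders for whatever annotation your reference has.

```r
library(Matrix)   # provides dgCMatrix
library(Biobase)  # Bioconductor; provides ExpressionSet

# Toy stand-in for the real reference: 4 genes x 6 cells, stored sparsely.
toy_dgc <- sparseMatrix(
  i = c(1, 2, 4, 3, 1, 2),
  j = c(1, 2, 3, 4, 5, 6),
  x = c(5, 1, 2, 7, 3, 4),
  dims = c(4, 6),
  dimnames = list(paste0("gene", 1:4), paste0("cell", 1:6))
)

# Small enough to convert in one call; a huge matrix would need the
# part-by-part conversion instead.
toy_mtx <- as.matrix(toy_dgc)

# Hypothetical per-cell metadata; row names must match the matrix column names.
pheno <- data.frame(
  cellType = c("A", "A", "B", "B", "C", "C"),
  sampleID = c("s1", "s2", "s1", "s2", "s1", "s2"),
  row.names = colnames(toy_mtx)
)

toy_eset <- ExpressionSet(
  assayData = toy_mtx,
  phenoData = AnnotatedDataFrame(pheno)
)

dim(exprs(toy_eset))
```

Keep in mind that the dense matrix still has to fit in RAM, so this only moves the problem from the conversion step to memory.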
Since MuSiC2 still takes an ExpressionSet as input, and ExpressionSet does not accept a sparse dgCMatrix, is there any other way to run MuSiC2 with sparse matrices?
Hi @xuranw,
The expression matrix of my reference scRNAseq dataset is huge (27804 genes x 118535 cells) and loads in R as a dgCMatrix object. Unfortunately, the ExpressionSet function cannot handle the dgCMatrix class:
And the as.matrix function cannot convert the dgCMatrix object into a normal matrix:
Do you have a workaround to run MuSiC on huge expression matrices?
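For a sense of scale, here is a quick back-of-the-envelope check of what a dense copy of this matrix would involve. It assumes the conversion fails because the number of entries exceeds what a 32-bit integer can index, which the thread does not confirm, and that values are stored as 8-byte doubles.

```r
n_genes <- 27804
n_cells <- 118535

# Plain numeric literals are doubles in R, so this product does not overflow.
n_elements <- n_genes * n_cells

# TRUE: the matrix has more entries than a 32-bit integer can count.
n_elements > .Machine$integer.max

# Approximate size of a dense double-precision copy, in GiB.
dense_gib <- n_elements * 8 / 1024^3
round(dense_gib, 1)
```

Even if the conversion succeeds part by part, the resulting dense matrix needs that much RAM on its own.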