carmonalab / STACAS

R package for semi-supervised single-cell data integration
GNU General Public License v3.0

Cholmod error 'problem too large' #8

Closed by LauRich05 3 years ago

LauRich05 commented 3 years ago

Hi there,

I am trying to run STACAS integration on a dataset of ~270,000 cells from 52 samples. I keep running into the error below at the IntegrateData() step. Seurat forums recommend using their new "rpca" integration method to overcome this, but it hasn't performed as well as STACAS on my dataset, and I'm not sure how RPCA would fit into the STACAS workflow. Do you have any workarounds for this error on large datasets? I am running

Error in .cbind2Csp(x, y) :
  Cholmod error 'problem too large' at file ../Core/cholmod_sparse.c, line 89
Calls: ... cbind -> cbind2 -> cbind2 -> cbind2sparse -> .cbind2Csp
Execution halted

mass-a commented 3 years ago

Hello Laura and thanks for the report.

It looks like your machine is running out of memory on this large dataset - how much RAM do you have?

You could try downsampling your dataset to reduce the number of cells (e.g. see https://github.com/satijalab/seurat/issues/1325), which could even help balance out samples of very different sizes. But if you want to use all cells for integration, you may have to run the integration step on a machine with more RAM.
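As a rough illustration of the downsampling idea, here is a minimal base-R sketch that caps the number of cells kept per sample. The helper name `downsample_per_sample`, the `max_cells` cap and the toy sample labels are all made up for this example; they are not part of STACAS or Seurat, and the right cap for a real dataset is a judgment call.

```r
# Hypothetical helper (not part of STACAS or Seurat): keep at most
# `max_cells` randomly chosen cells from each sample.
# `sample_ids` is a character vector with one sample label per cell,
# named by cell barcode (for example, a metadata column of the object).
downsample_per_sample <- function(sample_ids, max_cells, seed = 1) {
  set.seed(seed)  # make the random draw reproducible
  cells_by_sample <- split(names(sample_ids), sample_ids)
  kept <- lapply(cells_by_sample, function(cells) {
    # Samples already at or below the cap are kept whole
    if (length(cells) > max_cells) sample(cells, max_cells) else cells
  })
  unlist(kept, use.names = FALSE)
}

# Toy data: one large sample ("A") and one small sample ("B")
sample_ids <- c(rep("A", 1000), rep("B", 50))
names(sample_ids) <- paste0("cell", seq_along(sample_ids))

keep <- downsample_per_sample(sample_ids, max_cells = 200)
table(sample_ids[keep])  # cells retained per sample
```

In this toy case the cap keeps 200 cells from sample A and all 50 from sample B, which also narrows the size gap between the two. Assuming the data are in a Seurat object `obj`, the resulting barcodes could then be passed to something like `subset(obj, cells = keep)` before running the integration.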

STACAS actually uses reciprocal PCA for integration, so I don't expect 'rpca' would be a solution.

Best, -m