Why do I get "Unable to allocate 121. GiB for an array with shape (26734, 606219) and data type float64" every time I run scib.integration.combat(adata, batch="sample")?

theislab / scib

Benchmarking analysis of data integration tools

MIT License

Hi, there are a few things you can do to reduce the memory footprint. First, float32 is usually more than sufficient for transcriptomics data, and it takes half the memory of float64. Second, we recommend removing unexpressed genes and selecting, e.g., highly variable genes, which also reduces noise in the dataset. Your gene count is quite high; we usually work in the ballpark of 2k-10k genes. Finally, make sure you're using sparse matrices for a reduced memory and storage footprint.
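To put numbers on these suggestions, here is a rough back-of-the-envelope estimate of how much memory a dense array of the reported shape needs. The helper `dense_gib` is a hypothetical name introduced for illustration, and the last line assumes that the 26,734 rows are genes and that 2,000 of them are kept, neither of which is stated in the issue.

```python
# Back-of-the-envelope memory estimate for a dense 2-D array.
BYTES_PER_VALUE = {"float64": 8, "float32": 4}


def dense_gib(n_rows: int, n_cols: int, dtype: str = "float64") -> float:
    """Return the size in GiB of a dense (n_rows, n_cols) array of `dtype`."""
    return n_rows * n_cols * BYTES_PER_VALUE[dtype] / 2**30


# Shape taken from the error message in this issue.
shape = (26734, 606219)

print(f"float64, full matrix: {dense_gib(*shape):.1f} GiB")
print(f"float32, full matrix: {dense_gib(*shape, 'float32'):.1f} GiB")
# Assumption: rows are genes and only 2,000 of them are kept.
print(f"float32, 2,000 rows:  {dense_gib(2000, shape[1], 'float32'):.1f} GiB")
```

The first figure rounds to the 121 GiB in the error message. Switching to float32 only halves it, so reducing the number of genes is likely to matter far more than the data type alone.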

If these don't work, please consider using the combat function implemented in scanpy: https://scanpy.readthedocs.io/en/latest/api/generated/scanpy.pp.combat.html

theislab / scib issue #399