NVIDIA-Genomics-Research / rapids-single-cell-examples

Examples of single-cell genomic analysis accelerated with RAPIDS
Apache License 2.0

Illegal Memory Access: 1.3M cells RTX A6000 48GB out of memory on scaling step #116

Closed jyuan-sd closed 1 year ago

jyuan-sd commented 1 year ago

Hello! I am trying to run the 1.3M mouse brain example notebook on my local server with an RTX A6000. On the scaling step, where the CuPy mean operation is run, the GPU runs out of VRAM because sparse_gpu_array already takes up ~41GB of VRAM. How did you get this step to run on 16GB cards? Is there a way to batch the scaling? Or could it be offloaded to the CPU, with the final sparse matrix loaded back onto the GPU?

Thank you in advance! Joe

jyuan-sd commented 1 year ago

If anyone is facing the same issue, I solved it by replacing the scaling step code with the CuPy zscore function (cupyx.scipy.stats.zscore), which is UVM compatible. The z-score step now completes by using system memory as well via UVM. Here's the reference and the code.

https://docs.cupy.dev/en/stable/reference/generated/cupyx.scipy.stats.zscore.html

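```python
%%time
import cupy as cp
from cupyx.scipy.stats import zscore

# Changed to use cupy functions which allow for UVM
# Note: axis=None standardizes over the whole matrix; axis=0 would scale each gene (column) separately.
sparse_gpu_array = zscore(sparse_gpu_array, axis=None)
# Clip the scaled values to the range [-10, 10]
sparse_gpu_array = cp.clip(sparse_gpu_array, a_min=-10, a_max=10)
```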
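
On the batching question from the original post: a minimal sketch of scaling in column batches, so that temporaries stay proportional to one batch rather than the whole matrix. This is not the notebook's code. The function name `scale_in_column_batches` and the `batch_size` parameter are hypothetical, and the sketch assumes a dense, floating-point 2-D array of shape (cells, genes). It is shown with NumPy so it runs anywhere; it only uses array methods that CuPy arrays also provide, so passing a CuPy array should behave the same way, though that has not been measured on the 1.3M-cell dataset.

```python
import numpy as np


def scale_in_column_batches(x, max_value=10.0, batch_size=256):
    """Standardize each column of a dense float array in place, one block of columns at a time.

    Hypothetical helper: per-column mean 0 and standard deviation 1, then
    clipping to [-max_value, max_value].
    """
    n_cols = x.shape[1]
    for start in range(0, n_cols, batch_size):
        # Basic slicing returns a view, so the in-place ops below modify x.
        block = x[:, start:start + batch_size]
        mean = block.mean(axis=0)
        std = block.std(axis=0)
        # Leave constant columns at zero instead of dividing by zero.
        std[std == 0] = 1.0
        block -= mean
        block /= std
        block.clip(-max_value, max_value, out=block)
    return x


# Small usage example on synthetic data (200 cells, 7 genes).
rng = np.random.default_rng(0)
demo = rng.normal(loc=5.0, scale=3.0, size=(200, 7))
scaled = scale_in_column_batches(demo.copy(), batch_size=3)
```

Unlike `zscore(..., axis=None)`, this scales each column separately. A smaller `batch_size` lowers peak memory at the cost of more kernel launches.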