NVIDIA-Genomics-Research / rapids-single-cell-examples

Examples of single-cell genomic analysis accelerated with RAPIDS
Apache License 2.0

[FEA] Add batching option to `filter_cells` and `filter_genes` in `rapids_scanpy_funcs` #53

Open cjnolet opened 3 years ago

cjnolet commented 3 years ago

Because the cuSPARSE API uses 32-bit integers to specify the size of the underlying workspaces in GPU memory, and because the SciPy/CuPy sparse APIs use 32-bit integers to specify the size of the underlying matrices, very large datasets run into problems when filtering cells and genes. We can work around this constraint in two ways: chunk the data across multiple GPUs using Dask, or batch the filters on a single GPU.
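A rough sketch of the single-GPU batching idea, to make the proposal concrete. It uses `scipy.sparse` on the CPU as a stand-in so it runs anywhere; the assumption is that the same row-slicing pattern would carry over to `cupyx.scipy.sparse`. The function name `filter_cells_batched` and the `rows_per_batch` parameter are hypothetical and are not the current `rapids_scanpy_funcs` signature.

```python
import numpy as np
import scipy.sparse as sp


def filter_cells_batched(X, min_genes, max_genes, rows_per_batch=10000):
    """Hypothetical helper: keep cells (rows) whose number of stored
    nonzero entries lies in [min_genes, max_genes], one row batch at a time."""
    X = sp.csr_matrix(X)
    kept, masks = [], []
    for start in range(0, X.shape[0], rows_per_batch):
        # Each batch is a small CSR matrix, so its sizes stay far below
        # the 32-bit limit even when the full matrix does not.
        batch = X[start:start + rows_per_batch]
        # getnnz counts stored entries per row (explicit zeros included).
        genes_per_cell = batch.getnnz(axis=1)
        mask = (genes_per_cell >= min_genes) & (genes_per_cell <= max_genes)
        masks.append(mask)
        if mask.any():
            kept.append(batch[mask])
    if not kept:
        # Nothing passed the filter (or X had no rows).
        empty = sp.csr_matrix((0, X.shape[1]), dtype=X.dtype)
        return empty, np.concatenate(masks) if masks else np.zeros(0, dtype=bool)
    return sp.vstack(kept, format="csr"), np.concatenate(masks)
```

The returned boolean mask could be used to subset the cell barcodes in the same pass. Note that this only helps if the filtered result itself fits within the 32-bit limits; if it does not, the Dask route is probably the better fit.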

We should do this specifically for the 1M cells notebook, so that we can remove the `on_device` argument.