NVIDIA-Genomics-Research / rapids-single-cell-examples

Examples of single-cell genomic analysis accelerated with RAPIDS
Apache License 2.0
318 stars 68 forks

rank_genes_groups fix #86

Closed Intron7 closed 1 year ago

Intron7 commented 2 years ago

Dear all,

I fixed the rank_genes_groups function. However, I changed quite a bit of the preprocessing, e.g. I moved the masking from cupy to numpy. In the notebooks that I was able to test (so not the 1 million cell notebooks) this didn't negatively impact performance. I might have even sped it up a bit. However, I don't know if this will also be the case for the 1 million cell notebooks, so you might want to test this before merging. I also changed the input so that it now uses the adata object with a groupby variable that takes an .obs column. If you want me to, I can also include GPU functions for diffusion maps and draw_graph (force atlas 2).
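
For readers skimming the thread, here is a minimal sketch of what host-side masking driven by an .obs column can look like. It is an illustration only, not the code in this PR: `group_masks` is a hypothetical helper, the `leiden` column and the toy matrix are made up, and a plain pandas DataFrame stands in for `adata.obs`.

```python
import numpy as np
import pandas as pd


def group_masks(obs: pd.DataFrame, groupby: str) -> dict:
    """Build one boolean NumPy mask per category of obs[groupby].

    Hypothetical helper for illustration; not part of the PR.
    """
    labels = obs[groupby].astype("category")
    codes = labels.cat.codes.to_numpy()
    # One mask per category, computed on the host with NumPy.
    return {cat: codes == i for i, cat in enumerate(labels.cat.categories)}


# Stand-in for adata.obs with a made-up clustering column.
obs = pd.DataFrame({"leiden": ["0", "1", "0", "2", "1", "0"]})
masks = group_masks(obs, "leiden")

# Toy expression matrix: 6 cells x 2 genes.
X = np.arange(12, dtype=np.float32).reshape(6, 2)

# Per-group mean expression, selected with the boolean masks.
group_means = {cat: X[mask].mean(axis=0) for cat, mask in masks.items()}
```

In a GPU pipeline the masks would then be used to select rows of the device-side matrix. Whether building them with numpy or cupy is faster at the 1 million cell scale is exactly the open question above.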

Intron7 commented 2 years ago

I was just able to confirm that this version is a lot faster than the version in the newest release. On my A100 80GB it runs in 1 min 4 s vs 1 min 51 s. It also avoids some errors that are still present in that release and in the publication.

cjnolet commented 1 year ago

@Intron7, I know it's been a while since you opened this PR. If you can fix the merge conflicts here, I'll give it a review.