10XGenomics / cellranger

10x Genomics Single Cell Analysis
https://www.10xgenomics.com/support/software/cell-ranger

Saving memory with np.argsort #86

Closed andreasg123 closed 4 years ago

andreasg123 commented 4 years ago

If you want to save memory here, wouldn't this work? https://github.com/10XGenomics/cellranger/commit/b9516e74ef69c07776065c547038c85847a797c1#diff-eb6a1b3cecae573836dc147984d3a928

srt_order = np.argsort(counts_per_bc)[::-1]

You would avoid creating the Python list of integers and get a NumPy array instead. The slice operation for reversing the array is O(1) because it creates a non-contiguous view rather than a copy.

As a small caveat, if you rely on the sort being stable, the relative order of entries with equal counts may differ with this approach.
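To make the suggestion and the caveat concrete, here is a minimal sketch. The count values are made up for illustration, and `stable_desc` and `ties_kept` are hypothetical names that do not come from the Cell Ranger code. It shows that the reversed result is a view on the ascending index array, and that reversing a stable ascending sort flips the order of equal counts.

```python
import numpy as np

# Hypothetical per-barcode counts (illustrative values only).
counts_per_bc = np.array([5, 2, 9, 2, 7])

# Ascending argsort, then reverse with a slice; the slice is a view, not a copy.
ascending = np.argsort(counts_per_bc)
srt_order = ascending[::-1]
is_view = np.shares_memory(ascending, srt_order)

# The counts come out in descending order regardless of how ties are ordered.
sorted_counts = counts_per_bc[srt_order]

# Caveat: reversing a stable ascending sort reverses the order of ties,
# so the two barcodes with count 2 (indices 1 and 3) come out as 3, 1.
stable_desc = np.argsort(counts_per_bc, kind="stable")[::-1]

# One way to keep ties in their original order is a stable sort on the
# negated counts. This assumes signed integer counts and allocates a
# temporary array, so it gives up part of the memory saving.
ties_kept = np.argsort(-counts_per_bc, kind="stable")
```

Whether the tie order matters depends on what the downstream code does with `srt_order`. If only the descending order of the counts is needed, the plain reversed view should be enough.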

evolvedmicrobe commented 4 years ago

Wow, surprised you found that, and indeed what you suggest is a good idea!

This version of the code is actually outdated relative to what we have now, though. To fix a separate issue, we had to make the sort order depend on a few different conditions, so we can't use the simpler np.argsort(counts_per_bc)[::-1]. However, we're very glad to see that insight; please let us know if you have any others.

Cheers, Nigel
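The thread does not say what the additional sort conditions are. Purely as an illustration of sorting on several keys while staying in NumPy, a sketch with `np.lexsort` might look like the following. The `tie_breaker` array and its values are hypothetical, and this is not the current Cell Ranger implementation.

```python
import numpy as np

# Hypothetical per-barcode counts (illustrative values only).
counts_per_bc = np.array([5, 2, 9, 2, 7])

# Hypothetical secondary key used to order barcodes with equal counts.
tie_breaker = np.array([0, 3, 1, 1, 2])

# np.lexsort treats the LAST key in the tuple as the primary key.
# Negating the counts gives descending order by count (assumes signed
# integers); ties are then resolved by ascending tie_breaker.
order = np.lexsort((tie_breaker, -counts_per_bc))

sorted_counts = counts_per_bc[order]
```

Here the two barcodes with count 2 are ordered by `tie_breaker`, so index 3 (key 1) comes before index 1 (key 3). Like `np.argsort`, `np.lexsort` returns a NumPy index array rather than a Python list.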