scverse / rapids_singlecell

Rapids_singlecell: A GPU-accelerated tool for scRNA analysis. Offers seamless scverse compatibility for efficient single-cell data processing and analysis.
https://rapids-singlecell.readthedocs.io/
MIT License
149 stars 22 forks source link

Normalized expression values different from scanpy #208

Closed tjbencomo closed 5 months ago

tjbencomo commented 5 months ago

Describe the bug Hi - thanks for making this software! Very much needed, especially as single cell datasets get very large with spatial transcriptomics platforms. I've noticed some differences between scanpy and rapids_singlecell that are significantly affecting my results so I went back and started to compare how output from the two packages differed.

I've noticed that normalize_total seems to produce different results between rapids_singlecell and scanpy. I realize that not all functions are expected to produce the exact same results, but figured that normalization should be equivalent since it's such a deterministic and basic step. I've tried my best to create a good head to head comparison below where parameters are the same for both function calls.

Thanks in advance for the help!

Steps/Code to reproduce bug I downloaded the example dataset used in the Demo Workflow and Decoupler notebook using this URL from the data_downloader notebook that accompanies the demo notebook.

I then ran the following code:

import anndata as ad
import cupy as cp
import numpy as np
import rapids_singlecell as rsc
import scanpy as sc

import rmm
from rmm.allocators.cupy import rmm_cupy_allocator

rmm.reinitialize(
    managed_memory=True,  # Allows oversubscription
    pool_allocator=False,  # default is False
    devices=0,  # GPU device IDs to register. By default registers only GPU 0.
)
cp.cuda.set_allocator(rmm_cupy_allocator)
import gc

# Example dataset from Demo Workflow and Decoupler notebook
## cpudata: anndata object processed with scanpy
cpudata = sc.read_h5ad("/hpc/temp/setty_m/tbencomo/adata.raw_compressed.h5ad")
## gpudata: anndata object processed with rapids_singlecell 
gpudata = cpudata.copy()

rsc.get.anndata_to_GPU(gpudata)
rsc.pp.normalize_total(gpudata, target_sum=1e4)
sc.pp.normalize_total(cpudata, target_sum=1e4)
rsc.get.anndata_to_CPU(gpudata)

print(f"Number values not matching after normalization: {np.sum(cpudata.X != gpudata.X)}")

Expected behavior I expect the expression matrices to be equivalent after normalization by the two libraries. np.sum(cpudata.X != gpudata.X) should produce a value of 0.

Environment details (please complete the following information):

Intron7 commented 5 months ago

Hi @tjbencomo,

Thank you for your detailed report and for using our software.

This is not a bug but an inherent property of floating-point values. Unlike integers, floats are approximations of the real value, meaning that with every computational step, some form of rounding occurs. When you add parallel processing into the mix, this rounding happens asynchronously, which can also cause slight changes in the results.

Therefore, achieving exact matches between CPU and GPU computations with 32-bit floating-point (FP32) precision is generally not possible. In testing, we typically verify that values are roughly the same, usually within a tolerance of 1e-5.

If you require more precision, you might consider switching to 64-bit floats (FP64). However, be aware that this comes with a significant increase in computational cost and memory usage.

I hope this clears up your issue.

Best regards, Severin

tjbencomo commented 4 months ago

Thanks for the explanation!