broadinstitute / CellBender

CellBender is a software package for eliminating technical artifacts from high-throughput single-cell RNA sequencing (scRNA-seq) data.
https://cellbender.rtfd.io
BSD 3-Clause "New" or "Revised" License

CUDA out of memory at MCKP estimator step #295

Open Terkild opened 9 months ago

Terkild commented 9 months ago

I am having issues running cellbender on a 10X scRNA-seq run that contains ~20,000 cells and includes CITE-seq for both surface antigens (ADT) and sample multiplexing (cell hashing; HTO). As suggested in other GitHub issues, I combined the gene expression, ADT and HTO matrices into a single anndata file (using anndata.concat), which I used as the input for cellbender. I initially ran out of CUDA memory during inference (my GPU only has 2 GB of RAM) and followed suggestions here to reduce --posterior-batch-size and increase --projected-ambient-count-threshold.

cellbender remove-background --cuda \
  --input combined_matrix/sample1/combined_matrix.h5ad \
  --output cellbender/sample1/feature_bc_matrix.h5 \
  --projected-ambient-count-threshold 2 \
  --posterior-batch-size 32 \
  --estimator-multiple-cpu \
  --cpu-threads 32

This allowed inference to be completed, but now I get a similar CUDA out of memory error during the MCKP estimator step:

cellbender:remove-background: Working on chunk (834/834)
cellbender:remove-background: Writing full posterior to cellbender/sample1/feature_bc_matrix_posterior.h5
cellbender:remove-background: Succeeded in writing posterior to file cellbender/sample1/feature_bc_matrix_posterior.h5
cellbender:remove-background: Added posterior object to checkpoint file.
cellbender:remove-background: 2023-10-03 11:58:15

cellbender:remove-background: Saved summary plots as cellbender/sample1/feature_bc_matrix.pdf
cellbender:remove-background: Saved cell barcodes in cellbender/sample1/feature_bc_matrix_cellbarcodes.csv
cellbender:remove-background: Computing target noise counts per gene for MCKP estimator
Traceback (most recent call last):
  File "/data/.snakemake/conda/e6bb4ae8648eaa8eff6d7ad116bc7695/bin/cellbender", line 8, in <module>
    sys.exit(main())
  File "/data/.snakemake/conda/e6bb4ae8648eaa8eff6d7ad116bc7695/lib/python3.7/site-packages/cellbender/base_cli.py", line 118, in main
    clidict[args.tool].run(args)
  File "/data/.snakemake/conda/e6bb4ae8648eaa8eff6d7ad116bc7695/lib/python3.7/site-packages/cellbender/remove_background/cli.py", line 185, in run
    return main(args)
  File "/data/.snakemake/conda/e6bb4ae8648eaa8eff6d7ad116bc7695/lib/python3.7/site-packages/cellbender/remove_background/cli.py", line 230, in main
    posterior = run_remove_background(args)
  File "/data/.snakemake/conda/e6bb4ae8648eaa8eff6d7ad116bc7695/lib/python3.7/site-packages/cellbender/remove_background/run.py", line 133, in run_remove_background
    file_name=file_name,
  File "/data/.snakemake/conda/e6bb4ae8648eaa8eff6d7ad116bc7695/lib/python3.7/site-packages/cellbender/remove_background/run.py", line 237, in compute_output_denoised_counts_reports_metrics
    per_gene=True,
  File "/data/.snakemake/conda/e6bb4ae8648eaa8eff6d7ad116bc7695/lib/python3.7/site-packages/cellbender/remove_background/posterior.py", line 1579, in compute_mean_target_removal_as_function
    device=device,
  File "/data/.snakemake/conda/e6bb4ae8648eaa8eff6d7ad116bc7695/lib/python3.7/site-packages/cellbender/remove_background/estimation.py", line 147, in estimate_noise
    device=device)
  File "/data/.snakemake/conda/e6bb4ae8648eaa8eff6d7ad116bc7695/lib/python3.7/site-packages/cellbender/remove_background/estimation.py", line 818, in apply_function_dense_chunks
    dense_tensor = torch.tensor(log_prob_sparse_to_dense(coo)).to(device)
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 1.93 GiB (GPU 0; 1.94 GiB total capacity; 170.25 MiB already allocated; 1.17 GiB free; 210.00 MiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

I tried rerunning cellbender without "--cuda" and it resumed the run after inference, but still failed, albeit with a different error not related to CUDA memory.

How can I reduce GPU memory requirements of the MCKP step? It seems to be unrelated to the batch-size?
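To make the figures in the error message easier to compare, here is a minimal sketch that parses the out-of-memory message quoted above and reports how far the request exceeds the free memory. It is not part of CellBender or PyTorch; the helper names (`parse_oom_message`, `_to_mib`) are hypothetical, and the regular expression only assumes the message layout shown in this thread.

```python
import re

# Unit multipliers, expressed in MiB.
_UNIT_TO_MIB = {"MiB": 1.0, "GiB": 1024.0}

# Matches the OOM message layout quoted in this thread (an assumption:
# other PyTorch versions may word the message differently).
_OOM_PATTERN = re.compile(
    r"Tried to allocate (?P<requested>[\d.]+ [MG]iB) "
    r"\(GPU \d+; (?P<total>[\d.]+ [MG]iB) total capacity; "
    r"(?P<allocated>[\d.]+ [MG]iB) already allocated; "
    r"(?P<free>[\d.]+ [MG]iB) free; "
    r"(?P<reserved>[\d.]+ [MG]iB) reserved in total by PyTorch\)"
)


def _to_mib(text):
    """Convert a string such as '1.93 GiB' to a float number of MiB."""
    value, unit = text.split()
    return float(value) * _UNIT_TO_MIB[unit]


def parse_oom_message(message):
    """Return the five memory figures of an OOM message, all in MiB."""
    match = _OOM_PATTERN.search(message)
    if match is None:
        raise ValueError("message does not match the expected OOM format")
    return {name: _to_mib(text) for name, text in match.groupdict().items()}


message = (
    "torch.cuda.OutOfMemoryError: CUDA out of memory. "
    "Tried to allocate 1.93 GiB (GPU 0; 1.94 GiB total capacity; "
    "170.25 MiB already allocated; 1.17 GiB free; "
    "210.00 MiB reserved in total by PyTorch)"
)
stats = parse_oom_message(message)
shortfall_mib = stats["requested"] - stats["free"]
print(
    f"requested {stats['requested']:.0f} MiB, free {stats['free']:.0f} MiB, "
    f"short by {shortfall_mib:.0f} MiB"
)
```

For this message the single request (1.93 GiB) is almost the size of the whole card (1.94 GiB), while reserved memory (210 MiB) is close to allocated memory (170.25 MiB). On those numbers, the `max_split_size_mb` hint in the message, which targets fragmentation, seems unlikely to help on its own.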

liboxun commented 5 months ago

I have the same exact problem.

Head of error log:

cellbender:remove-background: Command: cellbender remove-background --cuda --posterior-batch-size 64 --input /hpc/group/gersbachlab/boxun.li/Ana_Exp4/03.single-cell/CRISPRi_cCRE/t1/v1/count/forced_24000/pool1/outs/raw_feature_bc_matrix.h5 --output /hpc/group/gersbachlab/boxun.li/Ana_Exp4/03.single-cell/CRISPRi_cCRE/t1/v2/cellbender_outputs/pool1/cellbender_output.h5
cellbender:remove-background: CellBender 0.3.0
cellbender:remove-background: (Workflow hash bd48796610)
cellbender:remove-background: 2024-01-12 16:39:24
cellbender:remove-background: Running remove-background
cellbender:remove-background: Loading data from /hpc/group/gersbachlab/boxun.li/Ana_Exp4/03.single-cell/CRISPRi_cCRE/t1/v1/count/forced_24000/pool1/outs/raw_feature_bc_matrix.h5
cellbender:remove-background: CellRanger v3 format
cellbender:remove-background: Features in dataset: 3344 CRISPR Guide Capture, 36601 Gene Expression
cellbender:remove-background: Trimming features for inference.

Tail of error log (showing error message):

    file_name=file_name,
  File "/hpc/group/gersbachlab/boxun.li/Software/miniconda3/envs/cellbender/lib/python3.7/site-packages/cellbender/remove_background/run.py", line 237, in compute_output_denoised_counts_reports_metrics
    per_gene=True,
  File "/hpc/group/gersbachlab/boxun.li/Software/miniconda3/envs/cellbender/lib/python3.7/site-packages/cellbender/remove_background/posterior.py", line 1579, in compute_mean_target_removal_as_function
    device=device,
  File "/hpc/group/gersbachlab/boxun.li/Software/miniconda3/envs/cellbender/lib/python3.7/site-packages/cellbender/remove_background/estimation.py", line 147, in estimate_noise
    device=device)
  File "/hpc/group/gersbachlab/boxun.li/Software/miniconda3/envs/cellbender/lib/python3.7/site-packages/cellbender/remove_background/estimation.py", line 818, in apply_function_dense_chunks
    dense_tensor = torch.tensor(log_prob_sparse_to_dense(coo)).to(device)
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 1.95 GiB (GPU 0; 10.58 GiB total capacity; 325.81 MiB already allocated; 1.59 GiB free; 350.00 MiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

Full error log: JobLog.396986.err.txt

meitalmaz commented 5 months ago

I also ran into this error, and nothing seems to work.

cellbender remove-background --cuda --projected-ambient-count-threshold 2 --input ../cellranger_results/sample_4495/raw_feature_bc_matrix.h5 --output filtered_feature_bc_matrix.h5 --posterior-batch-size 64 --estimator-multiple-cpu --cpu-threads 50

The error message:

  File "/home/meital/miniconda3/envs/cellbender/lib/python3.7/site-packages/cellbender/remove_background/cli.py", line 185, in run
    return main(args)
  File "/home/meital/miniconda3/envs/cellbender/lib/python3.7/site-packages/cellbender/remove_background/cli.py", line 230, in main
    posterior = run_remove_background(args)
  File "/home/meital/miniconda3/envs/cellbender/lib/python3.7/site-packages/cellbender/remove_background/run.py", line 133, in run_remove_background
    file_name=file_name,
  File "/home/meital/miniconda3/envs/cellbender/lib/python3.7/site-packages/cellbender/remove_background/run.py", line 237, in compute_output_denoised_counts_reports_metrics
    per_gene=True,
  File "/home/meital/miniconda3/envs/cellbender/lib/python3.7/site-packages/cellbender/remove_background/posterior.py", line 1579, in compute_mean_target_removal_as_function
    device=device,
  File "/home/meital/miniconda3/envs/cellbender/lib/python3.7/site-packages/cellbender/remove_background/estimation.py", line 147, in estimate_noise
    device=device)
  File "/home/meital/miniconda3/envs/cellbender/lib/python3.7/site-packages/cellbender/remove_background/estimation.py", line 822, in apply_function_dense_chunks
    s = fun(dense_tensor, **kwargs)
  File "/home/meital/miniconda3/envs/cellbender/lib/python3.7/site-packages/cellbender/remove_background/estimation.py", line 143, in _torch_mean
    return torch.matmul(x.exp(), c.t())
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 3.23 GiB (GPU 0; 5.80 GiB total capacity; 3.42 GiB already allocated; 2.17 GiB free; 3.46 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

full log: filtered_feature_bc_matrix.log
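The tracebacks in this thread fail either when a sparse chunk is converted to a dense tensor or when `x.exp()` is evaluated on that tensor, which creates a temporary of the same shape. A rough sketch of the arithmetic follows: how many bytes a dense chunk needs, and how many rows would fit in a given amount of free memory. The function names and the example shape are hypothetical and chosen only for illustration; nothing here is CellBender's own code, and the thread does not say whether CellBender lets you change the chunk size.

```python
GIB = 1024 ** 3


def dense_bytes(n_rows, n_cols, bytes_per_element=4):
    """Bytes needed for a dense n_rows x n_cols array (float32 by default)."""
    return n_rows * n_cols * bytes_per_element


def max_rows_per_chunk(n_cols, free_bytes, copies=2, bytes_per_element=4):
    """Largest number of rows whose dense chunk fits in free_bytes.

    copies=2 budgets for the chunk itself plus one same-sized temporary,
    such as the result of an elementwise exp().
    """
    bytes_per_row = n_cols * bytes_per_element * copies
    return free_bytes // bytes_per_row


# Hypothetical chunk shape, not taken from any log in this thread.
n_rows, n_cols = 500_000, 1_000
print(f"dense chunk: {dense_bytes(n_rows, n_cols) / GIB:.2f} GiB")
print(f"rows that fit in 1 GiB: {max_rows_per_chunk(n_cols, free_bytes=1 * GIB)}")
```

The point of the sketch is that the memory needed by this step scales with the dense chunk's shape, not with `--posterior-batch-size`, which would be consistent with the observation above that lowering the batch size did not help here.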

liboxun commented 5 months ago

I should update that requesting a GPU node with bigger memory (>320Gb) from our computing cluster worked for me, but of course this was a brute-force solution that might not work for everyone.

madubata commented 5 months ago

Hi, thank you for the CellBender tool. I am also having the same error on our GPU node, as seen in the attached log files. I am running CellBender in a conda environment on our Nvidia A40 GPU with 1024 GiB RAM and 104 logical cores. I am unable to request a higher memory node.

command in file logs.GPU_default.txt: cellbender remove-background --cuda --input /mainlab/data1/ix/MMMMM/staging/9999-GG/10x_analysis_9999-GG/Sample_9999-GG-2/raw_feature_bc_matrix.h5 --output /mainlab/data1/DSCoLab/MMMMM/cellbender/data/MMMMM-POOL-UM1-XXX2/GPU_learning_rate_0.0001/MMMMM-POOL-UM1-XXX2_GPU.h5

This error persists even after using the suggested options above: --posterior-batch-size 64 --projected-ambient-count-threshold 2

command in file logs.GPU_parameters.txt: cellbender remove-background --cuda --input /mainlab/data1/ix/MMMMM/staging/9999-GG/10x_analysis_9999-GG/Sample_9999-GG-2/raw_feature_bc_matrix.h5 --posterior-batch-size 64 --projected-ambient-count-threshold 2 --output /mainlab/data1/DSCoLab/MMMMM/cellbender/data/MMMMM-POOL-UM1-XXX2/GPU_learning_rate_0.0001/MMMMM-POOL-UM1-XXX2_GPU.h5

Excerpt of error message: torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 40.00 MiB (GPU 0; 15.73 GiB total capacity; 492.00 MiB already allocated; 11.12 MiB free; 512.00 MiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

logs.GPU_default.txt logs.GPU_parameters.txt
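One thing the excerpt above allows is a quick check of how much device memory is in use but not held by PyTorch's allocator. The sketch below does that subtraction. It assumes that "free" in the message is device-wide free memory and that "reserved" is the pool held by this process's PyTorch allocator; the function name is hypothetical.

```python
MIB_PER_GIB = 1024


def memory_outside_pytorch_mib(total_mib, free_mib, reserved_mib):
    """Device memory in use that PyTorch's caching allocator does not hold."""
    return total_mib - free_mib - reserved_mib


# Figures copied from the error excerpt above.
total_mib = 15.73 * MIB_PER_GIB
outside_mib = memory_outside_pytorch_mib(
    total_mib, free_mib=11.12, reserved_mib=512.00
)
print(f"{outside_mib / MIB_PER_GIB:.2f} GiB in use outside PyTorch's allocator")
```

Under those assumptions, roughly 15 GiB of the 15.73 GiB is occupied by something other than this run's tensors (the CUDA context plus, possibly, other processes on the same GPU), so checking what else is using the device may be worthwhile before tuning CellBender options.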