broadinstitute / CellBender

CellBender is a software package for eliminating technical artifacts from high-throughput single-cell RNA sequencing (scRNA-seq) data.
https://cellbender.rtfd.io
BSD 3-Clause "New" or "Revised" License

Computing the output in asynchronous chunks in parallel takes longer than 144 hours #362

Open Erhei130 opened 1 month ago

Erhei130 commented 1 month ago

Hello,

Thank you for creating this very useful tool. We have been using CellBender since version 0.1.0 and have applied it to many different projects. Recently, however, when running version 0.3.0 on certain samples, CellBender always gets stuck at the ‘cellbender:remove-background: Computing the output in asynchronous chunks in parallel…’ step, regardless of whether we use the CPU or the GPU, and this step does not complete within 144 hours (the maximum run time allowed on our HPC queue). The problematic samples generally have more than 25k cells, so we wonder whether the problem is related to having too many cells. When I run the same version on my other samples with around 10k cells, there are no issues and I get output very quickly. Below are my command and its output. Looking forward to hearing from you. Thank you!

cellbender remove-background \
    --input raw_feature_bc_matrix.h5 \
    --output cellbender_output.h5 \
    --fpr 0.01 \
    --epochs 150 \
    --cpu-threads 20 \
    --cuda \
    --estimator-multiple-cpu

cellbender:remove-background: Command: cellbender remove-background --input raw_feature_bc_matrix.h5 --output cellbender_output.h5 --fpr 0.01 --epochs 150 --cpu-threads 20 --cuda --estimator-multiple-cpu
cellbender:remove-background: CellBender 0.3.0
cellbender:remove-background: (Workflow hash 89a1b742b1)
cellbender:remove-background: 2024-05-16 16:21:04
cellbender:remove-background: Running remove-background
cellbender:remove-background: Loading data from raw_feature_bc_matrix.h5
cellbender:remove-background: CellRanger v3 format
cellbender:remove-background: Features in dataset: 36601 Gene Expression
cellbender:remove-background: Trimming features for inference.
cellbender:remove-background: 31584 features have nonzero counts.
cellbender:remove-background: Prior on counts for cells is 6989
cellbender:remove-background: Prior on counts for empty droplets is 1465
cellbender:remove-background: Excluding 9480 features that are estimated to have <= 0.1 background counts in cells.
cellbender:remove-background: Including 22104 features in the analysis.
cellbender:remove-background: Trimming barcodes for inference.
cellbender:remove-background: Excluding barcodes with counts below 732
cellbender:remove-background: Using 5381 probable cell barcodes, plus an additional 26586 barcodes, and 57074 empty droplets.
cellbender:remove-background: Largest surely-empty droplet has 1619 UMI counts.
cellbender:remove-background: Attempting to unpack tarball "ckpt.tar.gz" to /tmp/tmpvtnbsk1n
cellbender:remove-background: Successfully unpacked tarball to /tmp/tmpvtnbsk1n
/tmp/tmpvtnbsk1n/89a1b742b1_random.pyro
/tmp/tmpvtnbsk1n/89a1b742b1_random.cuda
/tmp/tmpvtnbsk1n/89a1b742b1_model.torch
/tmp/tmpvtnbsk1n/89a1b742b1_optim.torch
/tmp/tmpvtnbsk1n/89a1b742b1_optim.pyro
/tmp/tmpvtnbsk1n/89a1b742b1_params.pyro
/tmp/tmpvtnbsk1n/89a1b742b1_train.loaderstate
/tmp/tmpvtnbsk1n/89a1b742b1_test.loaderstate
/tmp/tmpvtnbsk1n/89a1b742b1_args.npy
/tmp/tmpvtnbsk1n/posterior.h5
cellbender:remove-background: Loaded partially-trained checkpoint from ckpt.tar.gz
cellbender:remove-background: Checkpoint loaded successfully.
cellbender:remove-background: Running inference...
cellbender:remove-background: 2024-05-16 16:21:47
cellbender:remove-background: Inference procedure complete.
cellbender:remove-background: Attempting to unpack tarball "ckpt.tar.gz" to /tmp/tmpku49v76z
cellbender:remove-background: Successfully unpacked tarball to /tmp/tmpku49v76z
/tmp/tmpku49v76z/89a1b742b1_random.pyro
/tmp/tmpku49v76z/89a1b742b1_random.cuda
/tmp/tmpku49v76z/89a1b742b1_model.torch
/tmp/tmpku49v76z/89a1b742b1_optim.torch
/tmp/tmpku49v76z/89a1b742b1_optim.pyro
/tmp/tmpku49v76z/89a1b742b1_params.pyro
/tmp/tmpku49v76z/89a1b742b1_train.loaderstate
/tmp/tmpku49v76z/89a1b742b1_test.loaderstate
/tmp/tmpku49v76z/89a1b742b1_args.npy
/tmp/tmpku49v76z/posterior.h5
cellbender:remove-background: Loaded pre-computed posterior from posterior.h5
cellbender:remove-background: 2024-05-16 16:22:13

cellbender:remove-background: Saved summary plots as cellbender_output.pdf
cellbender:remove-background: Saved cell barcodes in cellbender_output_cell_barcodes.csv
cellbender:remove-background: Computing target noise counts per gene for MCKP estimator
cellbender:remove-background: Using MCKP noise targets computed for FPR 0.01
cellbender:remove-background: Computing denoised counts using mckp estimator
cellbender:remove-background: Dividing dataset into chunks of genes
cellbender:remove-background: Computing the output in asynchronous chunks in parallel...
User defined signal 2
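
For anyone triaging this, the problem size that the estimator step has to handle can be read off the log above. The following is a minimal sketch that pulls the barcode and feature counts out of the two relevant log lines and adds them up. It assumes the log wording stays exactly as shown in this run; the function name `summarize_problem_size` is hypothetical and not part of CellBender, and the sketch only summarizes the log, it does not diagnose the stall.

```python
import re

# Two lines copied verbatim from the log in this issue.
LOG = """\
cellbender:remove-background: Including 22104 features in the analysis.
cellbender:remove-background: Using 5381 probable cell barcodes, plus an additional 26586 barcodes, and 57074 empty droplets.
"""


def summarize_problem_size(log_text):
    """Return (barcodes, features) parsed from remove-background log text.

    Hypothetical helper: the regular expressions assume the log phrasing
    seen in this particular run.
    """
    features = int(re.search(r"Including (\d+) features", log_text).group(1))
    match = re.search(
        r"Using (\d+) probable cell barcodes, plus an additional (\d+) barcodes, "
        r"and (\d+) empty droplets",
        log_text,
    )
    # Total barcodes analyzed = probable cells + additional barcodes + empties.
    barcodes = sum(int(group) for group in match.groups())
    return barcodes, features


barcodes, features = summarize_problem_size(LOG)
print(f"{barcodes} barcodes x {features} features")
```

For this run the sketch reports 89041 barcodes and 22104 features, which may be a useful number to compare against the samples of around 10k cells that finish quickly.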