broadinstitute / CellBender

CellBender is a software package for eliminating technical artifacts from high-throughput single-cell RNA sequencing (scRNA-seq) data.
https://cellbender.rtfd.io
BSD 3-Clause "New" or "Revised" License

How can I request cellbender to limit/use more threads/cores? #160

Closed angelasanzo closed 1 year ago

angelasanzo commented 2 years ago

Hi there,

I have tried to find a way to limit or increase the number of threads that CellBender uses, but have not found one. Is there an argument I can provide for this?

Thank you very much.

Angela

sjfleming commented 2 years ago

Hi @angelasanzo , unfortunately, I don't know of a good way to make this happen. CellBender does not have an input argument to enable this currently.

(If anybody else knows how to do this, please post here!) It looks like there might be some ways to get pytorch to run using multiple threads on CPU. Unfortunately I have never tried this, and don't know the benefits / limitations.
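One generic option, independent of any CellBender argument, is to cap the thread pools through environment variables before the process starts. PyTorch's CPU backend typically relies on OpenMP and Intel MKL, which read `OMP_NUM_THREADS` and `MKL_NUM_THREADS`. The sketch below is only an illustration: the helper names `thread_limited_env` and `run_with_thread_limit` are made up for this example, and whether this changes CellBender's runtime has not been tested here.

```python
import os
import subprocess
import sys


def thread_limited_env(n_threads, base_env=None):
    """Return a copy of the environment with common CPU thread caps set."""
    if n_threads < 1:
        raise ValueError("n_threads must be at least 1")
    env = dict(os.environ if base_env is None else base_env)
    # OMP_NUM_THREADS and MKL_NUM_THREADS are read by OpenMP and Intel MKL,
    # the libraries PyTorch typically uses for CPU parallelism.
    env["OMP_NUM_THREADS"] = str(n_threads)
    env["MKL_NUM_THREADS"] = str(n_threads)
    return env


def run_with_thread_limit(command, n_threads):
    """Run `command` (a list of strings) in a child process with capped threads."""
    return subprocess.run(command, env=thread_limited_env(n_threads), check=False)


if __name__ == "__main__":
    # Demonstration with a harmless child process that echoes the variable.
    child = [sys.executable, "-c", "import os; print(os.environ['OMP_NUM_THREADS'])"]
    run_with_thread_limit(child, 4)
```

The same effect can be had from a shell by exporting the two variables before launching the program; the wrapper is only convenient when the launch is scripted from Python.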

I'm guessing you are running on a CPU? Is that right?

sjfleming commented 2 years ago

Note to self: it is possible this is as simple as

torch.set_num_threads(n)

Test this.

sjfleming commented 2 years ago

A potentially useful example by Bo Li and collaborators is here: https://github.com/lilab-bcb/harmony-pytorch/blob/436e53cb3db1e51f3e23eb63fe2f3863e9e31f2d/harmony/harmony.py#L115

angelasanzo commented 1 year ago

Thanks a lot for your work! Yes, we are running on CPU, as our graphics card does not support CUDA.

sjfleming commented 1 year ago

Unfortunately, I tested

import psutil
import torch
n_jobs = psutil.cpu_count(logical=False)  # get physical cores
if n_jobs is None:
    n_jobs = psutil.cpu_count(logical=True)  # if undetermined, use logical cores instead
torch.set_num_threads(n_jobs)  # set the number of threads PyTorch uses for CPU ops

and it doesn't make any difference in terms of runtime. So it's not that simple...

Will keep thinking...
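For anyone who wants to repeat a runtime comparison like the one above, a minimal timing harness might look like the following. This is a sketch under assumptions: `time_thread_counts` is a hypothetical helper, and the matrix multiplication is only a stand-in workload, not CellBender's training loop, so the numbers it prints say nothing about CellBender itself. The PyTorch part is skipped if `torch` is not installed.

```python
import time


def time_thread_counts(workload, set_threads, thread_counts, repeats=3):
    """Return {n_threads: best wall-clock seconds} for calling `workload()`."""
    results = {}
    for n in thread_counts:
        set_threads(n)  # e.g. torch.set_num_threads
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            workload()
            best = min(best, time.perf_counter() - start)
        results[n] = best
    return results


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        torch = None

    if torch is not None:
        a = torch.randn(1000, 1000)

        def matmul_workload():
            # Stand-in workload; NOT representative of CellBender's training loop.
            torch.mm(a, a)

        timings = time_thread_counts(matmul_workload, torch.set_num_threads, [1, 2, 4])
        for n, seconds in timings.items():
            print(f"{n} threads: {seconds:.4f} s")
```

Taking the best of several repeats reduces noise from other processes, which matters when the expected difference is only a few percent.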

sjfleming commented 1 year ago

Eh, okay, it's possible that using the number of logical cores is better. But I'm only seeing a speedup of like 4% :(

Still, I might include it as an input argument in the future. Default will use the number of logical cores.

sjfleming commented 1 year ago

Closed by #238. The input argument is --cpu-threads. Unfortunately, limited testing suggests it does not make a big difference.
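As a rough illustration of how a thread-count argument with a logical-core default could be wired up, here is a minimal sketch. It is not the code from #238: the functions `build_parser`, `resolve_cpu_threads`, and `apply_cpu_threads` are hypothetical, and only the flag name `--cpu-threads` and the call `torch.set_num_threads` come from this thread.

```python
import argparse
import os


def build_parser():
    parser = argparse.ArgumentParser(description="Hypothetical thread-count example")
    parser.add_argument(
        "--cpu-threads",
        type=int,
        default=None,
        help="Number of CPU threads; defaults to the number of logical cores.",
    )
    return parser


def resolve_cpu_threads(requested):
    """Fall back to the logical core count when no value was requested."""
    if requested is None:
        return os.cpu_count() or 1  # os.cpu_count() counts logical cores
    if requested < 1:
        raise ValueError("--cpu-threads must be a positive integer")
    return requested


def apply_cpu_threads(n_threads):
    """Pass the thread count to PyTorch if it is installed; otherwise do nothing."""
    try:
        import torch
    except ImportError:
        return False
    torch.set_num_threads(n_threads)
    return True


if __name__ == "__main__":
    # Explicit argument list for demonstration; a real CLI would parse sys.argv.
    args = build_parser().parse_args(["--cpu-threads", "4"])
    n = resolve_cpu_threads(args.cpu_threads)
    applied = apply_cpu_threads(n)
    print(f"cpu threads: {n} (applied to torch: {applied})")
```

Keeping the default as `None` and resolving it later makes it possible to distinguish "not specified" from an explicit request.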

BradBalderson commented 10 months ago

Hey @sjfleming,

Maybe off topic, but it seems that --cpu-threads is causing a stall when writing the output. I ran cellbender like this:

cellbender remove-background --cuda --estimator-multiple-cpu --cpu-threads 21 --epochs 40 --checkpoint-mins 60 --input {sample_input_counts} --output {sample_out_dir}

It gives me this output:

cellbender:remove-background: Command: cellbender remove-background --cuda --estimator-multiple-cpu --cpu-threads 21 --epochs 40 --checkpoint-mins 60 --input /home/jovyan/data4/UCSD_Oxy-multiome/Oxy-multiome//FTL_702_M957/outs/raw_feature_bc_matrix.h5 --output /home/jovyan/data4/bbalderson_runAI/rat_multiomics/data/cellbender_out//FTL_702_M957/
cellbender:remove-background: CellBender 0.3.0
cellbender:remove-background: (Workflow hash 8a1259eac2)
cellbender:remove-background: 2023-12-04 17:46:12
cellbender:remove-background: Running remove-background
cellbender:remove-background: Loading data from /home/jovyan/data4/UCSD_Oxy-multiome/Oxy-multiome//FTL_702_M957/outs/raw_feature_bc_matrix.h5
cellbender:remove-background: CellRanger v3 format
cellbender:remove-background: Features in dataset: 33294 Gene Expression, 96665 Peaks
cellbender:remove-background: Trimming features for inference.
cellbender:remove-background: 122701 features have nonzero counts.
cellbender:remove-background: Prior on counts for cells is 4609
cellbender:remove-background: Prior on counts for empty droplets is 72
cellbender:remove-background: Excluding 11105 features that are estimated to have <= 0.1 background counts in cells.
cellbender:remove-background: Including 111596 features in the analysis.
cellbender:remove-background: Trimming barcodes for inference.
cellbender:remove-background: Excluding barcodes with counts below 36
cellbender:remove-background: Using 10437 probable cell barcodes, plus an additional 24850 barcodes, and 49210 empty droplets.
cellbender:remove-background: Largest surely-empty droplet has 126 UMI counts.
cellbender:remove-background: Attempting to unpack tarball "ckpt.tar.gz" to /tmp/tmpjiir3c4e
cellbender:remove-background: Successfully unpacked tarball to /tmp/tmpjiir3c4e
/tmp/tmpjiir3c4e/8a1259eac2_params.pyro
/tmp/tmpjiir3c4e/8a1259eac2_random.cuda
/tmp/tmpjiir3c4e/8a1259eac2_optim.torch
/tmp/tmpjiir3c4e/posterior.h5
/tmp/tmpjiir3c4e/8a1259eac2_args.npy
/tmp/tmpjiir3c4e/8a1259eac2_optim.pyro
/tmp/tmpjiir3c4e/8a1259eac2_model.torch
/tmp/tmpjiir3c4e/8a1259eac2_train.loaderstate
/tmp/tmpjiir3c4e/8a1259eac2_test.loaderstate
/tmp/tmpjiir3c4e/8a1259eac2_random.pyro
cellbender:remove-background: Loaded partially-trained checkpoint from ckpt.tar.gz
cellbender:remove-background: Checkpoint loaded successfully.
cellbender:remove-background: Running inference...
cellbender:remove-background: 2023-12-04 17:47:38
cellbender:remove-background: Inference procedure complete.
cellbender:remove-background: Attempting to unpack tarball "ckpt.tar.gz" to /tmp/tmpheuaf78f
cellbender:remove-background: Successfully unpacked tarball to /tmp/tmpheuaf78f
/tmp/tmpheuaf78f/8a1259eac2_params.pyro
/tmp/tmpheuaf78f/8a1259eac2_random.cuda
/tmp/tmpheuaf78f/8a1259eac2_optim.torch
/tmp/tmpheuaf78f/posterior.h5
/tmp/tmpheuaf78f/8a1259eac2_args.npy
/tmp/tmpheuaf78f/8a1259eac2_optim.pyro
/tmp/tmpheuaf78f/8a1259eac2_model.torch
/tmp/tmpheuaf78f/8a1259eac2_train.loaderstate
/tmp/tmpheuaf78f/8a1259eac2_test.loaderstate
/tmp/tmpheuaf78f/8a1259eac2_random.pyro
cellbender:remove-background: Loaded pre-computed posterior from posterior.h5
cellbender:remove-background: 2023-12-04 17:48:53

cellbender:remove-background: Saved summary plots as /home/jovyan/data4/bbalderson_runAI/rat_multiomics/data/cellbender_out//FTL_702_M957/.pdf
cellbender:remove-background: Saved cell barcodes in /home/jovyan/data4/bbalderson_runAI/rat_multiomics/data/cellbender_out//FTL_702_M957/_cell_barcodes.csv
cellbender:remove-background: Computing target noise counts per gene for MCKP estimator
cellbender:remove-background: Using MCKP noise targets computed for FPR 0.01
cellbender:remove-background: Computing denoised counts using mckp estimator
cellbender:remove-background: Dividing dataset into chunks of genes

It seems to stall here: 21 threads are started, but they are using 0% CPU, and no output is written beyond the .pdf and the _cell_barcodes.csv. I am about to try running without setting the number of threads to see if that helps.