Closed angelasanzo closed 1 year ago
Hi @angelasanzo, unfortunately, I don't know of a good way to make this happen. CellBender does not currently have an input argument to enable this.
(If anybody else knows how to do this, please post here!) It looks like there might be some ways to get pytorch to run using multiple threads on CPU. Unfortunately I have never tried this, and don't know the benefits / limitations.
I'm guessing you are running on a CPU? Is that right?
Note to self: it is possible this is as simple as
torch.set_num_threads(n)
Test this.
Potentially useful example by Bo Li and collaborators here https://github.com/lilab-bcb/harmony-pytorch/blob/436e53cb3db1e51f3e23eb63fe2f3863e9e31f2d/harmony/harmony.py#L115
Thanks a lot for your work! Yes, we are running on CPU, because our graphics card does not support CUDA.
Unfortunately, I tested
    import psutil
    import torch

    n_jobs = psutil.cpu_count(logical=False)  # get physical cores
    if n_jobs is None:
        n_jobs = psutil.cpu_count(logical=True)  # if undetermined, use logical cores instead
    torch.set_num_threads(n_jobs)
and it doesn't make any difference in terms of runtime. So it's not that simple...
Will keep thinking...
Eh, okay, it's possible that using the number of logical cores is better. But I'm only seeing a speedup of like 4% :(
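For anyone who wants to check this on their own machine, here is a minimal sketch of a timing comparison across thread counts. The helper name `time_thread_settings` and the workload are hypothetical and are not part of CellBender; the thread setter is passed in as an argument, on the assumption that you would supply `torch.set_num_threads`.

```python
import time
from typing import Callable, Dict, Iterable


def time_thread_settings(
    workload: Callable[[], object],
    thread_counts: Iterable[int],
    set_threads: Callable[[int], None],
    repeats: int = 3,
) -> Dict[int, float]:
    """Return the best wall-clock time (seconds) of `workload` per thread count.

    `set_threads` is whatever function configures the thread pool,
    for example torch.set_num_threads (assumed, not imported here).
    """
    timings: Dict[int, float] = {}
    for n in thread_counts:
        set_threads(n)  # apply the thread setting before timing
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            workload()
            best = min(best, time.perf_counter() - start)
        timings[n] = best  # keep the fastest of the repeated runs
    return timings
```

With PyTorch installed, something like `time_thread_settings(lambda: x @ x, [1, 2, 4, 8], torch.set_num_threads)` with `x = torch.randn(2000, 2000)` would give a rough picture. A matrix multiply is only a stand-in, so the result may not carry over to an actual remove-background run.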
Still, I might include it as an input argument in the future. Default will use the number of logical cores.
Closed by #238
The input argument is --cpu-threads. But limited testing shows me it unfortunately does not seem to make a big difference.
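For readers curious how an option like this can be wired up in general, below is a rough sketch. It is not CellBender's actual implementation: the parser, the `n_threads` destination, and the helper names are all hypothetical. The only assumptions taken from this thread are the flag name `--cpu-threads`, the default of using the logical core count, and the call to `torch.set_num_threads`.

```python
import argparse
import os
from typing import Optional


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical parser; only the flag name comes from this thread.
    parser = argparse.ArgumentParser(prog="example-tool")
    parser.add_argument(
        "--cpu-threads",
        type=int,
        default=None,
        dest="n_threads",
        help="Number of CPU threads (default: number of logical cores).",
    )
    return parser


def resolve_threads(requested: Optional[int]) -> int:
    """Fall back to the logical core count when no value is given."""
    if requested is None:
        return os.cpu_count() or 1  # os.cpu_count() counts logical cores
    if requested < 1:
        raise ValueError("--cpu-threads must be a positive integer")
    return requested


def apply_threads(n: int) -> bool:
    """Set the PyTorch intra-op thread count; return False if torch is absent."""
    try:
        import torch
    except ImportError:
        return False
    torch.set_num_threads(n)
    return True
```

A caller would then do roughly `apply_threads(resolve_threads(build_parser().parse_args().n_threads))` at startup, before any tensors are created.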
Hey @sjfleming,
Maybe off topic, but it seems that --cpu-threads is causing a stall when writing out the output? I ran cellbender like this:
cellbender remove-background --cuda --estimator-multiple-cpu --cpu-threads 21 --epochs 40 --checkpoint-mins 60 --input {sample_input_counts} --output {sample_out_dir}
It gives me this output:
cellbender:remove-background: Command: cellbender remove-background --cuda --estimator-multiple-cpu --cpu-threads 21 --epochs 40 --checkpoint-mins 60 --input /home/jovyan/data4/UCSD_Oxy-multiome/Oxy-multiome//FTL_702_M957/outs/raw_feature_bc_matrix.h5 --output /home/jovyan/data4/bbalderson_runAI/rat_multiomics/data/cellbender_out//FTL_702_M957/
cellbender:remove-background: CellBender 0.3.0
cellbender:remove-background: (Workflow hash 8a1259eac2)
cellbender:remove-background: 2023-12-04 17:46:12
cellbender:remove-background: Running remove-background
cellbender:remove-background: Loading data from /home/jovyan/data4/UCSD_Oxy-multiome/Oxy-multiome//FTL_702_M957/outs/raw_feature_bc_matrix.h5
cellbender:remove-background: CellRanger v3 format
cellbender:remove-background: Features in dataset: 33294 Gene Expression, 96665 Peaks
cellbender:remove-background: Trimming features for inference.
cellbender:remove-background: 122701 features have nonzero counts.
cellbender:remove-background: Prior on counts for cells is 4609
cellbender:remove-background: Prior on counts for empty droplets is 72
cellbender:remove-background: Excluding 11105 features that are estimated to have <= 0.1 background counts in cells.
cellbender:remove-background: Including 111596 features in the analysis.
cellbender:remove-background: Trimming barcodes for inference.
cellbender:remove-background: Excluding barcodes with counts below 36
cellbender:remove-background: Using 10437 probable cell barcodes, plus an additional 24850 barcodes, and 49210 empty droplets.
cellbender:remove-background: Largest surely-empty droplet has 126 UMI counts.
cellbender:remove-background: Attempting to unpack tarball "ckpt.tar.gz" to /tmp/tmpjiir3c4e
cellbender:remove-background: Successfully unpacked tarball to /tmp/tmpjiir3c4e
/tmp/tmpjiir3c4e/8a1259eac2_params.pyro
/tmp/tmpjiir3c4e/8a1259eac2_random.cuda
/tmp/tmpjiir3c4e/8a1259eac2_optim.torch
/tmp/tmpjiir3c4e/posterior.h5
/tmp/tmpjiir3c4e/8a1259eac2_args.npy
/tmp/tmpjiir3c4e/8a1259eac2_optim.pyro
/tmp/tmpjiir3c4e/8a1259eac2_model.torch
/tmp/tmpjiir3c4e/8a1259eac2_train.loaderstate
/tmp/tmpjiir3c4e/8a1259eac2_test.loaderstate
/tmp/tmpjiir3c4e/8a1259eac2_random.pyro
cellbender:remove-background: Loaded partially-trained checkpoint from ckpt.tar.gz
cellbender:remove-background: Checkpoint loaded successfully.
cellbender:remove-background: Running inference...
cellbender:remove-background: 2023-12-04 17:47:38
cellbender:remove-background: Inference procedure complete.
cellbender:remove-background: Attempting to unpack tarball "ckpt.tar.gz" to /tmp/tmpheuaf78f
cellbender:remove-background: Successfully unpacked tarball to /tmp/tmpheuaf78f
/tmp/tmpheuaf78f/8a1259eac2_params.pyro
/tmp/tmpheuaf78f/8a1259eac2_random.cuda
/tmp/tmpheuaf78f/8a1259eac2_optim.torch
/tmp/tmpheuaf78f/posterior.h5
/tmp/tmpheuaf78f/8a1259eac2_args.npy
/tmp/tmpheuaf78f/8a1259eac2_optim.pyro
/tmp/tmpheuaf78f/8a1259eac2_model.torch
/tmp/tmpheuaf78f/8a1259eac2_train.loaderstate
/tmp/tmpheuaf78f/8a1259eac2_test.loaderstate
/tmp/tmpheuaf78f/8a1259eac2_random.pyro
cellbender:remove-background: Loaded pre-computed posterior from posterior.h5
cellbender:remove-background: 2023-12-04 17:48:53
cellbender:remove-background: Saved summary plots as /home/jovyan/data4/bbalderson_runAI/rat_multiomics/data/cellbender_out//FTL_702_M957/.pdf
cellbender:remove-background: Saved cell barcodes in /home/jovyan/data4/bbalderson_runAI/rat_multiomics/data/cellbender_out//FTL_702_M957/_cell_barcodes.csv
cellbender:remove-background: Computing target noise counts per gene for MCKP estimator
cellbender:remove-background: Using MCKP noise targets computed for FPR 0.01
cellbender:remove-background: Computing denoised counts using mckp estimator
cellbender:remove-background: Dividing dataset into chunks of genes
It seems to stall here: 21 threads are started, but they are using 0% CPU, and nothing is written beyond the .pdf and the _cell_barcodes.csv. I am about to try running without setting the number of threads to see if that helps.
Hi there,
I have tried to find a way to limit or increase the number of threads that CellBender uses, but have not found one. Is there an argument that can be provided?
Thank you very much.
Angela