scverse / scvi-tools

Deep probabilistic analysis of single-cell and spatial omics data
http://scvi-tools.org/
BSD 3-Clause "New" or "Revised" License
1.16k stars 341 forks

scvi tools train with multi gpus #2726

Open shanzha9 opened 2 months ago

shanzha9 commented 2 months ago

Hi, developers,

The dataset used in my study consists of 2.6M cells, so training will take about a week. Does scvi-tools support multi-GPU training, and is there an official tutorial?

Thank you!

canergen commented 2 months ago

There is multi-GPU support. However, your estimate looks far off, and I don't think your dataset is large enough to benefit much from multi-GPU training. For example, on the human CELLXGENE Census dataset of 35 million cells, training for 100 epochs took less than 2 days (we actually had strikingly similar results after a couple of hours and 20 epochs). I would recommend increasing batch_size to 1024 (training time scales pretty linearly with batch size) and reducing the number of training epochs to 50 (which is what I usually use); you should then be able to train it in 2 hours. Let me know if it takes more than 8 hours (the installation might be wrong, or your object might not be correctly formatted).
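To make the batch-size suggestion concrete, here is a rough sketch (not taken from the scvi-tools docs) that counts optimizer steps for a 2.6M-cell dataset. The helper names `steps_per_epoch`, `total_steps`, and `train_scvi` are hypothetical, the baseline of 128 is an assumed default batch size, and the count ignores any train/validation split. `train_scvi` only shows where `batch_size` and `max_epochs` would be passed; it assumes `adata` holds raw counts in `adata.X`.

```python
def steps_per_epoch(n_cells: int, batch_size: int) -> int:
    """Minibatches needed to see every cell once (the last batch may be partial)."""
    return (n_cells + batch_size - 1) // batch_size


def total_steps(n_cells: int, batch_size: int, max_epochs: int) -> int:
    """Optimizer steps over the whole training run."""
    return steps_per_epoch(n_cells, batch_size) * max_epochs


def train_scvi(adata, batch_size: int = 1024, max_epochs: int = 50):
    """Train an SCVI model with the settings suggested in the reply above.

    Assumption: `adata` is an AnnData object with raw counts in `adata.X`.
    scvi is imported lazily so the step arithmetic runs without it installed.
    """
    import scvi

    scvi.model.SCVI.setup_anndata(adata)
    model = scvi.model.SCVI(adata)
    model.train(max_epochs=max_epochs, batch_size=batch_size)
    return model


if __name__ == "__main__":
    n_cells = 2_600_000  # dataset size from the question
    for batch_size in (128, 1024):  # 128 is an assumed baseline
        print(
            batch_size,
            steps_per_epoch(n_cells, batch_size),
            total_steps(n_cells, batch_size, max_epochs=50),
        )
```

Under these assumptions, going from a batch size of 128 to 1024 cuts the steps per epoch from 20313 to 2540, roughly the 8x reduction you would expect if training time scales about linearly with batch size.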

shanzha9 commented 2 months ago

@canergen Hi, thanks for your reply.

Could you please link the official multi-GPU tutorial? The data to be trained on is raw counts. I set

    scvi.settings.dl_num_workers = 30
    scvi.settings.num_threads = 30
    scvi.settings.batch_size = 2048

but %CPU stays under 20% per process. Could something be wrong with the dataloader?