theislab / scarches

Reference mapping for single-cell genomics
https://docs.scarches.org/en/latest/
BSD 3-Clause "New" or "Revised" License

scGen model is very slow #142

Open joseph-siefert opened 1 year ago

joseph-siefert commented 1 year ago

Thanks for the great tools! Is the scGen model in scArches using the newest scGen version? I can run scGen with no problem and it runs pretty fast, but when I run the scGen model through scArches it is very slow. I would like to run it with scArches so I can map query data onto the reference after correcting batch effects in the reference.

import scgen
import scarches as sca

# Runs fine: standalone scGen
scgen.SCGEN.setup_anndata(adata, batch_key="dataset", labels_key="cell_type")
model = scgen.SCGEN(adata)
model.train(
    max_epochs=100,
    batch_size=32,
    early_stopping=True,
    early_stopping_patience=25,
)

# Very slow: scGen model through scArches
epoch = 50
early_stopping_kwargs = {
    "early_stopping_metric": "val_loss",  # I have also tried elbo_metric and it is still very slow
    "patience": 20,
    "threshold": 0,
    "reduce_lr": True,
    "lr_patience": 13,
    "lr_factor": 0.1,
}
network = sca.models.scgen(adata=source_adata, hidden_layer_sizes=[256, 128])
network.train(n_epochs=epoch, early_stopping_kwargs=early_stopping_kwargs, use_gpu=True)

Am I missing something, or is the scGen model in scArches not optimal?

M0hammadL commented 1 year ago

Hi, thanks for trying it. Which step specifically is slower?

joseph-siefert commented 1 year ago

The training

M0hammadL commented 1 year ago

That could be because they are basically two different implementations. Are you running on CPU or GPU?

joseph-siefert commented 1 year ago

GPU

M0hammadL commented 1 year ago

Have you checked whether the model is using the GPU?

@alextopalova could you please check this out, and open an issue to import the scGen implementation from the theislab repo (theislab/scgen) here? The only adaptation needed would be a function for batch correction and reference mapping, which is implemented here and would need to be adapted to that version.

joseph-siefert commented 1 year ago

I'm not sure how to check. When I run scGen I get the output "GPU available: True (cuda), used: True". When I run scArches with the scGen model I do not get the same output. I do set the use_gpu=True flag, but I'm not sure how to verify that it is actually using the GPU.

Inspecting the resulting model shows that scArches uses the vaearith model of scGen. I don't see this stated explicitly when using scGen directly, but perhaps I don't know where this information is stored.

UPDATE: I believe it's running on CPU. I've monitored GPU usage from another notebook, and as far as I can tell it is not using the GPU. Also, when I start the process the CPU usage goes from 0 to 100%. I can run scGen with GPU in this same notebook, so it's not the environment. I have also set the use_gpu=True flag, but scArches still does not seem to be using the GPU for the scGen model.

torch.cuda.is_available()
True

I can even see the process logged on a CudaDevice, but GPU utilization is 0% and CPU utilization is 100%.
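
One way to answer the "is it actually on the GPU?" question above is to look at which device the model's weights live on. The following is only a minimal sketch: `parameter_devices` and `is_on_gpu` are hypothetical helper names, and the sketch assumes the object passed in behaves like a `torch.nn.Module` (it exposes `.parameters()`, and each parameter has a `.device`).

```python
from collections import Counter


def parameter_devices(module):
    """Count how many parameter tensors of a torch.nn.Module-like object live on each device."""
    # Relies only on the torch.nn.Module interface: .parameters() yields
    # tensors, and each tensor has a .device attribute.
    return Counter(str(p.device) for p in module.parameters())


def is_on_gpu(module):
    """Return True only if the module has parameters and all of them are on a CUDA device."""
    devices = parameter_devices(module)
    return bool(devices) and all(name.startswith("cuda") for name in devices)
```

For the scArches case this might be called as `is_on_gpu(network.model)`, assuming the wrapper exposes its underlying PyTorch module as `network.model` (an assumption, not confirmed in this thread). Weights on a CUDA device do not by themselves explain low utilization: the GPU can still sit idle while batches are prepared on the CPU.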

joseph-siefert commented 1 year ago

It appears this may have to do with the size of the matrix. If I use a very small subset, GPU utilization is slightly higher (10-12% max), and if I use a larger subset, GPU utilization is around 1-2% max. CPU utilization is still quite high intermittently, but training finishes in a reasonable amount of time. For a very large matrix the time is too long (several days). It seems that either a CPU step is causing a bottleneck for very large matrices, or the data loading to the GPU is not optimal.
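
To tell those two explanations apart, the time spent waiting for each batch can be measured separately from the time spent in the training step. This is a rough sketch under stated assumptions: `time_batches` is a hypothetical helper, `loader` is any iterable of batches, and `step` is whatever callable runs one training step. Neither is taken from the scArches code.

```python
import time


def time_batches(loader, step, max_batches=50):
    """Split wall-clock time into batch fetching and per-batch compute."""
    fetch_s = compute_s = 0.0
    n = 0
    it = iter(loader)
    while n < max_batches:
        t0 = time.perf_counter()
        try:
            batch = next(it)  # time spent producing the batch (CPU side)
        except StopIteration:
            break
        t1 = time.perf_counter()
        # For GPU work, `step` should end with torch.cuda.synchronize();
        # CUDA kernels run asynchronously, so without it the compute time
        # is under-reported.
        step(batch)
        t2 = time.perf_counter()
        fetch_s += t1 - t0
        compute_s += t2 - t1
        n += 1
    return {"batches": n, "fetch_s": fetch_s, "compute_s": compute_s}
```

If `fetch_s` dominates `compute_s`, the bottleneck is most likely in preparing batches on the CPU rather than in the GPU computation.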

M0hammadL commented 1 year ago

It should be the data training step then, which seems to be not very efficient. We will work on it, but that will take some time. Happy to merge if you open a PR here.

kotr98 commented 6 months ago

Any updates on this?