Closed: zhongguojie1998 closed this issue 3 years ago
How many pathways are you trying to model? And how much memory do you have available?
Hi, I am trying to model the c5.go.bp pathways, which include 7,481 pathways in total. I expect that filtering will leave ~5,000 pathways in my data. I have 24 GB of memory available.
The current implementation is optimized for speed and is designed for a few hundred pathways. We will consider a low-memory version for a future release.
I understand, thank you!
Hi @StefanGStark, I am considering using the lambdalabs cloud service for training. Could you estimate how much RAM and time it would require for 30k cells, 20k genes, and 2,875 pathways? Thank you very much for your help!
You can find details of how we implemented the current version in Section 2.2 of the preprint. If you keep the same architecture used in the paper (L=1, h_0=12, z=4) with your parameters (K=2875, g=20k), then memory usage can easily exceed 100 GB. Again, scaling to a problem of this size is not something we have addressed; here we opted to trade memory for speed (we found it much faster to work with large dense matrices on a GPU). It is also unclear whether hyperparameters like the learning rate and beta will transfer to your setting. We suggest adopting a more stringent filtering strategy, e.g., filter genes by a highly-variable criterion, and filter pathways so that the minimum pathway size is around 20.
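The filtering strategy suggested above could be sketched roughly as follows. This is an illustrative NumPy-only example, not part of the package's API: the function name, parameters, and data layout (a dense cells × genes matrix plus a dict of pathway gene sets) are all assumptions for demonstration.

```python
import numpy as np

def filter_genes_and_pathways(X, gene_names, pathways,
                              n_top_genes=2000, min_pathway_size=20):
    """Illustrative filter, not from the package itself.

    Keep the n_top_genes most variable genes, then drop any pathway
    that retains fewer than min_pathway_size member genes.

    X          : cells x genes expression matrix (dense numpy array)
    gene_names : list of gene names, one per column of X
    pathways   : dict mapping pathway name -> set of member gene names
    """
    # Rank genes by variance across cells (a simple highly-variable proxy)
    variances = X.var(axis=0)
    top = np.argsort(variances)[::-1][:n_top_genes]
    keep = set(np.asarray(gene_names)[top])

    # Keep only pathways that still have enough members after gene filtering
    filtered = {}
    for name, genes in pathways.items():
        members = set(genes) & keep
        if len(members) >= min_pathway_size:
            filtered[name] = members
    return sorted(keep), filtered
```

With a stringent gene filter and a minimum pathway size of ~20, both K and g shrink, which directly reduces the size of the dense K × g matrices that dominate memory use.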
Thank you!
Hi, I am running your package and the training process causes an OOM error on the GPU. Could you please look into that? Attached is my train.log.
Thank you!