Closed: zhongguojie1998 closed this issue 3 years ago
How many pathways are you trying to model? And how much memory do you have available?
Hi, I am trying to model the c5.go.bp pathways, which include 7,481 pathways in total. I expect that filtering will leave ~5,000 pathways in my data. I have 24 GB of memory available.
The current implementation is optimized for speed and is designed for a few hundred pathways. We will consider a low-memory version for a future release.
I understand, thank you!
Hi @StefanGStark, I am considering using the lambdalabs cloud service for training. Could you estimate how much RAM and time it would require for 30k cells, 20k genes, and 2,875 pathways? Thank you very much for your help!
You can find details of how we implemented the current version in Section 2.2 of the preprint. If you keep the same architecture used in the paper (L=1, h_0=12, z=4) with your parameters (K=2875, g=20k), then memory usage can easily exceed 100 GB. Again, scaling to a problem of this size is not something we have addressed; here we opted to trade memory for speed (we found it much faster to work with large dense matrices on a GPU). It is also unclear whether hyperparameters like the learning rate and beta will transfer to your setting. We suggest adopting a more stringent filtering strategy, e.g., filter genes by a highly-variable criterion, and filter pathways so that the minimum pathway size is around 20.
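The filtering strategy suggested above could be sketched roughly as follows. This is an illustrative NumPy-only example, not part of the package's API: the function name, parameters, and data layout (a dense cells × genes matrix plus a dict of pathway gene sets) are all assumptions for demonstration.

```python
import numpy as np

def filter_genes_and_pathways(X, gene_names, pathways,
                              n_top_genes=2000, min_pathway_size=20):
    """Illustrative filter, not from the package itself.

    Keep the n_top_genes most variable genes, then drop any pathway
    that retains fewer than min_pathway_size member genes.

    X          : cells x genes expression matrix (dense numpy array)
    gene_names : list of gene names, one per column of X
    pathways   : dict mapping pathway name -> set of member gene names
    """
    # Rank genes by variance across cells (a simple highly-variable proxy)
    variances = X.var(axis=0)
    top = np.argsort(variances)[::-1][:n_top_genes]
    keep = set(np.asarray(gene_names)[top])

    # Keep only pathways that still have enough members after gene filtering
    filtered = {}
    for name, genes in pathways.items():
        members = set(genes) & keep
        if len(members) >= min_pathway_size:
            filtered[name] = members
    return sorted(keep), filtered
```

With a stringent gene filter and a minimum pathway size of ~20, both K and g shrink, which directly reduces the size of the dense K × g matrices that dominate memory use.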
Thank you!
Hi, I am running your package and the training process causes an OOM error on the GPU. Could you please look into that? Attached is my train.log.
Thank you!