kieranrcampbell / ouija

Descriptive probabilistic marker gene approach to single-cell pseudotime inference
http://kieranrcampbell.github.io/ouija

Neverending sampling chain 1, cycling CPU #7

Open cemalley opened 6 years ago

cemalley commented 6 years ago

Hi Kieran, thanks for developing ouija. I'm testing it on a complete matrix of single-cell RNA-seq counts (1,386 cells x 28,000 genes). I tried 200 GB to 1 TB of memory and 1-4 CPUs. It uses a steady 600 GB of memory and cycles between 1 and 2 CPUs. With the code from the README, it has not finished (converged?) in over a day. Is the matrix too big for ouija?

I also notice there are lots of warnings before it says "SAMPLING FOR MODEL 'ouija' NOW (CHAIN 1)", and no other notices beyond that. Thanks for your advice.

    library(ouija)
    library(Seurat)
    load("Seurat.Object.RData")
    options(mc.cores = parallel::detectCores())
    oui <- ouija(as.matrix(seurat@data))


kieranrcampbell commented 6 years ago

Hi @cemalley

Ouija is primarily designed to be used with a small number of marker genes (paper is here) that are known a priori to be involved in the process of interest. So your options are:

  1. Use this small gene set if known
  2. Select ~ top 100 highly variable genes otherwise
  3. Consider using inference_type = 'vb', which runs much faster variational Bayes rather than HMC sampling
  4. Think about using ouijaflow, which should be much faster and work with more genes, though I'd still suggest using the 100-500 most variable (in log expression space) genes as input
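
As a rough illustration of option 2, here is a minimal sketch in base R of picking the most variable genes. The matrix is simulated and select_variable_genes is a hypothetical helper, not part of ouija. It assumes a genes x cells matrix that is already in log expression space. The commented-out call at the end assumes ouija wants cells in rows, which is worth checking against the package documentation before relying on it.

```r
# Simulated stand-in for a genes x cells log-expression matrix
# (hypothetical data; in this thread it would come from the Seurat object).
set.seed(1)
n_genes <- 2000
n_cells <- 300
expr <- matrix(
  rexp(n_genes * n_cells),
  nrow = n_genes,
  dimnames = list(paste0("gene", seq_len(n_genes)),
                  paste0("cell", seq_len(n_cells)))
)

# Hypothetical helper: keep the n_top genes with the largest variance
# across cells. Rows are genes, columns are cells.
select_variable_genes <- function(expr, n_top = 100) {
  gene_var <- apply(expr, 1, var)
  n_keep <- min(n_top, nrow(expr))
  top <- order(gene_var, decreasing = TRUE)[seq_len(n_keep)]
  expr[top, , drop = FALSE]
}

expr_hvg <- select_variable_genes(expr, n_top = 100)
dim(expr_hvg)

# Not run here because it needs the ouija package. The transpose assumes
# ouija expects a cells x genes matrix (an assumption to verify).
# oui <- ouija(t(expr_hvg), inference_type = "vb")
```

Plain variance is only one possible ranking. A mean-adjusted dispersion measure may behave better on real data, so treat the choice here as a placeholder.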

Also consider subsampling cells to ~ 200 while you figure out the best range of options.
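
A minimal sketch of that subsampling step in base R follows. The data are simulated, subsample_cells is a hypothetical helper rather than an ouija function, and the matrix is assumed to be genes x cells.

```r
# Simulated stand-in for a genes x cells matrix with 1386 cells
# (hypothetical data, matching only the cell count mentioned in this thread).
set.seed(42)
expr <- matrix(
  rexp(50 * 1386),
  nrow = 50,
  dimnames = list(paste0("gene", 1:50), paste0("cell", 1:1386))
)

# Hypothetical helper: draw n_cells columns at random without replacement,
# keeping the original column order.
subsample_cells <- function(expr, n_cells = 200) {
  n_keep <- min(n_cells, ncol(expr))
  keep <- sort(sample(ncol(expr), size = n_keep))
  expr[, keep, drop = FALSE]
}

expr_sub <- subsample_cells(expr, n_cells = 200)
dim(expr_sub)
```

Fixing the seed keeps the subset reproducible while comparing settings. Once a configuration looks reasonable, the same call can be repeated on the full set of cells.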

Hope that helps?

Kieran