Too long runtime for CoGAPS R

LiuCanidk commented 4 months ago

I tried to run CoGAPS on a relatively large single-cell dataset (35412 cells × 50000+ genes).

Even the minimum running time for me (nPatterns, i.e., k = 5, 6, 7, 8, 9, 10) was unacceptable with "sparseOptimization=True, nSets=20". As shown below, k=5 alone needs 2600+ h, more than 100 days! And k=11 needs even more, ~4000 h!

Did I miss something that could speed up the parallelization? Or would pyCoGAPS be much faster? (I noticed that in the Nature Protocols manuscript, pyCoGAPS shows only a slight increase in speed.)

Any suggestions would be greatly appreciated!

dimalvovs commented 4 months ago

Hi @LiuCanidk, the output shown above corresponds to standard CoGAPS. For a distributed run using nSets, it should report the distributed parameters before the run, something like this:

-- Distributed CoGAPS Parameters -- 
nSets          6 
cut            5 
minNS          3 
maxNS          9
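
For reference, here is a minimal sketch of how a distributed run might be configured in R so that this block appears. It assumes the CoGAPS Bioconductor package is installed and that `CogapsParams()` accepts `distributed` in your installed version. The values nPatterns = 5 and nSets = 20 come from the original post. `counts` is a hypothetical genes-by-cells matrix and is not defined here, so the final call is left commented out.

```r
# Sketch only: assumes the CoGAPS Bioconductor package is installed.
library(CoGAPS)

# Parameters mentioned in this thread, plus an explicit request for a
# distributed run over subsets of cells ("single-cell" mode).
params <- CogapsParams(
    nPatterns = 5,
    sparseOptimization = TRUE,
    distributed = "single-cell"
)

# nSets = 20 as in the original post; cut, minNS and maxNS keep their defaults.
params <- setDistributedParams(params, nSets = 20)

# Printing the object should list the distributed parameters
# if the settings took effect.
print(params)

# `counts` is a hypothetical genes-by-cells matrix (an assumption):
# result <- CoGAPS(counts, params)
```

If the printed parameters do not include a distributed section, the run is most likely still using standard CoGAPS, which would match the output reported above.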

dimalvovs commented 4 months ago

Closing as solved, please reopen if needed.

FertigLab / CoGAPS

Too long runtime for CoGAPS R #96