aksarkar opened this issue 1 year ago
@aksarkar I believe this is already implemented with the `nc` control argument in `de_analysis`; see "Details" in `help(de_analysis)`.
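For reference, a minimal sketch of such a call (assuming `fit` is a fitted topic model and `counts` is the counts matrix used to fit it):

```r
library(fastTopics)

# nc (number of threads) and ns (number of Monte Carlo samples) are
# passed through the "control" list, not as top-level arguments.
de <- de_analysis(fit, counts, control = list(ns = 1000, nc = 8))
```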
I see. I missed this in the documentation since I was expecting it to be in the function arguments, not in `control`.
However, when I tried it on my data set (a ~180K x 100K scATAC-seq count matrix), the process appears to use only one core and hangs.
From what I recall, one issue is that the parallel implementation is pretty memory-hungry (I tried to fix this, but did not have any luck). These were my settings for a scRNA-seq data set with ~90,000 cells. You might first try with a small number of samples, say `ns = 1000`. What is K here? My guess is that your run only used one core because it hung before it got to the multithreading part.
I think the issue is that `dat` is copied to every thread, and the memory usage could be drastically reduced if `pblapply` is taken over `cols` instead.
https://github.com/stephenslab/fastTopics/blob/master/R/lfc.R#L107-L121
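A hedged sketch of that suggestion; `dat`, `cols`, `nc`, and `compute_stats` are stand-ins for the actual objects in lfc.R, not the package's code:

```r
library(pbapply)

# Split the column indices into one chunk per core and subset *before*
# dispatching, so each worker is handed only its own slice of the data
# rather than a copy of the full matrix.
chunks <- split(cols, cut(seq_along(cols), nc, labels = FALSE))
slices <- lapply(chunks, function(j) dat[, j, drop = FALSE])
res    <- pblapply(slices, compute_stats, cl = nc)
out    <- do.call(cbind, res)
```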
Memory usage is an issue with `de_analysis`, as you correctly point out, but I think the issue is more fundamentally with `mclapply` (which is used by `pblapply`). Previously, I was using `parLapply`, which avoids these issues (and has the advantage that it is more platform-independent), but I ran into other unexpected issues with `parLapply`, so I ended up switching to `mclapply`. Certainly, some improvements here are warranted and I'm open to suggestions.
@pcarbo After digging into #37 and the details of `mclapply`, I think the memory usage issue is fundamentally because `mclapply` forks subprocesses, so every worker inherits a copy of the parent's entire memory.
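For context, a toy sketch (not fastTopics code) contrasting the two approaches discussed above:

```r
library(parallel)

big <- rnorm(1e7)  # a large object living in the parent R session

# PSOCK cluster (parLapply): workers are fresh R sessions, so nothing is
# inherited; objects are copied only when explicitly exported.
cl <- makeCluster(2)
clusterExport(cl, "big")
res1 <- parLapply(cl, 1:4, function(i) sum(big) + i)
stopCluster(cl)

# Fork-based mclapply (Unix-alikes only): each worker starts as a
# copy-on-write clone of the entire parent session, including `big`.
res2 <- mclapply(1:4, function(i) sum(big) + i, mc.cores = 2)
```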
There is still room to improve the total memory usage, although it's a bit difficult to profile (probably would require something like `docker stats`).
> There is still room to improve the total memory usage.
I agree 100%.
Thank you for the great package! I am also failing to run DE on a larger data set (a ~31,000 x 35,000 matrix). I am using the 'better-multithread' branch. Even on an HPC cluster it's still not getting past 0%. Is there a workaround to run it?
@hlszlaszlo Is your matrix a sparse matrix ("dgCMatrix")? Could you please share your exact `de_analysis()` call?
I was running it with counts as a regular matrix class. Now running with counts as a "dgCMatrix" and it seems much faster, thank you 🙂
Command: `de_analysis(fit, counts, pseudocount = 0.1, control = list(ns = 1e4, nc = 25))`
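For anyone hitting the same issue, a minimal sketch of the dense-to-sparse conversion (assuming `counts` is a dense numeric matrix):

```r
library(Matrix)

# Coerce the dense counts matrix to compressed sparse column format;
# the result is a "dgCMatrix", which de_analysis handles far more
# efficiently than a dense matrix.
counts <- as(counts, "CsparseMatrix")
class(counts)  # "dgCMatrix"
```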
I'm glad to hear that helped. When you are using `nc = 25`, also make sure that you have requested at least 25 cores in your job.
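One way to check this from inside R (a sketch; `availableCores()` is from the parallelly package, which, if installed, respects scheduler limits such as SLURM allocations):

```r
# All cores physically present on the node:
parallel::detectCores()
# Cores actually allocated to this job by the scheduler; nc should not
# exceed this number.
parallelly::availableCores()
```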
Yes, and I can see it using 25 cores. The sparse matrix helped. I can even run it with fewer cores on my personal computer.
Bumping up the priority of this issue based on some recent conversations with researchers trying to run `de_analysis` on a large data set. Testing for differential expression/accessibility is embarrassingly parallel, and could use all available cores by default.
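A hypothetical sketch of such a default, assuming `control$nc` is currently left unset unless the user provides it:

```r
# Hypothetical default: fall back to all physical cores when the user
# does not set nc in the control list.
if (is.null(control$nc))
  control$nc <- max(1, parallel::detectCores(logical = FALSE))
```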