dtm2117 opened this issue 6 years ago
On Sep 14, 2017, at 10:41 AM, dtm2117 notifications@github.com wrote:
Hello,
I am trying to run the error-modeling step on a sample with thousands of cells. I've increased k and min.nonfailed, but even with n.cores = 1 I have threads running on every core of the cluster. Furthermore, the error-modeling step has never finished.
Any thoughts on this issue?
— You are receiving this because you are subscribed to this thread. Reply to this email directly, or view it on GitHub: https://github.com/hms-dbmi/scde/issues/51
Here is the command: knn <- knn.error.models(cd_new_nodup, k = ncol(cd)/2, n.cores = 12, min.count.threshold = 1, min.nonfailed = 20, max.model.plots = 10)
The matrix is ~13k genes by ~4k cells. I've filtered out any genes with no expression.
This is also UMI data.
For the runtime issue, I think k needs to be lowered considerably, to something like 50 or 100. It just needs a sufficient number of neighboring cells to estimate the few parameters of the error model; 2k cells would definitely be overkill for that. I'm not sure about the number of cores, though. Can you check whether something like pagoda:::papply(1:1e2, function(x) rnorm(1e3), n.cores = 12) also uses too many cores? Best, -peter.
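Concretely, the call might become something like the following (a sketch only; k = 100 is an arbitrary value in the suggested range, and the other arguments are kept from the command quoted above):

```r
library(scde)

# Re-run with a much smaller k (e.g. ~100 instead of ncol(cd)/2, which was ~2000
# here). Other arguments are unchanged from the original command.
knn <- knn.error.models(cd_new_nodup, k = 100, n.cores = 12,
                        min.count.threshold = 1, min.nonfailed = 20,
                        max.model.plots = 10)
```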
Ok, I will try this. The scde help page says that k may need to be increased for thousands of cells, which is why I kept the denominator low.
I can't run that command because the pagoda library apparently isn't installed. I thought it was installed along with the scde package?
When running scde:::papply(1:1e2, function(x) rnorm(1e3), n.cores = 12), it finishes in about 2 seconds, so I can't tell the core usage.
Yes, I meant the scde package. You'd need to increase the rnorm argument so the call takes a sizeable amount of time.
-peter.
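For instance (a sketch only; 1e7 is an arbitrary workload size chosen so each task runs long enough to watch core usage in top/htop, and it assumes scde is installed):

```r
# Give each worker a sizeable task so per-core usage is visible in top/htop.
system.time(
  scde:::papply(1:100, function(x) mean(rnorm(1e7)), n.cores = 12)
)
```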
After increasing the rnorm argument, it seems to run on only 12 cores.
So while scde:::papply(1:1e2, function(x) rnorm(1e3), n.cores = 12) runs on only 12 cores, the knn.error.models call still uses all cores.
Any ideas on why the discrepancy?
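One thing that may be worth checking (an assumption, not something established in this thread): the extra threads could come from a multithreaded BLAS or OpenMP library used indirectly by knn.error.models, rather than from papply itself. Such thread pools are conventionally capped with environment variables set before R starts:

```shell
# Hedged workaround sketch: cap implicit BLAS/OpenMP threading before
# launching R. Which variable matters depends on the BLAS library R is
# linked against (OpenBLAS, MKL, or a generic OpenMP build).
export OPENBLAS_NUM_THREADS=1
export MKL_NUM_THREADS=1
export OMP_NUM_THREADS=1
# ...then start R/Rscript from this shell so n.cores alone controls parallelism.
```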