nolanlab / citrus

Citrus Development Code
GNU General Public License v3.0

Program keeps crashing R! #110

Closed AbbyKimball closed 6 years ago

AbbyKimball commented 7 years ago

Hello! I've tried this a couple of times and R keeps unexpectedly closing. Is this a common problem?

AbbyKimball commented 7 years ago

Here is the code I have; I followed the directions on the wiki:

install.packages("glmnet")
install.packages("pamr")
install.packages("ggplot2")
install.packages("ggplot2")
install.packages("Rclusterpp")
source("http://bioconductor.org/biocLite.R")
biocLite("flowCore")
biocLite("impute")
install.packages("samr")
install.packages("shiny")
install.packages("brew")
install.packages("devtools")
library("devtools")
install_github('nolanlab/citrus')
library("citrus")
citrus.launchUI()
g++-mp-4.8 -v
nano ~/.R/Makevars
install.packages("Matrix")
install.packages(c("Rcpp","RcppEigen","Rclusterpp"),type="source")
library("devtools")
install_github('nolanlab/citrus')
library("citrus")
citrus.launchUI()
library("citrus")
citrus.launchUI()

You see it begins, but seems to get stuck on the clustering part:

[Screenshot, 2017-02-27 at 4:19:54 pm]

And then I get:

[Screenshot, 2017-02-27 at 4:27:35 pm]

SamGG commented 7 years ago

Hi, I would try decreasing the number of events before the clustering step. 40,000 events sounds better for testing. HTH
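To illustrate what reducing the event count amounts to, here is a minimal sketch in base R. It assumes each sample is already available as a numeric matrix with one row per event; the `samples` list and the `downsample_events` helper are hypothetical and are not part of Citrus, which has its own sampling settings.

```r
# Hypothetical helper: keep at most `n_max` randomly chosen events (rows).
downsample_events <- function(events, n_max) {
  if (nrow(events) <= n_max) {
    return(events)  # already small enough, leave untouched
  }
  events[sample.int(nrow(events), n_max), , drop = FALSE]
}

set.seed(1)
# Synthetic stand-ins for two samples (rows = events, columns = markers).
samples <- list(
  a = matrix(rnorm(20000 * 5), ncol = 5),
  b = matrix(rnorm(3000 * 5), ncol = 5)
)

# Cap every sample at 5,000 events before any clustering is attempted.
small <- lapply(samples, downsample_events, n_max = 5000)
sapply(small, nrow)
```

Sampling the same number of events from every file keeps one large file from dominating the pooled data that gets clustered.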

AbbyKimball commented 7 years ago

Thanks for the reply :)

From playing around on Cytobank, it seems that lowering the number of events changes the CITRUS results. Do you know if these differences are significant?

AbbyKimball commented 7 years ago

I've tried it now with 5,000 events per sample and 1,000 events per sample. Still crashes.

rbruggner commented 7 years ago

Hi - sorry you're experiencing crashing. Could you tell me what version of Mac OS X you're using?

SamGG commented 7 years ago

Robert is definitely the guru for Macintosh issues.

Concerning the sampling, here is my opinion. Lowering the number of events too much will lead to a loss of resolution: a real effect might go unseen. I haven't tested this aspect yet.

What I understand of the Citrus algorithm is the following. Let's say we are interested in comparing condition A vs. B, each made of 8 or more biological samples (this is the simplest example). Citrus builds a hierarchical clustering of events sampled from the FCS files of conditions A and B. In the resulting tree, Citrus then considers only nodes containing more than a defined percentage of events. This user-defined threshold is applied to the merged events of conditions A and B. Therefore, if a population is 2% in condition A and 0% in condition B, its percentage in the mix will be 1% (assuming both conditions contribute the same number of events). I would then recommend setting the threshold below 1% for such a population. Let's go with 0.5%. For the nodes of the hierarchical tree that aggregate more than 0.5% of events, Citrus will test whether the percentage of events is statistically different in condition A versus B.

If the threshold were set to 2%, the population of interest would certainly be aggregated with events of a different profile, leading to an inhomogeneous cluster of events. Because of this mixing, the difference in percentage might not be statistically significant. This sounds like a loss of resolution, IMHO.
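The arithmetic in this example can be checked with a few lines of R. This is only a sketch of the reasoning described in this comment, not Citrus code; the event counts `n_a` and `n_b` are assumed values chosen so that both conditions contribute equally.

```r
# Fraction of the population of interest in each condition (from the example).
frac_a <- 0.02  # 2% in condition A
frac_b <- 0.00  # 0% in condition B

# Assumption: both conditions contribute the same number of sampled events.
n_a <- 20000
n_b <- 20000

# Fraction of the population in the pooled (A + B) events.
pooled <- (frac_a * n_a + frac_b * n_b) / (n_a + n_b)

# Compare the pooled fraction with two candidate minimum cluster sizes.
thresholds <- c(0.005, 0.02)
data.frame(
  threshold = thresholds,
  passes_on_its_own = pooled >= thresholds
)
```

With these numbers the pooled fraction is 1%, so the population clears a 0.5% threshold on its own but not a 2% threshold.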

AbbyKimball commented 7 years ago

Robert, I looked over the issue you sent along and added the code you provided for the potential memory leak:

install.packages(c("Rcpp","RcppEigen"),type="source")
library("devtools")
install_github('nolanlab/Rclusterpp')
install_github('nolanlab/citrus')

I am on macOS Sierra Version 10.12.1

Sam, per your recommendation here and on the linked issue, I first lowered the Minimum Cluster Size to 1 (the program still crashed) and then to 0.5 (it still crashed).

I read somewhere that having uneven samples (4 in one group and 5 in another in this case) could affect the algorithm. Is this really true?

I've attached screenshots of the setup.

[Screenshot, 2017-03-01 at 12:50:37 pm]
[Screenshot, 2017-03-01 at 12:50:08 pm]
[Screenshot, 2017-03-01 at 12:49:57 pm]
[Screenshot, 2017-03-01 at 12:50:15 pm]
[Screenshot, 2017-03-01 at 12:48:59 pm]

rbruggner commented 7 years ago

Abby, does Citrus still crash after you've installed Rclusterpp from source, e.g.:

R> library("devtools")
R> install_github('nolanlab/Rclusterpp')
SamGG commented 7 years ago

Dear Abby, I really think that the problem is due to a compilation issue. Parameter setup should not lead to a crash. I think Robert could also explain to you how to run the script (created by the Shiny interface) from the command line. In the meantime, here is a simple script that may help pinpoint the faulty code. It should produce the following graphics.

library(Rclusterpp)
# Matrix size
nc = 20
nr = 10000
# Generate some random data
set.seed(123)
some.data = matrix(rnorm(nc*nr), ncol = nc)
# Define a population with a different mean value
some.data[1:1000, 1:(nc %/% 4)] = some.data[1:1000, 1:(nc %/% 4)] + 1
# Do clustering
hc = Rclusterpp.hclust(some.data)
hc
# Optional process: cut the hierarchical tree in 10 clusters
hc.cut.members = cutree(hc, k = 10)
hc.cut.mean = apply(some.data, 2, function(x) tapply(x, hc.cut.members, mean))
hc.cut.mean = as.matrix(hc.cut.mean)
plot(hclust(dist(hc.cut.mean)))
# Alternative display
image(some.data)
image(hc.cut.mean)

[Image: rplot]
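Since a crash takes the whole R session down, one way to run a test like this safely is in a separate R process and to look only at its exit status. The following is a minimal sketch; `run_in_child` is a hypothetical helper (not part of Citrus or Rclusterpp) built on base R's `system2` and the `Rscript` executable that ships with R.

```r
# Hypothetical helper: run lines of R code in a child Rscript process and
# return the child's exit status (0 means it finished without error).
run_in_child <- function(code) {
  rscript <- file.path(R.home("bin"), "Rscript")
  script <- tempfile(fileext = ".R")
  writeLines(code, script)
  on.exit(unlink(script))
  system2(rscript, shQuote(script), stdout = FALSE, stderr = FALSE)
}

# Control run: base R clustering only, expected to finish cleanly.
status_base <- run_in_child(c(
  "set.seed(123)",
  "x <- matrix(rnorm(20 * 500), ncol = 20)",
  "hc <- hclust(dist(x))"
))

# Same kind of data clustered with Rclusterpp. A non-zero status means the
# child failed (an R error such as a missing package, or a crash).
status_rclusterpp <- run_in_child(c(
  "library(Rclusterpp)",
  "set.seed(123)",
  "x <- matrix(rnorm(20 * 500), ncol = 20)",
  "hc <- Rclusterpp.hclust(x)"
))

c(base = status_base, Rclusterpp = status_rclusterpp)
```

If the base R run returns 0 while the Rclusterpp run does not, that would point at the Rclusterpp installation rather than at the Citrus parameters.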