hms-dbmi / scde

R package for analyzing single-cell RNA-seq data
http://pklab.med.harvard.edu/scde

core dump error #8

Closed lpantano closed 9 years ago

lpantano commented 9 years ago

Hi!

Thanks for this great package! I am trying to run the pagoda vignette, but I hit a problem at this line:

varinfo <- pagoda.varnorm(knn, counts = cd, trim = 3/ncol(cd), max.adj.var = 5, n.cores = 1, plot = TRUE)

the error:

error: Mat::init(): requested size is not compatible with row vector layout

I see the Travis build shows the same error, but at a different step.

Do you know what could be wrong?

my R is 3.2.1

thanks for your time in advance

JEFworks commented 9 years ago

Hi Lorena,

Thanks for trying it out!

I saw that error in the travis compilation but haven't been able to replicate it on my computer. It has something to do with the C code being ported via Rcpp/RcppArmadillo and how that gets compiled I believe. Can you run the sessionInfo() command for me and paste the output?

Thanks for the help!

Best, Jean

lpantano commented 9 years ago

thanks for helping! here it is

> sessionInfo()
R version 3.2.1 (2015-06-18)
Platform: x86_64-unknown-linux-gnu (64-bit)
Running under: Ubuntu 15.04

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8     LC_MONETARY=en_US.UTF-8   
 [6] LC_MESSAGES=en_US.UTF-8    LC_PAPER=en_US.UTF-8       LC_NAME=C                  LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] scde_1.99.0     flexmix_2.3-13  lattice_0.20-31 knitr_1.10.5   

loaded via a namespace (and not attached):
 [1] Rcpp_0.11.6               RColorBrewer_1.1-2        nloptr_1.0.4              futile.logger_1.4.1       futile.options_1.0.0     
 [6] tools_3.2.1               Lmoments_1.1-6            lme4_1.1-8                nlme_3.1-120              mgcv_1.8-6               
[11] Matrix_1.2-1              parallel_3.2.1            SparseM_1.6               RcppArmadillo_0.5.200.1.0 extRemes_2.0-5           
[16] stats4_3.2.1              grid_3.2.1                nnet_7.3-9                Biobase_2.29.1            distillery_1.0-1         
[21] Rook_1.1-1                BiocParallel_1.3.34       limma_3.25.13             minqa_1.2.4               lambda.r_1.1.7           
[26] car_2.0-25                edgeR_3.11.2              pcaMethods_1.59.0         modeltools_0.2-21         MASS_7.3-40              
[31] BiocGenerics_0.15.3       splines_3.2.1             RMTstat_0.3               pbkrtest_0.4-2            quantreg_5.11            
[36] brew_1.0-6                rjson_0.2.15              Cairo_1.5-6           

JEFworks commented 9 years ago

Hi Lorena,

The issue has been fixed. Details can be found in the latest commit message if you're interested https://github.com/hms-dbmi/scde/commit/166bff8fb61401f1c2590702ed0c96b9445f3b87

scde is also now passing travis testing.

Please install this latest version and let me know if you run into any other trouble! Thanks!

Best, Jean

lpantano commented 9 years ago

Hi, thanks so much for this!

sadly I got another core dump:

> x <- gsub("^Hi_(.*)_.*", "\\1", colnames(cd))
> l2cols <- c("coral4", "olivedrab3", "skyblue2", "slateblue3")[as.integer(factor(x, levels = c("NPC", "GW16", "GW21", "GW21+3")))]
> data(knn)
> varinfo <- pagoda.varnorm(knn, counts = cd, trim = 3/ncol(cd), max.adj.var = 5, n.cores = 1, plot = TRUE)

error: sort_index(): detected non-finite values

terminate called after throwing an instance of 'std::logic_error'
  what():  sort_index(): detected non-finite values
Aborted (core dumped)

I reinstalled everything; I even removed the package from the R library and cloned the repository again. It compiles again, but I got a different error this time. It is still in the pagoda.varnorm function.

Let me know if I can do something else to help you debug. I will try another computer as well.

Thanks again!

JEFworks commented 9 years ago

Hi Lorena,

Hum, I've actually never seen that one before.

I assume knn are the error models derived from your counts matrix. Could you check if any of your cells failed the error modeling step using:

# filter out cells that don't show positive correlation with
# the expected expression magnitudes (very poor fits)
valid.cells <- knn$corr.a > 0
table(valid.cells)

Also, it would help to double check that all of the counts in your counts matrix are indeed finite integers, using is.nan, is.infinite, etc.
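A minimal sketch of such a check, using base R only. The matrix `cd_toy` and the helper `check_counts` are hypothetical names introduced here for illustration; with real data, the counts matrix `cd` would be passed instead.

```r
# Toy count matrix standing in for `cd` (hypothetical values, not the pollen data)
cd_toy <- matrix(c(0, 3, 12, 7, 0, 5), nrow = 3,
                 dimnames = list(paste0("gene", 1:3), paste0("cell", 1:2)))

# Hypothetical helper: count common problems in a count matrix
check_counts <- function(m) {
  m <- as.matrix(m)
  c(
    n_nan         = sum(is.nan(m)),                    # NaN entries
    n_na          = sum(is.na(m)),                     # NA entries (includes NaN)
    n_infinite    = sum(is.infinite(m)),               # Inf or -Inf entries
    n_negative    = sum(m < 0, na.rm = TRUE),          # negative counts
    n_non_integer = sum(m != round(m), na.rm = TRUE)   # fractional counts
  )
}

print(check_counts(cd_toy))
```

All five counters being zero suggests the input matrix itself is clean, which points the search toward a later transformation of the counts.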

Thanks!

Best, Jean

lpantano commented 9 years ago

I forgot to mention i am doing the vignette example. So it's not my data:

> library(scde)
Loading required package: flexmix
Loading required package: lattice
>   # read in the expression matrix
>   data(pollen)
>   cd <- pollen
>   # filter data
>   # filter out low-gene cells
>   cd <- cd[, colSums(cd>0)>1.8e3]
>   # remove genes that don't have many reads
>   cd <- cd[rowSums(cd)>10, ]
>   # remove genes that are not seen in a sufficient number of cells
>   cd <- cd[rowSums(cd>0)>5, ]
>   # check the final dimensions of the read count matrix
>   dim(cd)
[1] 11310    64
> x <- gsub("^Hi_(.*)_.*", "\\1", colnames(cd))
> l2cols <- c("coral4", "olivedrab3", "skyblue2", "slateblue3")[as.integer(factor(x, levels = c("NPC", "GW16", "GW21", "GW21+3")))]
> data(knn)
> varinfo <- pagoda.varnorm(knn, counts = cd, trim = 3/ncol(cd), max.adj.var = 5, n.cores = 1, plot = TRUE)

This is where I get the error.

And the check you suggested:

> valid.cells <- knn$corr.a > 0
> table(valid.cells)
valid.cells
TRUE 
  64 

thanks!, will continue digging.

lpantano commented 9 years ago

Maybe it's this:

After the following command in the pagoda.varnorm function, fpm contains non-finite numbers:

fpm <- t((t(log(cd))-models$corr.b)/models$corr.a)

fpm then gets passed to winsorize.matrix, which breaks with the error.

Could it be that?

JEFworks commented 9 years ago

Ah, that's definitely it. Thanks so much for the catch. I believe the log is missing a + 1 else all the 0s in cd will become -Inf
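A small sketch of the effect, using toy values. `cd_toy` and `models_toy` are hypothetical stand-ins for `cd` and `models`, and the `+ 1` pseudocount shown is only the change proposed above; the thread does not confirm that this is the exact form of the eventual fix in scde.

```r
# Toy stand-ins (hypothetical values): 3 genes x 2 cells, plus per-cell
# correction terms shaped like models$corr.a and models$corr.b
cd_toy <- matrix(c(0, 3, 12, 7, 0, 5), nrow = 3)
models_toy <- data.frame(corr.a = c(1.0, 0.9), corr.b = c(0.1, -0.2))

# Expression as quoted in the thread: log(0) is -Inf, so zero counts
# turn into non-finite values
fpm_raw <- t((t(log(cd_toy)) - models_toy$corr.b) / models_toy$corr.a)

# Same expression with a pseudocount of 1 inside the log
fpm_adj <- t((t(log(cd_toy + 1)) - models_toy$corr.b) / models_toy$corr.a)

# Number of non-finite entries before and after the pseudocount
print(sum(!is.finite(fpm_raw)))
print(sum(!is.finite(fpm_adj)))
```

In this toy matrix the two zero counts are the only entries that become non-finite, and both disappear once the pseudocount is added.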

Let me just check the commit history and see when this may have been changed and how my previous runs didn't encounter this error. Thanks again for your help!

JEFworks commented 9 years ago

Hum, I'm using an older version, RcppArmadillo_0.4.650.1.1, whose sort_index seems to handle infinite values by treating them as an arbitrarily large number. It also returns the sort_index result as a row instead of a single column, so I think this bug fix actually now introduces the "error: Mat::init(): requested size is not compatible with row vector layout" for my older version of RcppArmadillo.

Let me see what changed from RcppArmadillo_0.4.650.1.1 to RcppArmadillo_0.5.200.1.0

lpantano commented 9 years ago

oh! ok. no problem. I am seeing that this year there are a bunch of changes in R and BioC that are affecting a lot of tools. Hopefully everything will be ok in the next version when everybody can install the stable version of everything. thanks again for your time

JEFworks commented 9 years ago

Hi Lorena,

I've confirmed this is an issue with RcppArmadillo and have opened an issue with them. I'm not sure how long they will take to resolve it, so for now I would recommend using an older version of RcppArmadillo (pre v0.5) with the released version of scde, since everything was tested there with RcppArmadillo_0.4.650.1.1. I don't know if there's a way for me to require an older version of a dependency in the package DESCRIPTION file (I've only heard of requiring a newer version).
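One possible way to check which side of v0.5 an installation is on, sketched with base R version comparison. The helper `is_pre_0_5` is a hypothetical name, and the commented-out install line assumes the separate `remotes` package and network access; neither is something the thread itself prescribes.

```r
# Hypothetical helper: TRUE when a version string is older than the 0.5 series
is_pre_0_5 <- function(v) package_version(v) < package_version("0.5")

print(is_pre_0_5("0.4.650.1.1"))  # version the package was tested with
print(is_pre_0_5("0.5.200.1.0"))  # version from the sessionInfo() output

# Check the locally installed copy, if there is one
if (requireNamespace("RcppArmadillo", quietly = TRUE)) {
  print(is_pre_0_5(as.character(packageVersion("RcppArmadillo"))))
}

# One way to install the older release from the CRAN archive
# (assumption: the 'remotes' package is available; left commented out):
# remotes::install_version("RcppArmadillo", version = "0.4.650.1.1")
```

After downgrading RcppArmadillo, scde would need to be reinstalled so that its compiled code is rebuilt against the older headers.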

Hopefully that'll fix things! Thanks again for your help on this!

Best, Jean

lpantano commented 9 years ago

Thanks! I will do that until RcppArmadillo gets more stable!