alexisvdb / singleCellHaystack

Finding surprising needles (=genes) in haystacks (=single cell transcriptome data).
https://alexisvdb.github.io/singleCellHaystack/

problem when running haystack #23

Closed RuiGao-1223 closed 1 year ago

RuiGao-1223 commented 1 year ago

Hello! Thank you for providing such a convenient analysis tool! However, I ran into a problem when using singleCellHaystack. When I run res.pc20 <- haystack(x = dat.pca, expression = dat.expression) from the tutorial https://alexisvdb.github.io/singleCellHaystack/articles/examples/a02_example_scRNAseq.html with the provided example data, I get an error saying that the parameter is not valid. I get the same error with my own data.

[Screenshot of the error: Feishu 20221214-164741]

My running environment is R 4.2.1, and all the dependency packages have been installed. Looking forward to your reply. Thanks a lot!

ddiez commented 1 year ago

Hi @RuiGao-1223 thanks for reaching out.

I just ran the example data without problem:

> res <- haystack(dat.tsne, dat.expression)
### calling haystack_continuous_highD()...
### Using package sparseMatrixStats to speed up statistics in sparse matrices.
### Calculating row-wise mean and SD... 
### Filtered 0 genes with zero variance...
### Using 100 randomizations...
### Using 100 genes to randomize...
### scaling input data...
### deciding grid points...
### calculating Kullback-Leibler divergences...
  |================================================================================================================================================================================================================================| 100%
### performing randomizations...
  |================================================================================================================================================================================================================================| 100%
### estimating p-values...
### picking model for mean D_KL...
### using natural splines
### best RMSD  : 0.088
### best df    : 3
### picking model for stdev D_KL...
### using natural splines
### best RMSD  : 0.019
### best df    : 4
### returning result...
Warning message:
In haystack_continuous_highD(x, expression = expression, weights.advanced.Q = weights.advanced.Q,  :
  The value of 'grid.points' appears to be very high (> No. of cells / 10). You can set the number of grid points using the 'grid.points' parameter.

Did you install singleCellHaystack from CRAN or github? What version do you have? If you want to run the recently released continuous version make sure to have the latest version installed from github with remotes::install_github("alexisvdb/singleCellHaystack").
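To check which version is actually installed, something like the following could be run. This is only a minimal sketch using base R; `installed_version` is a hypothetical helper name, not a function from singleCellHaystack.

```r
# Hypothetical helper (not part of singleCellHaystack): return the installed
# version of a package as a string, or NA if the package cannot be found.
installed_version <- function(pkg) {
  # system.file() returns "" when the package is not in any library path.
  if (!nzchar(system.file(package = pkg))) {
    return(NA_character_)
  }
  as.character(utils::packageVersion(pkg))
}

print(installed_version("singleCellHaystack"))
```

If this prints NA, the package is not visible from any of the current library paths.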

ddiez commented 1 year ago

Sorry just noticed you used a different example than the toy dataset I used above. With the data from the vignette you mention I still have no issues:

> res <- haystack(dat.pca, dat.expression)
### calling haystack_continuous_highD()...
### Using package sparseMatrixStats to speed up statistics in sparse matrices.
### Calculating row-wise mean and SD... 
### Filtered 0 genes with zero variance...
### Using 100 randomizations...
### Using 100 genes to randomize...
### scaling input data...
### deciding grid points...
### calculating Kullback-Leibler divergences...
  |================================================================================================================================================================================================================================| 100%
### performing randomizations...
  |================================================================================================================================================================================================================================| 100%
### estimating p-values...
### picking model for mean D_KL...
### using natural splines
### best RMSD  : 0.037
### best df    : 3
### picking model for stdev D_KL...
### using natural splines
### best RMSD  : 0.023
### best df    : 7
### returning result...

Also, I noticed from the screenshot that the error points to haystack_highD. That function is the binary version, not the continuous version that we use in the vignette, suggesting that you are using an outdated version of the package.
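A quick way to tell which variant is installed is to look at the arguments of the matrix method. The sketch below assumes only base R; `has_argument` is a hypothetical helper, and the lookup of haystack.matrix in the package namespace is guarded so the snippet also runs when the package is missing.

```r
# Hypothetical helper: does a function have a formal argument with this name?
has_argument <- function(fun, arg) {
  arg %in% names(formals(fun))
}

if (nzchar(system.file(package = "singleCellHaystack"))) {
  # Same object as singleCellHaystack:::haystack.matrix, or NULL if absent.
  method <- get0("haystack.matrix",
                 envir = asNamespace("singleCellHaystack"),
                 inherits = FALSE)
  if (is.function(method)) {
    # TRUE suggests the continuous version; FALSE suggests the older
    # binary version, which takes `detection` instead of `expression`.
    print(has_argument(method, "expression"))
  }
}
```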

RuiGao-1223 commented 1 year ago

Thanks a lot for your reply! As you suggested, I reinstalled the latest version of the package, but I still get the same error. Sorry, I am new to R, so this error confuses me a lot.

[Screenshot: 20221214173615]

ddiez commented 1 year ago

Mmm I see. Did you have singleCellHaystack installed before?

Could you share with us the output in the console that you get when you install the package? You can copy/paste the output into the comment box on GitHub. If you can, put it in code format (using the <> icon in the format bar). If you don't know how to do that, sharing a screenshot is also OK.

RuiGao-1223 commented 1 year ago

Yeah. I removed the previous version with remove.packages("singleCellHaystack"), and then installed the latest version using remotes::install_github("alexisvdb/singleCellHaystack").

RuiGao-1223 commented 1 year ago

This is the output when I install the package:

> remotes::install_github("alexisvdb/singleCellHaystack")
Downloading GitHub repo alexisvdb/singleCellHaystack@HEAD
✔  checking for file ‘/tmp/Rtmp4Mug7b/remotes296965e27c1d1/alexisvdb-singleCellHaystack-b4f2aed/DESCRIPTION’ ... OK
─  preparing ‘singleCellHaystack’:
✔  checking DESCRIPTION meta-information
─  checking for LF line-endings in source and make files and shell scripts
─  checking for empty or unneeded directories
─  building ‘singleCellHaystack_0.99.0.tar.gz’

Installing package into ‘/home/rui/R/x86_64-pc-linux-gnu-library/4.2’
(as ‘lib’ is unspecified)
* installing *source* package ‘singleCellHaystack’ ...
** using staged installation
** R
** data
*** moving datasets to lazyload DB
** inst
** byte-compile and prepare package for lazy loading
** help
*** installing help indices
*** copying figures
** building package indices
** installing vignettes
** testing if installed package can be loaded from temporary location
** testing if installed package can be loaded from final location
** testing if installed package keeps a record of temporary installation path
* DONE (singleCellHaystack)

ddiez commented 1 year ago

Can you please post the output of sessionInfo() after running the failing code?

RuiGao-1223 commented 1 year ago

> sessionInfo()
R version 4.2.1 (2022-06-23)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04.5 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0

locale:
 [1] LC_CTYPE=zh_CN.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8       
 [4] LC_COLLATE=zh_CN.UTF-8     LC_MONETARY=en_US.UTF-8    LC_MESSAGES=zh_CN.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                  LC_ADDRESS=C              
[10] LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] remotes_2.4.2             singleCellHaystack_0.99.0

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.9        pillar_1.8.1      compiler_4.2.1    plyr_1.8.8        prettyunits_1.1.1
 [6] tools_4.2.1       pkgbuild_1.3.1    lifecycle_1.0.2   tibble_3.1.8      gtable_0.3.1     
[11] lattice_0.20-45   pkgconfig_2.0.3   rlang_1.0.6       Matrix_1.5-1      DBI_1.1.3        
[16] cli_3.4.1         rstudioapi_0.14   curl_4.3.2        withr_2.5.0       dplyr_1.0.10     
[21] stringr_1.4.1     generics_0.1.3    vctrs_0.5.1       rprojroot_2.0.3   grid_4.2.1       
[26] tidyselect_1.1.2  glue_1.6.2        R6_2.5.1          processx_3.7.0    fansi_1.0.3      
[31] ggplot2_3.3.6     reshape2_1.4.4    purrr_0.3.4       callr_3.7.2       magrittr_2.0.3   
[36] ps_1.7.1          scales_1.2.1      splines_4.2.1     assertthat_0.2.1  colorspace_2.0-3 
[41] utf8_1.2.2        stringi_1.7.8     munsell_0.5.0     crayon_1.5.2

ddiez commented 1 year ago

I am not sure what's going on but to make sure you are using the latest code I updated the version of the package to 0.99.1. Can you please reinstall from github, try again the example code and copy/paste the error and the output from sessionInfo()?

RuiGao-1223 commented 1 year ago

After reinstalling the package, this is the output of sessionInfo():

R version 4.2.1 (2022-06-23)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04.5 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0

locale:
 [1] LC_CTYPE=zh_CN.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8       
 [4] LC_COLLATE=zh_CN.UTF-8     LC_MONETARY=en_US.UTF-8    LC_MESSAGES=zh_CN.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                  LC_ADDRESS=C              
[10] LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] remotes_2.4.2             singleCellHaystack_0.99.1

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.9        pillar_1.8.1      compiler_4.2.1    plyr_1.8.8        prettyunits_1.1.1
 [6] tools_4.2.1       pkgbuild_1.3.1    lifecycle_1.0.2   tibble_3.1.8      gtable_0.3.1     
[11] lattice_0.20-45   pkgconfig_2.0.3   rlang_1.0.6       Matrix_1.5-1      DBI_1.1.3        
[16] cli_3.4.1         rstudioapi_0.14   curl_4.3.2        withr_2.5.0       dplyr_1.0.10     
[21] stringr_1.4.1     generics_0.1.3    vctrs_0.5.1       rprojroot_2.0.3   grid_4.2.1       
[26] tidyselect_1.1.2  glue_1.6.2        R6_2.5.1          processx_3.7.0    fansi_1.0.3      
[31] ggplot2_3.3.6     reshape2_1.4.4    purrr_0.3.4       callr_3.7.2       magrittr_2.0.3   
[36] ps_1.7.1          scales_1.2.1      splines_4.2.1     assertthat_0.2.1  colorspace_2.0-3 
[41] utf8_1.2.2        stringi_1.7.8     munsell_0.5.0     crayon_1.5.2  

And sadly, I received the same error as before when using the example code.

> res.pc20 <- haystack(x = dat.tsne, expression = dat.expression)
Error in haystack_highD(x, detection = detection, use.advanced.sampling = use.advanced.sampling,  : 
  unused argument (expression = c(2.08193929719064, 0, 1.50629793178557, 0, 1.50629793178557, 0, 0, 0, 0, 0, 0, 0, 1.50629793178557, 0, 1.50629793178557, 0, 0, 0, 0, 3.5862938376385, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2.4449532475512, 0, 1.50629793178557, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1.50629793178557, 0, 1.50629793178557, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2.92047073514559, 0, 1.50629793178557, 0, 1.50629793178557, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
1.50629793178557, 0, 1.50629793178557, 0, 0, 0, 0, 0, 0, 0, 0, 1.50629793178557, 0, 0, 0, 0, 0, 0, 1.50629793178557, 0, 2.08193929719064, 2.08193929719064, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1.50629793178557, 1.50629793178557, 0, 0, 1.50629793178557, 0, 2.08193929719064, 1.50629793178557, 0, 0, 0, 0, 0, 0, 0, 0, 1.50629793178557, 0, 0, 0, 0, 0, 0, 2.4449532475512, 0, 0, 0, 0, 2.71071425203283, 0, 0, 0, 1.50629793178557, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1.50629793178557, 0, 3.586

ddiez commented 1 year ago

OK, can you post the output of singleCellHaystack:::haystack.matrix? Can you also post the output of ls()? I want to make sure what's in your environment.

ddiez commented 1 year ago

Also, could you run the following code and copy/paste the output?

pkgs <- installed.packages()
pkgs[pkgs[, "Package"] == "singleCellHaystack", , drop=FALSE]

RuiGao-1223 commented 1 year ago

> pkgs <- installed.packages()
> pkgs[pkgs[, "Package"] == "singleCellHaystack", , drop=FALSE]
                   Package              LibPath                                      
singleCellHaystack "singleCellHaystack" "/home/rui/R/x86_64-pc-linux-gnu-library/4.2"
                   Version  Priority Depends Imports                                      
singleCellHaystack "0.99.1" NA       NA      "methods, Matrix, splines, ggplot2, reshape2"
                   LinkingTo
singleCellHaystack NA       
                   Suggests                                                                                                                                               
singleCellHaystack "knitr, rmarkdown, testthat, SummarizedExperiment,\nSingleCellExperiment, SeuratObject, cowplot, wrswoR,\nsparseMatrixStats, ComplexHeatmap, patchwork"
                   Enhances License              License_is_FOSS License_restricts_use
singleCellHaystack NA       "MIT + file LICENSE" NA              NA                   
                   OS_type MD5sum NeedsCompilation Built  
singleCellHaystack NA      NA     "no"             "4.2.1"
> singleCellHaystack:::haystack.matrix
function (x, dim1 = 1, dim2 = 2, detection, method = "highD", 
    use.advanced.sampling = NULL, dir.randomization = NULL, scale = TRUE, 
    grid.points = 100, grid.method = "centroid", ...) 
{
    method <- match.arg(method, c("highD", "2D"))
    switch(method, highD = {
        haystack_highD(x, detection = detection, use.advanced.sampling = use.advanced.sampling, 
            dir.randomization = dir.randomization, scale = scale, 
            grid.points = grid.points, grid.method = grid.method, 
            ...)
    }, `2D` = {
        haystack_2D(x[, dim1], x[, dim2], detection = detection, 
            use.advanced.sampling = use.advanced.sampling, dir.randomization = dir.randomization, 
            ...)
    })
}
<bytecode: 0x557aa64c56a0>
<environment: namespace:singleCellHaystack>
> ls()
[1] "dat.expression" "dat.pca"        "dat.tsne"       "pkgs"

ddiez commented 1 year ago

This is very puzzling. Even though you have installed the latest version somehow your matrix method is wrong (old). This is what I get with my computer:

> pkgs <- installed.packages()
> pkgs[pkgs[, "Package"] == "singleCellHaystack", , drop=FALSE]
                   Package              LibPath                                                          Version  Priority
singleCellHaystack "singleCellHaystack" "/Library/Frameworks/R.framework/Versions/4.2/Resources/library" "0.99.1" NA      
                   Depends Imports                                       LinkingTo
singleCellHaystack NA      "methods, Matrix, splines, ggplot2, reshape2" NA       
                   Suggests                                                                                                                                               
singleCellHaystack "knitr, rmarkdown, testthat, SummarizedExperiment,\nSingleCellExperiment, SeuratObject, cowplot, wrswoR,\nsparseMatrixStats, ComplexHeatmap, patchwork"
                   Enhances License              License_is_FOSS License_restricts_use OS_type MD5sum NeedsCompilation Built  
singleCellHaystack NA       "MIT + file LICENSE" NA              NA                    NA      NA     "no"             "4.2.2"

> singleCellHaystack:::haystack.matrix
function (x, expression, weights.advanced.Q = NULL, dir.randomization = NULL, 
    scale = TRUE, grid.points = 100, grid.method = "centroid", 
    ...) 
{
    haystack_continuous_highD(x, expression = expression, weights.advanced.Q = weights.advanced.Q, 
        dir.randomization = dir.randomization, scale = scale, 
        grid.points = grid.points, grid.method = grid.method, 
        ...)
}
<bytecode: 0x7fc9b751ded0>
<environment: namespace:singleCellHaystack>

As you can see singleCellHaystack:::haystack.matrix is very different to yours. Can you please show me the output of .libPaths()?

RuiGao-1223 commented 1 year ago

It's really confusing.

> .libPaths()
[1] "/home/rui/R/x86_64-pc-linux-gnu-library/4.2"
[2] "/usr/local/lib/R/site-library"              
[3] "/usr/lib/R/site-library"                    
[4] "/usr/lib/R/library"

ddiez commented 1 year ago

Thanks!

I see... you have so many library locations! You may want to try to fix this as it may cause you problems. For solving your problems with haystack I would try the following:

  1. Uninstall singleCellHaystack.
  2. Check that in each of the library locations above, there is no singleCellHaystack folder. If there is, delete it.
  3. Install singleCellHaystack from GitHub.
  4. Cross fingers and test again.

RuiGao-1223 commented 1 year ago

Thank you for your suggestions. The package is installed in the path /home/rui/R/x86_64-pc-linux-gnu-library/4.2. And I have checked that there is nothing related to the singleCellHaystack package in the other library paths.
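For reference, this kind of check can also be done from within R. A minimal sketch, assuming that an installed package shows up as a folder containing a DESCRIPTION file inside a library path; `find_package_copies` is a hypothetical helper name.

```r
# Hypothetical helper: return every library path that holds a copy of `pkg`.
# By default it checks all locations reported by .libPaths().
find_package_copies <- function(pkg, libs = .libPaths()) {
  has_copy <- file.exists(file.path(libs, pkg, "DESCRIPTION"))
  libs[has_copy]
}

# More than one path in the result means several copies are installed.
print(find_package_copies("singleCellHaystack"))
```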

ddiez commented 1 year ago

And you still get the error message?

RuiGao-1223 commented 1 year ago

Yeah, that's right. Also the same error.

ddiez commented 1 year ago

Mmm. Well, I am out of ideas now. Do you have anything in your ~/.Rprofile file? This file can contain code that R runs automatically at startup. Not sure if it could make a difference, but it is worth checking.

ddiez commented 1 year ago

Related to my previous comment, can you do the following:

  1. Open a terminal.
  2. Run R --vanilla to open R.
  3. Run library(singleCellHaystack)
  4. Run res <- haystack(dat.tsne, dat.expression).

Do you see the same error?
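R --vanilla skips the site and user startup files and does not restore a saved workspace or history. To see which of those files exist in the first place, a sketch like the following could help. It only looks in the current working directory and the home directory, which is an assumption about where such files usually live; `startup_files` is a hypothetical helper.

```r
# Hypothetical helper: list startup-related files found in the given folders.
# Defaults (working directory and home directory) are an assumption.
startup_files <- function(dirs = c(getwd(), path.expand("~"))) {
  dirs <- unique(dirs)
  names_to_check <- c(".Rprofile", ".RData", ".Rhistory")
  # Build every dir/file combination, then keep the ones that exist.
  candidates <- file.path(rep(dirs, each = length(names_to_check)),
                          names_to_check)
  candidates[file.exists(candidates)]
}

print(startup_files())
```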

RuiGao-1223 commented 1 year ago

I deleted the file named .Rhistory and now the program works! It seems that the old haystack.matrix was being loaded from this file. Thanks a lot for your help!!!
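For anyone hitting something similar: one generic way to see whether a name is defined in more than one attached environment is sketched below. It uses only base R, `where_defined` is a hypothetical helper, and it only inspects the search path, so it would not by itself explain every case of a stale function.

```r
# Hypothetical helper: return the entries of search() that define `name`.
where_defined <- function(name) {
  envs <- search()
  found <- vapply(
    envs,
    function(e) exists(name, envir = as.environment(e), inherits = FALSE),
    logical(1)
  )
  envs[found]
}

# If ".GlobalEnv" appears next to a package entry, a workspace object is
# masking the package function.
print(where_defined("haystack"))
```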

ddiez commented 1 year ago

Oh, great! Amazing... who would have thought!