Closed fbrundu closed 6 years ago
Hello Francesco,
Could you directly compare the raw count matrix with the imputed count matrix? Are the counts identical for every entry? If so, please report it here and I will investigate why the method is not working.
Thanks, Vivian
Hi Vivian,
I'm a colleague of Francesco. We tried imputing with k=1 and there was no difference from the input array. Should values still be imputed if k=1?
Hello dmoaks,
Yes, if you set Kcluster = 1, scImpute still tries to impute the gene expression. I tested the package on multiple datasets and was able to get imputed results with Kcluster = 1.
If you have verified that you are using the latest release and all the imputed values are exactly the same as the raw expression, can you send me a smaller test dataset to diagnose the problem?
Thanks, Vivian
Hi Vivian,
I've tried with several data sets and am still getting no change when kcluster=1. I have the most current version of scImpute.
Thanks, Dan
Hello Dan,
I just updated the package and it now should work on your data. Thanks very much for your feedback and please let me know if you have further questions.
Hi Vivian,
Thanks for the quick responses. I updated scImpute and tried again with k=1 with still no changes from pre-imputation. Here are my session details:
scimpute(count_path = "preimpute_1Krandom.txt", # full path to raw count matrix
         infile = "txt", # format of input file
         outfile = "txt", # format of output file
         out_dir = "./", # full path to output directory
         labeled = FALSE, # cell type labels not available
         drop_thre = 0.9, # threshold set on dropout probability
         Kcluster = 1, # 1 cell subpopulation
         ncores = 1) # number of cores used in parallel computation
[1] "reading in raw count matrix ..."
[1] "number of genes in raw count matrix 1000"
[1] "number of cells in raw count matrix 1000"
[1] "estimating dropout probability for type 1 ..."
[1] "imputing dropout values for type 1 ..."
[1] "writing imputed count matrix ..."
integer(0)
sessionInfo()
R version 3.4.3 (2017-11-30)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)
Matrix products: default
locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252
attached base packages:
[1] parallel stats graphics grDevices utils datasets methods
[8] base
other attached packages:
[1] scImpute_0.0.5 doParallel_1.0.11 iterators_1.0.9
[4] foreach_1.4.4 penalized_0.9-50 survival_2.41-3
[7] kernlab_0.9-25
loaded via a namespace (and not attached):
[1] Rcpp_0.12.15 knitr_1.18 devtools_1.13.4 splines_3.4.3
[5] munsell_0.4.3 colorspace_1.3-2 lattice_0.20-35 R6_2.2.2
[9] rlang_0.1.6 httr_1.3.1 plyr_1.8.4 tools_3.4.3
[13] grid_3.4.3 gtable_0.2.0 git2r_0.20.0 withr_2.1.1
[17] lazyeval_0.2.1 digest_0.6.13 tibble_1.4.1 Matrix_1.2-12
[21] ggplot2_2.2.1 codetools_0.2-15 curl_3.1 memoise_1.1.0
[25] compiler_3.4.3 pillar_1.0.1 scales_0.5.0
Hello Dan,
That's surprising. The printed messages look correct, so I would suggest that you first make sure the newest version of the package was installed successfully. Also, please check that you are reading the correct input and output files.
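A minimal sketch of those two checks, in case it helps. The helper names `installed_version()` and `output_age_minutes()` are made up for illustration and are not part of scImpute; the output file name `scimpute_count.txt` is taken from the test code in this thread. Note that `packageVersion()` reports the version installed on disk, so after updating it is probably safest to restart R before re-running.

```r
# Hypothetical helpers (not part of scImpute), using base R only.

# Version of a package as installed on disk, or NA if it is not installed.
installed_version <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    return(NA_character_)
  }
  as.character(utils::packageVersion(pkg))
}

# Minutes since a file was last modified, or NA if the file does not exist.
# A large value suggests the file is left over from an earlier run.
output_age_minutes <- function(path) {
  if (!file.exists(path)) {
    return(NA_real_)
  }
  as.numeric(difftime(Sys.time(), file.mtime(path), units = "mins"))
}

message("scImpute version: ", installed_version("scImpute"))
message("output age (min): ", output_age_minutes("./scimpute_count.txt"))
```

If the reported version is older than expected, or the output file was written long before the latest run, the comparison is likely being made against stale results.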
I'm attaching the code I used for testing here:
`
rm(list = ls())
library(scImpute)

count_path = "./preimpute_1Krandom.txt"
K = 1
drop_thre = 0.5
ncores = 30
out_dir = "./"
dir.create(out_dir, showWarnings = FALSE) # no warning if the directory already exists

scimpute(count_path, infile = "txt", outfile = "txt", out_dir = out_dir,
         labeled = FALSE, drop_thre = drop_thre, Kcluster = K, ncores = ncores)

# read the raw and imputed matrices back in and compare them
count = read.table(count_path, row.names = 1, header = TRUE)
imp_count = read.table("./scimpute_count.txt", header = TRUE, row.names = 1)
sum(abs(count - imp_count)) # 0 would mean that nothing was imputed
`
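For a slightly more detailed comparison than a single sum, something like the following sketch could be used. `compare_counts()` is a hypothetical helper, not a scImpute function, and the small matrices are toy data standing in for the real raw and imputed files.

```r
# Hypothetical helper (not part of scImpute): summarise how much an
# imputed matrix differs from the raw one.
compare_counts <- function(raw, imputed) {
  raw <- as.matrix(raw)
  imputed <- as.matrix(imputed)
  stopifnot(identical(dim(raw), dim(imputed)))
  list(
    n_changed      = sum(raw != imputed),            # entries that differ at all
    n_zero_filled  = sum(raw == 0 & imputed != 0),   # zeros replaced by a value
    total_abs_diff = sum(abs(raw - imputed))         # same quantity as the sum above
  )
}

# Toy data: 2 genes x 3 cells, with one zero entry "imputed" by hand.
raw <- matrix(c(0, 5, 0, 3, 2, 0), nrow = 2,
              dimnames = list(c("g1", "g2"), c("c1", "c2", "c3")))
imputed <- raw
imputed["g1", "c2"] <- 1.5

res <- compare_counts(raw, imputed)
str(res)
```

With real data, `raw` and `imputed` would be the two data frames returned by `read.table()`. If `n_changed` is 0, the output file is identical to the input.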
This scatterplot (log scale) shows that imputation is working. K1.pdf
I'm also attaching my R session info. I notice that we are using different versions of R, but I don't think that is the cause.
R version 3.4.1 (2017-06-30)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 14.04.5 LTS
Matrix products: default
BLAS: /usr/lib/openblas-base/libblas.so.3
LAPACK: /usr/lib/lapack/liblapack.so.3.0
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] parallel stats graphics grDevices utils datasets methods
[8] base
other attached packages:
[1] scImpute_0.0.5 doParallel_1.0.11 iterators_1.0.8 foreach_1.4.3
[5] penalized_0.9-50 survival_2.41-3 kernlab_0.9-25
loaded via a namespace (and not attached):
[1] compiler_3.4.1 Matrix_1.2-11 Rcpp_0.12.13 codetools_0.2-15
[5] splines_3.4.1 grid_3.4.1 lattice_0.20-35
I can confirm that imputation with k = 1 is now working on my side, so we can close this.
Dear Vivian, I am running scImpute on the 293T dataset, which should be the same one used in the scImpute paper. Using k = 1 (all cells are treated as a single cell type), scImpute does not impute any values. You can see in the attached image that the percentiles are the same for the raw and imputed datasets. Is this the correct behavior?
Thanks, Francesco