Closed JoannaTan closed 5 years ago
Hi,
Sorry for the late reply. That is kinda strange. So I'm not able to reproduce your plot but I'm also not getting the target tolerance warning which may be a factor. Can you try reinstalling the latest version of NNLM and then SWNE and see if that helps? Let me know if you're still getting that same plot afterwards.
devtools::install_github("linxihui/NNLM")
devtools::install_github("yanwu2014/swne")
Hi @yanwu2014 ,
Thank you for your reply.
I re-installed the latest version of swne and NNLM and I am still getting the following swne_pbmc_new_20190604.pdf
I am still getting the same warning messages.
Or should I install an older version of NNLM?
Okay I increased the default max.iter when running the NMF. Can you try reinstalling swne and seeing if that helps? Also can you run sessionInfo()
and paste the output? I wonder if it has something to do with the ICA initialization
Hi @yanwu2014,
I have re-installed swne and re-run the codes but I am still not getting the results and I am still getting the same warnings. Please find attached my plot 20190606_swne_plot.pdf
please find below my sessionInfo()
sessionInfo()
R version 3.5.2 (2018-12-20)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.5 LTS
Matrix products: default
BLAS: /usr/lib/openblas-base/libblas.so.3
LAPACK: /usr/lib/libopenblasp-r0.2.18.so
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] parallel stats4 stats graphics grDevices utils datasets methods base
other attached packages:
[1] Seurat_3.0.1 swne_0.5.4 SingleCellExperiment_1.4.1 SummarizedExperiment_1.12.0 DelayedArray_0.8.0
[6] BiocParallel_1.14.2 matrixStats_0.54.0 Biobase_2.40.0 GenomicRanges_1.34.0 GenomeInfoDb_1.18.2
[11] IRanges_2.16.0 S4Vectors_0.20.1 BiocGenerics_0.28.0 edgeR_3.24.3 limma_3.36.5
loaded via a namespace (and not attached):
[1] Rtsne_0.15 colorspace_1.4-0 ggridges_0.5.1 rprojroot_1.3-2 XVector_0.22.0 fs_1.2.6 proxy_0.4-22
[8] rstudioapi_0.9.0 liger_1.1 listenv_0.7.0 npsurv_0.4-0 remotes_2.0.2 ggrepel_0.8.1 codetools_0.2-16
[15] splines_3.5.2 R.methodsS3_1.7.1 lsei_1.2-0 pkgload_1.0.2 jsonlite_1.6 umap_0.2.2.0 ica_1.0-2
[22] cluster_2.0.7-1 png_0.1-7 R.oo_1.22.0 sctransform_0.2.0 compiler_3.5.2 httr_1.4.0 backports_1.1.3
[29] assertthat_0.2.1 Matrix_1.2-15 lazyeval_0.2.1 cli_1.0.1 prettyunits_1.0.2 htmltools_0.3.6 tools_3.5.2
[36] bindrcpp_0.2.2 rsvd_1.0.0 igraph_1.2.2 gtable_0.2.0 glue_1.3.0 GenomeInfoDbData_1.2.0 RANN_2.6.1
[43] reshape2_1.4.3 dplyr_0.7.8 Rcpp_1.0.1 gdata_2.18.0 ape_5.2 nlme_3.1-137 gbRd_0.4-11
[50] lmtest_0.9-36 stringr_1.3.1 ps_1.3.0 globals_0.12.4 irlba_2.3.2 gtools_3.8.1 devtools_2.0.2
[57] NNLM_0.4.2 future_1.13.0 MASS_7.3-51.1 zlibbioc_1.28.0 zoo_1.8-4 scales_1.0.0 RColorBrewer_1.1-2
[64] curl_3.3 memoise_1.1.0 reticulate_1.10 pbapply_1.3-4 gridExtra_2.3 ggplot2_3.1.0 reshape_0.8.8
[71] stringi_1.2.4 desc_1.2.0 caTools_1.17.1.1 pkgbuild_1.0.2 bibtex_0.4.2 Rdpack_0.10-1 SDMTools_1.1-221
[78] rlang_0.3.1 pkgconfig_2.0.2 bitops_1.0-6 lattice_0.20-38 ROCR_1.0-7 purrr_0.3.0 bindr_0.1.1
[85] labeling_0.3 htmlwidgets_1.3 processx_3.2.1 cowplot_0.9.4 tidyselect_0.2.5 plyr_1.8.4 magrittr_1.5
[92] R6_2.3.0 snow_0.4-3 gplots_3.0.1.1 mgcv_1.8-26 withr_2.1.2 pillar_1.3.1 fitdistrplus_1.0-14
[99] survival_2.43-3 RCurl_1.95-4.12 tibble_2.0.1 future.apply_1.2.0 tsne_0.1-3 crayon_1.3.4 KernSmooth_2.23-15
[106] plotly_4.8.0 usethis_1.4.0 locfit_1.5-9.1 grid_3.5.2 data.table_1.12.0 FNN_1.1.3 callr_3.1.1
[113] metap_1.0 digest_0.6.18 tidyr_0.8.2 R.utils_2.7.0 usedist_0.1.0 munsell_0.5.0 viridisLite_0.3.0
[120] sessioninfo_1.1.1
The package versions all look the same as what I have so that's not the problem. Would it be possible to see your code (I know it should be the same as the example)?
Also, can you check if the pagoda2 example works for you?
Btw thanks for your patience! I've never seen this before so it's taking me some time to figure it out.
Hi,
No worries. I would really like to use swne for my work.
Please find below the codes
obj <- readRDS("pbmc3k_final.RObj")
var.genes <- VariableFeatures(obj)
length(var.genes)
cell.clusters <- Idents(obj)
levels(cell.clusters)
genes.embed <- c("MS4A1", "GNLY", "CD3E", "CD14","FCER1A", "FCGR3A", "LYZ", "PPBP", "CD8A")
swne.embedding <- RunSWNE(obj, k = 16, genes.embed = genes.embed)
PlotSWNE(swne.embedding, alpha.plot = 0.4, sample.groups = cell.clusters,do.label = T, label.size = 3.5, pt.size = 1, show.legend = F, seed = 42)
I tried to follow the pagoda2 example and it did not work too. swne_pagoda2_20190606.pdf I am still getting this warning message
Warning message:
In system.time(out <- .Call("NNLM_nnmf", A, as.integer(k), init.mask$Wi, :
Target tolerance not reached. Try a larger max.iter.
Please find below the codes which I used for the pagoda2 example
library(swne)
library(pagoda2)
library(Matrix)
p2 <-readRDS("pbmc3k_pagoda2.Robj")
n.od.genes <- 1.5e3
var.info <- p2$misc$varinfo; var.info <- var.info[order(var.info$lp, decreasing = F),];
od.genes <- rownames(var.info[1:n.od.genes,])
length(od.genes)
clusters <- p2$clusters$PCA$multilevel
levels(clusters)
genes.embed <- c("MS4A1", "GNLY", "CD3E", "CD14", "FCER1A", "FCGR3A", "LYZ", "PPBP", "CD8A")
swne.embedding <- RunSWNE(p2, k = 14, var.genes = od.genes, genes.embed = genes.embed)
PlotSWNE(swne.embedding, alpha.plot = 0.4, sample.groups = clusters,
do.label = T, label.size = 3.5, pt.size = 1.5, show.legend = F,seed = 42)
Hmm it looks like the error is with the NMF. Do you mind running through the SWNE pipeline step by step with the Seurat tutorial? If you do, when you run RunNMF
do you get the same warning?
I ran through the Seurat tutorial step by step. When I ran through the step-by-step tutorial, I managed to get the plot. Swne_plot_stepBystep.pdf
However, I get warning messages when I run FindNumFactors, and my plot looks different from the tutorial find_num_of_factors_plot_swne.pdf
k.err <- FindNumFactors(norm.counts[var.genes,], k.range = k.range, n.cores = 8, do.plot = T)
Warning messages:
1: In system.time(out <- .Call("NNLM_nnmf", A, as.integer(k), init.mask$Wi, :
Target tolerance not reached. Try a larger max.iter.
2: In system.time(out <- .Call("NNLM_nnmf", A, as.integer(k), init.mask$Wi, :
Target tolerance not reached. Try a larger max.iter.
3: In system.time(out <- .Call("NNLM_nnmf", A, as.integer(k), init.mask$Wi, :
Target tolerance not reached. Try a larger max.iter.
4: In system.time(out <- .Call("NNLM_nnmf", A, as.integer(k), init.mask$Wi, :
Target tolerance not reached. Try a larger max.iter.
5: In system.time(out <- .Call("NNLM_nnmf", A, as.integer(k), init.mask$Wi, :
Target tolerance not reached. Try a larger max.iter.
6: In system.time(out <- .Call("NNLM_nnmf", A, as.integer(k), init.mask$Wi, :
Target tolerance not reached. Try a larger max.iter.
7: In system.time(out <- .Call("NNLM_nnmf", A, as.integer(k), init.mask$Wi, :
Target tolerance not reached. Try a larger max.iter.
No warning messages were seen when I ran runNMF
k <- 16
nmf.res <- RunNMF(norm.counts[var.genes,], k = k)
0% 10 20 30 40 50 60 70 80 90 100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
Looks like, I can only get the plot if I run through the process step-by-step.
Ah good to hear the step by step tutorial is working.
So I pushed an update to RunSWNE with an additional parameter ica.fast
. Can you try setting ica.fast = F
in RunSWNE and see if that helps? If not, do you mind just using SWNE step by step?
Best, Yan
Hi @yanwu2014 ,
I tried running RunSWNE with the parameter that you suggested but it still does not work.
swne.embedding <- RunSWNE(obj, k = 16, genes.embed = genes.embed, ica.fast=F)
[1] "2000 variable genes to use"
0% 10 20 30 40 50 60 70 80 90 100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
Initial stress : 0.19693
stress after 10 iters: 0.06764, magic = 0.461
stress after 20 iters: 0.06091, magic = 0.500
stress after 30 iters: 0.05964, magic = 0.500
stress after 40 iters: 0.05945, magic = 0.500
stress after 50 iters: 0.05944, magic = 0.500
Warning messages:
1: In NNLM::nnlm(t(H), t(newdata), alpha = alpha, loss = loss, n.threads = n.cores) :
x does not have a full column rank. Solution may not be unique.
2: In if (return.format == "embedding") { :
the condition has length > 1 and only the first element will be used
I looked at the plot that I got through the step by step, I noticed that although I get different clusters, but the genes that are associated with each cluster is different from the one in the tutorial.
Ah yes so the NMF is semi-random, and will result in the "factors" having different genes associated with them for different runs is fairly normal. The plot will look a little different run to run as well for the same reason.
Do you think the step by step SWNE will work for your purposes?
Hi @yanwu2014
Yes, the step by step SWNE will work for my case. I want to use Swne to cluster single cells.
Glad to know that the plot will look slightly different for each run.
Thank you very much for your help.
Great let me know if you have any other questions!
Hi,
I followed the example given in https://yanwu2014.github.io/swne/Examples/pbmc3k_swne_seurat.html
However, I could not get back the same clustering as shown in the example (please find attached the clusters test_swne.pdf )
The following warning messages were obtained while I ran the following command:
swne.embedding <- RunSWNE(obj, k = 16, genes.embed = genes.embed)
Warning messages: 1: In system.time(out <- .Call("NNLM_nnmf", A, as.integer(k), init.mask$Wi, : Target tolerance not reached. Try a larger max.iter. 2: In if (return.format == "embedding") { : the condition has length > 1 and only the first element will be used
Is the difference in clustering due to an issue with NNLM package? I noted that it was removed from the CRAN repository.