yanwu2014 / swne

Similarity Weighted Nonnegative Embedding (SWNE), a method for visualizing high dimensional datasets
BSD 3-Clause "New" or "Revised" License

Error with k=2 #1

Closed: scharch closed 6 years ago

scharch commented 6 years ago

FindNumFactors identifies k=2 as the best choice for my data, which makes biological sense. This may mean that SWNE is not the most natural choice for visualizing my data, but we are interested in identifying subclusters along that axis, so I thought it was worth trying. With k=2, I get the following error:

> swne.embedding <- EmbedSWNE(nmf.scores, snn.matrix, alpha.exp = 1.0, snn.exp = 1, n_pull = 4, dist.use = "IC")
Error in cmdscale(d, k) : 'k' must be in {1, 2, ..  n - 1}
In addition: Warning messages:
1: In MASS::bcv(x) : minimum occurred at one end of the range
2: In MASS::bcv(y) : minimum occurred at one end of the range

> sessionInfo()
R version 3.4.3 (2017-11-30)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.4 LTS

Matrix products: default
BLAS: /usr/lib/atlas-base/atlas/libblas.so.3.0
LAPACK: /usr/lib/atlas-base/atlas/liblapack.so.3.0

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8    LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] mgcv_1.8-23    nlme_3.1-131.1 swne_0.2.1     dplyr_0.7.4    Seurat_2.2.1   Matrix_1.2-11  cowplot_0.9.2 
[8] ggplot2_2.2.1 

thanks

yanwu2014 commented 6 years ago

SWNE needs at least 3 factors to create a 2D visualization. With only 2 factors, all the data will lie on a line between the two factors, and we don't have a visualization for that yet. The FindNumFactors function unfortunately still needs a bit of work, so if it's telling you 2 factors is best, I would recommend making a PCA elbow plot and setting k to the number of PCs you would use. It's a hack, but it seems to work okay in general. I've updated the RunNMF method to throw an error if k < 3.
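
For anyone hitting the same error, here is a minimal base R sketch of the two points above. It is not SWNE's actual code. It assumes the two factors are the two points handed to cmdscale (the function named in the error message), and it uses toy random data with prcomp as a stand-in for a real PCA elbow plot. The names two.factors, three.factors, toy.data, k.from.elbow and k.use are made up for illustration.

```r
# Classical MDS can place n points in at most n - 1 dimensions, so two
# points (standing in here for two NMF factors) cannot be embedded in 2D.
two.factors <- dist(matrix(c(1, 0, 0, 1), nrow = 2))
err.msg <- tryCatch({
  cmdscale(two.factors, k = 2)
  NA_character_  # only reached if cmdscale unexpectedly succeeds
}, error = function(e) conditionMessage(e))
print(err.msg)

# With three points the same call works and returns a 3 x 2 coordinate matrix.
three.factors <- dist(diag(3))
coords <- cmdscale(three.factors, k = 2)
print(dim(coords))

# Elbow plot workaround on toy data (replace toy.data with your own
# cells x genes matrix): plot variance explained per PC and pick k by eye.
set.seed(1)
toy.data <- matrix(rnorm(200 * 20), nrow = 200)
pca <- prcomp(toy.data, center = TRUE, scale. = TRUE)
var.explained <- pca$sdev^2 / sum(pca$sdev^2)
plot(var.explained, type = "b",
     xlab = "Principal component", ylab = "Proportion of variance explained")

# Hypothetical value read off the plot; never go below the 3 factors
# that a 2D embedding needs.
k.from.elbow <- 2
k.use <- max(3L, as.integer(k.from.elbow))
print(k.use)
```

The max(3L, ...) guard only mirrors the k < 3 check mentioned above. Whether 3 factors is a sensible choice for a given dataset still has to be judged from the elbow plot.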

Hope that helps! -Yan