lgatto / pRoloc

A unifying bioinformatics framework for organelle proteomics
http://lgatto.github.io/pRoloc/

highlightOnPlot doesn't work with t-SNE #148

Status: Closed (closed by daveshire 2 years ago)

daveshire commented 2 years ago

highlightOnPlot doesn't work when plot2D is used with t-SNE. The features of interest appear in their PCA locations instead of wherever they ended up in the t-SNE plot.

[Image: PCA plot with two FOI sets]

[Image: same FOI sets, same data, plotted using t-SNE]

lmsimp commented 2 years ago

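Hello @daveshire, you need to pre-calculate the t-SNE coordinate matrix and then pass that matrix to highlightOnPlot. When highlightOnPlot is given the MSnSet itself, it recomputes the coordinates with plot2D's default method (PCA), which is why the points land in their PCA positions.

An example is below: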

library(pRoloc)
library(pRolocdata)

## load some data
data("dunkley2006")

## pre-compute the t-SNE and PCA coordinates without plotting
## (set a seed first, as t-SNE is stochastic)
set.seed(1)
tsne_coords <- plot2D(dunkley2006, method = "t-SNE", plot = FALSE)
pca_coords <- plot2D(dunkley2006, method = "PCA", plot = FALSE)

## some proteins to plot
fn <- featureNames(dunkley2006)[1:3]
fn
## [1] "AT1G09210" "AT1G21750" "AT1G51760"

## plot the data and highlight the proteins
par(mfrow = c(1, 2))

## highlight on PCA
plot2D(pca_coords, method = "none", methargs = list(dunkley2006), main = "PCA")
highlightOnPlot(pca_coords, foi = fn, pch = 17, cex = 1.2)

## highlight the same proteins on t-SNE
plot2D(tsne_coords, method = "none", methargs = list(dunkley2006), main = "t-SNE")
highlightOnPlot(tsne_coords, foi = fn, pch = 17, cex = 1.2)
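
The reason a pre-computed matrix works can be illustrated with base R alone. The sketch below is not pRoloc's implementation; it assumes only that a matrix-based highlight selects rows of the coordinate matrix by row name and draws them over the existing plot. The matrix `coords` and the names `prot1` to `prot20` are made up for illustration and stand in for what `plot2D(..., plot = FALSE)` returns.

```r
## Stand-in for a pre-computed coordinate matrix:
## one row per feature, two columns, feature names as row names.
set.seed(1)
coords <- matrix(rnorm(40), ncol = 2,
                 dimnames = list(paste0("prot", 1:20), c("Dim1", "Dim2")))

## hypothetical features of interest
foi <- c("prot3", "prot7", "prot15")

## select the rows to highlight by matching row names
sel <- rownames(coords) %in% foi

## draw all features, then overlay the selected ones at the SAME coordinates
plot(coords, pch = 19, col = "grey", main = "pre-computed coordinates")
points(coords[sel, 1], coords[sel, 2], pch = 17, cex = 1.2, col = "red")
```

Because the overlay reads positions from the same matrix that produced the background plot, the highlighted points cannot drift to another projection, whichever dimension-reduction method generated the matrix.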


> sessionInfo()
R version 4.2.0 (2022-04-22)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Monterey 12.4

Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRlapack.dylib

locale:
[1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8

attached base packages:
[1] stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] pRolocdata_1.34.0    pRoloc_1.36.0        BiocParallel_1.30.3 
 [4] MLInterfaces_1.76.0  cluster_2.1.3        annotate_1.74.0     
 [7] XML_3.99-0.10        AnnotationDbi_1.58.0 IRanges_2.30.0      
[10] MSnbase_2.22.0       ProtGenerics_1.28.0  S4Vectors_0.34.0    
[13] mzR_2.30.0           Rcpp_1.0.8.3         Biobase_2.56.0      
[16] BiocGenerics_0.42.0 

loaded via a namespace (and not attached):
  [1] BiocFileCache_2.4.0    plyr_1.8.7             splines_4.2.0         
  [4] listenv_0.8.0          GenomeInfoDb_1.32.2    ggplot2_3.3.6         
  [7] digest_0.6.29          foreach_1.5.2          viridis_0.6.2         
 [10] fansi_1.0.3            magrittr_2.0.3         memoise_2.0.1         
 [13] doParallel_1.0.17      mixtools_1.2.0         limma_3.52.2          
 [16] recipes_0.2.0          globals_0.15.0         Biostrings_2.64.0     
 [19] gower_1.0.0            hardhat_1.1.0          lpSolve_5.6.15        
 [22] prettyunits_1.1.1      colorspace_2.0-3       blob_1.2.3            
 [25] rappdirs_0.3.3         xfun_0.31              dplyr_1.0.9           
 [28] crayon_1.5.1           RCurl_1.98-1.7         hexbin_1.28.2         
 [31] impute_1.70.0          survival_3.3-1         iterators_1.0.14      
 [34] glue_1.6.2             gtable_0.3.0           ipred_0.9-13          
 [37] zlibbioc_1.42.0        XVector_0.36.0         kernlab_0.9-31        
 [40] future.apply_1.9.0     scales_1.2.0           vsn_3.64.0            
 [43] mvtnorm_1.1-3          DBI_1.1.3              viridisLite_0.4.0     
 [46] xtable_1.8-4           progress_1.2.2         clue_0.3-61           
 [49] bit_4.0.4              proxy_0.4-27           mclust_5.4.10         
 [52] preprocessCore_1.58.0  MsCoreUtils_1.8.0      lava_1.6.10           
 [55] prodlim_2019.11.13     sampling_2.9           httr_1.4.3            
 [58] FNN_1.1.3.1            RColorBrewer_1.1-3     ellipsis_0.3.2        
 [61] pkgconfig_2.0.3        nnet_7.3-17            dbplyr_2.2.0          
 [64] utf8_1.2.2             caret_6.0-92           tidyselect_1.1.2      
 [67] rlang_1.0.2            reshape2_1.4.4         munsell_0.5.0         
 [70] tools_4.2.0            LaplacesDemon_16.1.6   cachem_1.0.6          
 [73] cli_3.3.0              generics_0.1.2         RSQLite_2.2.14        
 [76] stringr_1.4.0          fastmap_1.1.0          mzID_1.34.0           
 [79] ModelMetrics_1.2.2.2   knitr_1.39             bit64_4.0.5           
 [82] purrr_0.3.4            randomForest_4.7-1.1   KEGGREST_1.36.2       
 [85] dendextend_1.15.2      ncdf4_1.19             future_1.26.1         
 [88] nlme_3.1-158           xml2_1.3.3             biomaRt_2.52.0        
 [91] compiler_4.2.0         rstudioapi_0.13        filelock_1.0.2        
 [94] curl_4.3.2             png_0.1-7              e1071_1.7-11          
 [97] affyio_1.66.0          tibble_3.1.7           stringi_1.7.6         
[100] lattice_0.20-45        Matrix_1.4-1           vctrs_0.4.1           
[103] pillar_1.7.0           lifecycle_1.0.1        BiocManager_1.30.18   
[106] MALDIquant_1.21        data.table_1.14.2      bitops_1.0-7          
[109] R6_2.5.1               pcaMethods_1.88.0      affy_1.74.0           
[112] gridExtra_2.3          parallelly_1.32.0      codetools_0.2-18      
[115] MASS_7.3-57            gtools_3.9.2.2         assertthat_0.2.1      
[118] withr_2.5.0            GenomeInfoDbData_1.2.8 parallel_4.2.0        
[121] hms_1.1.1              grid_4.2.0             rpart_4.1.16          
[124] timeDate_3043.102      coda_0.19-4            class_7.3-20          
[127] segmented_1.6-0        Rtsne_0.16             pROC_1.18.0           
[130] lubridate_1.8.0