cole-trapnell-lab / monocle3

Choose_cells browser graph not appearing #325

Closed andyrussell closed 4 years ago

andyrussell commented 4 years ago

Describe the bug

I am running Monocle3_0.2.1. When I run

choose_cells(monocle.object)

the pop-up window loads but the graph never appears, so I cannot select cells for branch analysis. Is there a way of selecting cells by branch (i.e. all cells after a root node)? For example, can I select all cells after the 'black 2' node (denoted in blue) without having to invoke the choose_cells command?

Screenshots

image

image

sessionInfo():

R version 3.6.3 (2020-02-29)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04.3 LTS

Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.7.1
LAPACK: /usr/lib/x86_64-linux-gnu/openblas/liblapack.so.3

locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8
[4] LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C LC_ADDRESS=C
[10] LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] grid stats4 parallel stats graphics grDevices utils datasets methods
[10] base

other attached packages:
[1] shiny_1.4.0 ggalluvial_0.11.1 plotly_4.9.2
[4] reshape2_1.4.3 viridis_0.5.1 viridisLite_0.3.0
[7] dplyr_0.8.4 Hmisc_4.3-1 Formula_1.2-3
[10] survival_3.1-8 lattice_0.20-40 gridExtra_2.3
[13] cowplot_1.0.0 Seurat_3.0.0 ggplot2_3.2.1
[16] patchwork_1.0.0 destiny_3.0.1 monocle3_0.2.1
[19] SingleCellExperiment_1.8.0 SummarizedExperiment_1.16.1 DelayedArray_0.12.2
[22] BiocParallel_1.20.1 matrixStats_0.55.0 GenomicRanges_1.38.0
[25] GenomeInfoDb_1.22.0 IRanges_2.20.2 S4Vectors_0.24.3
[28] Biobase_2.46.0 BiocGenerics_0.32.0 flowCore_1.52.1

loaded via a namespace (and not attached):
[1] ggthemes_4.2.0 R.methodsS3_1.8.0 coda_0.19-3
[4] tidyr_1.0.2 acepack_1.4.1 knitr_1.28
[7] irlba_2.3.3 multcomp_1.4-12 R.utils_2.9.2
[10] data.table_1.12.8 rpart_4.1-15 RCurl_1.98-1.1
[13] metap_1.3 leidenbase_0.1.0 TH.data_1.0-10
[16] RANN_2.6.1 proxy_0.4-23 future_1.16.0
[19] mutoss_0.1-12 httpuv_1.5.2 assertthat_0.2.1
[22] xfun_0.12 hms_0.5.3 promises_1.1.0
[25] evaluate_0.14 DEoptimR_1.0-8 fansi_0.4.1
[28] caTools_1.18.0 readxl_1.3.1 igraph_1.2.4.2
[31] DBI_1.1.0 htmlwidgets_1.5.1 spdep_1.1-3
[34] purrr_0.3.3 RSpectra_0.16-0 backports_1.1.5
[37] gbRd_0.4-11 RcppParallel_4.4.4 deldir_0.1-25
[40] vctrs_0.2.3 TTR_0.23-6 ROCR_1.0-7
[43] abind_1.4-5 RcppEigen_0.3.3.7.0 withr_2.1.2
[46] grr_0.9.5 robustbase_0.93-6 checkmate_2.0.0
[49] vcd_1.4-6 sctransform_0.2.1 xts_0.12-0
[52] mnormt_1.5-6 cluster_2.1.0 ape_5.3
[55] lazyeval_0.2.2 laeken_0.5.1 crayon_1.3.4
[58] pkgconfig_2.0.3 slam_0.1-47 labeling_0.3
[61] units_0.6-6 nlme_3.1-144 nnet_7.3-13
[64] rlang_0.4.5 globals_0.12.5 lifecycle_0.1.0
[67] sandwich_2.5-1 rsvd_1.0.3 cellranger_1.1.0
[70] RcppHNSW_0.2.0 lmtest_0.9-37 Matrix_1.2-18
[73] raster_3.0-12 carData_3.0-3 boot_1.3-24
[76] zoo_1.8-7 Matrix.utils_0.9.8 base64enc_0.1-3
[79] ggridges_0.5.2 pheatmap_1.0.12 png_0.1-7
[82] bitops_1.0-6 R.oo_1.23.0 KernSmooth_2.23-16
[85] DelayedMatrixStats_1.8.0 classInt_0.4-2 stringr_1.4.0
[88] jpeg_0.1-8.1 scales_1.1.0 magrittr_1.5
[91] plyr_1.8.5 hexbin_1.28.1 ica_1.0-2
[94] gplots_3.0.3 bibtex_0.4.2.2 gdata_2.18.0
[97] zlibbioc_1.32.0 compiler_3.6.3 lsei_1.2-0
[100] RColorBrewer_1.1-2 plotrix_3.7-7 pcaMethods_1.78.0
[103] fitdistrplus_1.0-14 cli_2.0.2 XVector_0.26.0
[106] LearnBayes_2.15.1 listenv_0.8.0 pbapply_1.4-2
[109] htmlTable_1.13.3 ggplot.multistats_1.0.0 MASS_7.3-51.5
[112] tidyselect_1.0.0 stringi_1.4.6 forcats_0.5.0
[115] yaml_2.2.1 latticeExtra_0.6-29 ggrepel_0.8.1
[118] pbmcapply_1.5.0 tools_3.6.3 future.apply_1.4.0
[121] rio_0.5.16 rstudioapi_0.11 foreign_0.8-75
[124] smoother_1.1 scatterplot3d_0.3-41 farver_2.0.3
[127] Rtsne_0.15 digest_0.6.25 Rcpp_1.0.3
[130] car_3.0-7 SDMTools_1.1-221.2 later_1.0.0
[133] RcppAnnoy_0.0.15 httr_1.4.1 sf_0.9-1
[136] npsurv_0.4-0 Rdpack_0.11-1 colorspace_1.4-1
[139] ranger_0.12.1 reticulate_1.14 splines_3.6.3
[142] uwot_0.1.5 sn_1.5-5 expm_0.999-4
[145] sp_1.4-1 multtest_2.42.0 spData_0.3.3
[148] xtable_1.8-4 jsonlite_1.6.1 R6_2.4.1
[151] gmodels_2.18.1 TFisher_0.2.0 mime_0.9
[154] pillar_1.4.3 htmltools_0.4.0 fastmap_1.0.1
[157] glue_1.3.1 VIM_5.1.1 class_7.3-15
[160] codetools_0.2-16 tsne_0.1-3 mvtnorm_1.1-0
[163] tibble_2.1.3 numDeriv_2016.8-1.1 curl_4.3
[166] gtools_3.8.1 zip_2.0.4 openxlsx_4.1.4
[169] rmarkdown_2.1 munsell_0.5.0 e1071_1.7-3
[172] GenomeInfoDbData_1.2.2 haven_2.2.0 gtable_0.3.0

Additional context

I installed Monocle using devtools::install_github('cole-trapnell-lab/monocle3'), which gave me this version of Monocle. If this has been solved in Monocle 3.1, would there be a way of pointing me to that version (sorry in advance for the stupid request)?

hpliner commented 4 years ago

Hello, what platform are you running this on (e.g. RStudio, R in a terminal on a local computer, R on a cluster, IPython)?

As to selecting by branch, there isn't currently a function implemented to do this (though it's on the list). As a workaround, you can access the closest principal graph node (vertex) for each cell and assign it as a column in your colData table using:

colData(cds)$closest_vertex <- cds@principal_graph_aux[["UMAP"]]$pr_graph_cell_proj_closest_vertex[,1]
plot_cells(cds, color_cells_by = "closest_vertex", label_cell_groups = FALSE)

Based on which nodes you want, you can then subset in the usual way:

cds_sub <- cds[,colData(cds)$closest_vertex %in% c(1, 2, 3)]
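
To get the node numbers for "everything after a given node" without picking them by eye, something along these lines might work. This is only a rough sketch on a toy graph, not a monocle3 function: nodes_downstream_of is a hypothetical helper, the toy tree stands in for principal_graph(cds)[["UMAP"]], and the "Y_<index>" vertex naming is an assumption about the real graph. It relies on the graph being a tree; if the principal graph contains loops the result is only approximate.

```r
library(igraph)

# Toy stand-in for principal_graph(cds)[["UMAP"]]: a small undirected tree.
# ASSUMPTION: vertex names follow the "Y_<index>" pattern.
edges <- data.frame(from = c("Y_1", "Y_2", "Y_2", "Y_4"),
                    to   = c("Y_2", "Y_3", "Y_4", "Y_5"))
g <- graph_from_data_frame(edges, directed = FALSE)

# Hypothetical helper: nodes whose path from `root` passes through `node`
# (including `node` itself).
nodes_downstream_of <- function(graph, root, node) {
  # Hop counts between all pairs of nodes; weights = NA ignores edge weights.
  d <- distances(graph, weights = NA)
  # In a tree, `node` lies on the root-to-v path exactly when the hop counts add up.
  on_branch <- d[root, ] == d[root, node] + d[node, ]
  names(which(on_branch))
}

downstream <- nodes_downstream_of(g, root = "Y_1", node = "Y_4")

# Toy stand-in for the closest_vertex column: one node index per cell.
closest_vertex <- c(cell_a = 1, cell_b = 3, cell_c = 4, cell_d = 5, cell_e = 5)
keep <- names(closest_vertex)[paste0("Y_", closest_vertex) %in% downstream]

# With a real cds the last step would be something like: cds_sub <- cds[, keep]
```

Here the root would be whichever node was passed to order_cells, and node is the branch point you want to start from.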

andyrussell commented 4 years ago

Hi,

Thank you so much for your quick response, I really appreciate it!

I've just tried this out and it does work to some extent. I changed the UMAP projection around but it is essentially the same set of branches I am hoping to subset. I currently have this result when I run the command suggested above:

image

My workaround idea at the moment is to use the clusters generated either by Monocle or previously from Seurat to subset the cells that are near the branch. I know this is not perfect as it relies on me eyeballing which cells are there and I could end up with a few cells lost or gained. I'm guessing the code above is meant to come up with numbers that are closer to the number of branches? I.e. assign each cell to a branch?

For full transparency, this is the code I used upstream:

# load package
library(monocle3)

# extract data
wt_cells <- rownames(test_seurat_object@meta.data[which(test_seurat_object@meta.data$identity_combined == "WT" | test_seurat_object@meta.data$identity_combined == "WT_10X"), ])
seurat.object <- SubsetData(test_seurat_object, cells = wt_cells)

# extract data from Seurat
seurat.object <- test_seurat_object

# counts
data <- as(as.matrix(GetAssayData(seurat.object, assay = "integrated", slot = "data")), 'sparseMatrix')

# meta data
pd <- data.frame(seurat.object@meta.data)

# keep only the columns that are relevant
pData <- pd %>% select(orig.ident, nCount_RNA, nFeature_RNA)

# add gene short name
fData <- data.frame(gene_short_name = row.names(data), row.names = row.names(data))

# Construct monocle cds
monocle.object <- new_cell_data_set(expression_data = data, cell_metadata = pd, gene_metadata = fData)

# preprocess
monocle.object = preprocess_cds(monocle.object, num_dim = 100, norm_method = "none")

# plot variance explained plot
plot_pc_variance_explained(monocle.object)

# make monocle UMAP
monocle.object = reduce_dimension(monocle.object, reduction_method = "UMAP", preprocess_method = "PCA", umap.metric = "euclidean", umap.n_neighbors = 20, umap.min_dist = 0.5, verbose = FALSE)

plot_cells(monocle.object)

# add UMAP from Seurat
monocle.object@int_colData@listData$reducedDims@listData[["UMAP"]] <- seurat.object@reductions[["umapoptimised"]]@cell.embeddings
plot_cells(monocle.object)

# cluster
monocle.object = cluster_cells(monocle.object)

# plot clusters
plot_cells(monocle.object, color_cells_by="partition", group_cells_by="partition")

# reduce partitions to 1
monocle.object@clusters$UMAP$partitions[monocle.object@clusters$UMAP$partitions == "2"] <- "1"

# map pseudotime
monocle.object = learn_graph(monocle.object, learn_graph_control=list(ncenter=500), use_partition = FALSE)
plot_cells(monocle.object, color_cells_by="partition", group_cells_by="partition")

# a helper function to identify the root principal points:
# make cluster 2 the root
get_earliest_principal_node <- function(cds, time_bin="2"){
  cell_ids <- which(colData(cds)[, "seurat_clusters_plotting"] == time_bin)
  closest_vertex <- cds@principal_graph_aux[["UMAP"]]$pr_graph_cell_proj_closest_vertex
  closest_vertex <- as.matrix(closest_vertex[colnames(cds), ])
  root_pr_nodes <- igraph::V(principal_graph(cds)[["UMAP"]])$name[as.numeric(names(which.max(table(closest_vertex[cell_ids,]))))]

  root_pr_nodes
}

# calculate pseudotime
monocle.object = order_cells(monocle.object, root_pr_nodes=get_earliest_principal_node(monocle.object))

# plot
plot_cells(monocle.object, color_cells_by = "pseudotime", label_cell_groups=FALSE, cell_size = 1) +
  coord_fixed() +
  theme_void() +
  labs(title = "Pseudotime") +
  theme(plot.title = element_text(hjust = 0.5))

This gave me the following plot:

image

Also, to answer your first question, I am running RStudio on a server. I really don't know the exact specification, but there is more info here: https://www.sanger.ac.uk/science/groups/cellular-genetics-informatics - I hope this is helpful?

Thank you,

Andy

hpliner commented 4 years ago

Hi Andy, on the first question, that's going to be the problem: we haven't figured out how to get the interactive pieces of monocle working on servers of any kind (IPython, clusters).

The idea of the code I gave above is to get assignments to the nearest principal node (basically the points that the black line is tracing). With the number of principal nodes you have it's going to be a pain though... (if there were only a few you could subset the cells based on assignment to those few).

My (very unsatisfactory) suggestion for the moment is to download your cds someplace local and then use choose_cells to pick the right ones, and then upload your subsetted cds back to the server... This enhancement is on the list, but in the short term I think this is your best bet
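
Roughly, the round trip could look like the sketch below. It uses only base R saveRDS/readRDS, and a plain matrix stands in for the cds so that it runs without monocle3; the file names and the hard-coded cell selection are placeholders, and the choose_cells call appears only as a comment because it needs an interactive local session.

```r
# Stand-in for a cell_data_set: a genes x cells matrix
# (ASSUMPTION, so the sketch runs without monocle3).
cds <- matrix(1:12, nrow = 3,
              dimnames = list(paste0("gene", 1:3), paste0("cell", 1:4)))

# 1. On the server: write the object to disk, then copy the file to a local machine.
rds_path <- file.path(tempdir(), "cds.rds")
saveRDS(cds, rds_path)

# 2. Locally: read it back and pick cells interactively.
cds_local <- readRDS(rds_path)
# With monocle3 loaded this step would be: cds_sub <- choose_cells(cds_local)
cds_sub <- cds_local[, c("cell2", "cell3")]  # placeholder for the interactive selection

# 3. Save just the chosen cell names; a small text file is easier to move
#    than the whole object.
ids_path <- file.path(tempdir(), "chosen_cells.txt")
writeLines(colnames(cds_sub), ids_path)

# 4. Back on the server: subset the original object by cell name.
chosen <- readLines(ids_path)
cds_server_sub <- cds[, chosen]
```

Sending back only the cell names instead of the subsetted cds is a variation on the suggestion above; uploading the subsetted object itself works just as well.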

hpliner commented 4 years ago

Whoops, apparently I wrote a function to interactively choose paths through graphs and then forgot about it! Check out choose_graph_segments if you decide to go the downloading/uploading way.

erkinacar5 commented 2 years ago

Hello @hpliner, sorry for resurrecting this issue.

I was just wondering if there is any update on getting the interactive pieces of monocle working on servers? I am the admin of the RStudio Server in our group, and many people in the group would appreciate it if there were a way to do this (ideally something that I can implement or fix so that they get the interactive part with no issues).

Thanks in advance for your reply :)