Open tdignum opened 4 years ago
Hi, I can only guess why you see these differences. The find_gene_modules() used originally the Louvain community detection algorithm but this was replaced by the Leiden algorithm about a year ago. This is likely to affect the results. You say that you tried adjusting the resolution parameter. Did you try very small values, for example, 1e-7?
Hi, I have tried very small values, as low as 1e-11, so I think the issues may be caused by the switch to the Leiden algorithm as you suggest.
Thank you for your help! -Tessa
From: brgew notifications@github.com Reply-To: cole-trapnell-lab/monocle3 reply@reply.github.com Date: Monday, October 5, 2020 at 11:42 AM To: cole-trapnell-lab/monocle3 monocle3@noreply.github.com Cc: "Dignum, Tessa M" tdignum@fredhutch.org, Author author@noreply.github.com Subject: Re: [cole-trapnell-lab/monocle3] Monocle3 find_gene_modules issue (#425)
Hi, I can only guess why you see these differences. The find_gene_modules() used originally the Louvain community detection algorithm but this was replaced by the Leiden algorithm about a year ago. This is likely to affect the results. You say that you tried adjusting the resolution parameter. Did you try very small values, for example, 1e-7?
— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHubhttps://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_cole-2Dtrapnell-2Dlab_monocle3_issues_425-23issuecomment-2D703816505&d=DwMCaQ&c=eRAMFD45gAfqt84VtBcfhQ&r=78DK-CiM6MPKV_9ta0Qb2X3tcOMrjG57CmAKXOgjoDU&m=EvxRncb3AODkC7cNW0lqkn8Cd2S3Bu7it4sh-GPccmI&s=RP4b1o2O51kxioFNswp0Q8RNmzpcVZdO4MPXCI-JGPw&e=, or unsubscribehttps://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_notifications_unsubscribe-2Dauth_ARCWFGTWVTQQYY5N3QAOTRLSJIHPXANCNFSM4RUXAJXQ&d=DwMCaQ&c=eRAMFD45gAfqt84VtBcfhQ&r=78DK-CiM6MPKV_9ta0Qb2X3tcOMrjG57CmAKXOgjoDU&m=EvxRncb3AODkC7cNW0lqkn8Cd2S3Bu7it4sh-GPccmI&s=A_JpYiz-L25WDeIHKG3UkDGGCEaJ60AJJJ4LMNqpAoY&e=.
Hi Tessa, We made a change that may affect the find_gene_modules() function although the change does not revert the code (and so it does not explain why your plots differ from those made using an earlier version). You can install the modified version from the 'develop' branch on the Monocle3 GitHub site using the command devtools::install_github('cole-trapnell-lab/monocle3', ref="develop"). I am curious to know whether this affects the result plot.
Hi Tessa, We made a change that may affect the find_gene_modules() function although the change does not revert the code (and so it does not explain why your plots differ from those made using an earlier version). You can install the modified version from the 'develop' branch on the Monocle3 GitHub site using the command devtools::install_github('cole-trapnell-lab/monocle3', ref="develop"). I am curious to know whether this affects the result plot.
Hi, Thanks for providing such useful package. I also experienced the same problem with updated monocle, and it wasn't solved after installing 'develop' branch. Is there now a way of solving this problem? I would really appreciate your help
Best, Huan
Hi,
Could you try running find_gene_modules() with the random_seed parameter set to a positive, non-zero, value? Does that give in consistent gene modules?
Hi,
Could you try running find_gene_modules() with the random_seed parameter set to a positive, non-zero, value? Does that give in consistent gene modules?
Hi,
Thanks for your reply. The number of modules I get is consistent, but the color is very uniform when I plot them on umap(see attached). This problem remains even when I set random_seed to a positive non-zero value.
Best, Huan
Hello, Running the set.seed function with a positive integer value did give me consistent gene modules for any given resolution. In terms of the uniform color issue while plotting gene modules in UMAP, adjusting minimum expression (min_expr=) to set a lower limit of expression solved this issue for me. I hope that helps!
Best, Tessa
From: huanz4 @.> Reply-To: cole-trapnell-lab/monocle3 @.> Date: Tuesday, March 30, 2021 at 2:40 AM To: cole-trapnell-lab/monocle3 @.> Cc: "Dignum, Tessa M" @.>, Author @.***> Subject: Re: [cole-trapnell-lab/monocle3] Monocle3 find_gene_modules issue (#425)
Hi,
Could you try running find_gene_modules() with the random_seed parameter set to a positive, non-zero, value? Does that give in consistent gene modules?
Hi,
Thanks for your reply. The number of modules I get is consistent, but the color is very uniform when I plot them on umap(see attached). This problem remains even when I set random_seed to a positive non-zero value.
Best, Huan [Image removed by sender. bbb]https://urldefense.proofpoint.com/v2/url?u=https-3A__user-2Dimages.githubusercontent.com_75174439_112968481-2Da2373080-2D914c-2D11eb-2D8c06-2D95858dc61d5e.png&d=DwMCaQ&c=eRAMFD45gAfqt84VtBcfhQ&r=78DK-CiM6MPKV_9ta0Qb2X3tcOMrjG57CmAKXOgjoDU&m=_0etAfPObxxBQNYxaAB59mrep5V0QQ1G7YvoOH2XZCc&s=Jwg-hKaryOdEi42NLmEkJ9vYbdLpsQSQOE_R9SZW-H8&e=
— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHubhttps://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_cole-2Dtrapnell-2Dlab_monocle3_issues_425-23issuecomment-2D810076101&d=DwMCaQ&c=eRAMFD45gAfqt84VtBcfhQ&r=78DK-CiM6MPKV_9ta0Qb2X3tcOMrjG57CmAKXOgjoDU&m=_0etAfPObxxBQNYxaAB59mrep5V0QQ1G7YvoOH2XZCc&s=2sTjvLmClO12vERAYwb0tu_tesgTeNdKjOvJYx7P6P4&e=, or unsubscribehttps://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_notifications_unsubscribe-2Dauth_ARCWFGQRYPCGUUGWPYGTTQ3TGGMB3ANCNFSM4RUXAJXQ&d=DwMCaQ&c=eRAMFD45gAfqt84VtBcfhQ&r=78DK-CiM6MPKV_9ta0Qb2X3tcOMrjG57CmAKXOgjoDU&m=_0etAfPObxxBQNYxaAB59mrep5V0QQ1G7YvoOH2XZCc&s=_qjorA2Qe9P5hFDY6Sw9manNqxU8uI6COFC3OrYR1d0&e=.
Hello, Running the set.seed function with a positive integer value did give me consistent gene modules for any given resolution. In terms of the uniform color issue while plotting gene modules in UMAP, adjusting minimum expression (min_expr=) to set a lower limit of expression solved this issue for me. I hope that helps! Best, Tessa From: huanz4 @.> Reply-To: cole-trapnell-lab/monocle3 @.> Date: Tuesday, March 30, 2021 at 2:40 AM To: cole-trapnell-lab/monocle3 @.> Cc: "Dignum, Tessa M" @.>, Author @.***> Subject: Re: [cole-trapnell-lab/monocle3] Monocle3 find_gene_modules issue (#425) Hi, Could you try running find_gene_modules() with the random_seed parameter set to a positive, non-zero, value? Does that give in consistent gene modules? Hi, Thanks for your reply. The number of modules I get is consistent, but the color is very uniform when I plot them on umap(see attached). This problem remains even when I set random_seed to a positive non-zero value. Best, Huan [Image removed by sender. bbb]https://urldefense.proofpoint.com/v2/url?u=https-3A__user-2Dimages.githubusercontent.com_75174439_112968481-2Da2373080-2D914c-2D11eb-2D8c06-2D95858dc61d5e.png&d=DwMCaQ&c=eRAMFD45gAfqt84VtBcfhQ&r=78DK-CiM6MPKV_9ta0Qb2X3tcOMrjG57CmAKXOgjoDU&m=_0etAfPObxxBQNYxaAB59mrep5V0QQ1G7YvoOH2XZCc&s=Jwg-hKaryOdEi42NLmEkJ9vYbdLpsQSQOE_R9SZW-H8&e= — You are receiving this because you authored the thread. Reply to this email directly, view it on GitHubhttps://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_cole-2Dtrapnell-2Dlab_monocle3_issues_425-23issuecomment-2D810076101&d=DwMCaQ&c=eRAMFD45gAfqt84VtBcfhQ&r=78DK-CiM6MPKV_9ta0Qb2X3tcOMrjG57CmAKXOgjoDU&m=_0etAfPObxxBQNYxaAB59mrep5V0QQ1G7YvoOH2XZCc&s=2sTjvLmClO12vERAYwb0tu_tesgTeNdKjOvJYx7P6P4&e=, or unsubscribehttps://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_notifications_unsubscribe-2Dauth_ARCWFGQRYPCGUUGWPYGTTQ3TGGMB3ANCNFSM4RUXAJXQ&d=DwMCaQ&c=eRAMFD45gAfqt84VtBcfhQ&r=78DK-CiM6MPKV_9ta0Qb2X3tcOMrjG57CmAKXOgjoDU&m=_0etAfPObxxBQNYxaAB59mrep5V0QQ1G7YvoOH2XZCc&s=_qjorA2Qe9P5hFDY6Sw9manNqxU8uI6COFC3OrYR1d0&e=.
Hi @tdignum @brgew
Thanks for your response. I tried to set a minimum expression value of 10, and did get some more defined gene modules, but then some modules have no expression at all and some still have this high uniform expression. I would like to find gene modules more highly expressed in the bottom left population, but I am not sure how I should approach this.
Sorry, I am new to coding and single cell RNA seq analysis. Please see below my code and plot. I don't seem to have the problem with consistent gene modules after setting random seed to a non zero positive value.
gene_module_df = find_gene_modules(Day5_8_healthy_subset[pr_deg_ids,],weight=TRUE,cores = 12, partition_qval = 0.01,resolution = c(10^seq(-6,-2)),random_seed = 10)
plot_cells(Day5_8_healthy_subset, genes=gene_module_df, show_trajectory_graph=FALSE, label_cell_groups=FALSE,min_expr = 10)
Thanks a lot for your help
Best, Huan
Hello, In addition to setting the seed value and adjusting the minimum expression threshold, you might try using a different ggplot color scale (see bold code below). I find that this helps with the coloring of my gene modules in UMAP because this plotting scale sets the lowest expression score color (gray80) according to the minimum expression value in your data. In contrast, the default plot_cells option seems to set the lowest expression score color (purple) starting at a 0, meaning that the color gradient (purple-to-green) encompasses a much larger range of gene expression values. If there is very little variability in the expression levels of different genes in your data, everything tends to plot as a very similar color. At least this is what I think is happening... The scale_color_gradient should help solve this issue by decreasing the size of that range (instead of 0 expression-maximum expression, it plots from minimum expression-maximum expression). Hopefully that makes sense, and helps solve your issue! That is the only other thing that I can think of that might help.
Good luck! Tessa
plot_cells(cds2, genes=gene_module_df %>% filter(module %in% c(9)), label_cell_groups=FALSE, show_trajectory_graph=FALSE, cell_size = 1, scale_to_range = TRUE, min_expr = 35) +scale_color_gradientn(colors=c("gray80", "royalblue3", "royalblue3")) +simple_theme
From: huanz4 @.> Date: Friday, April 2, 2021 at 3:03 AM To: cole-trapnell-lab/monocle3 @.> Cc: Dignum, Tessa M @.>, Mention @.> Subject: Re: [cole-trapnell-lab/monocle3] Monocle3 find_gene_modules issue (#425)
Hello, Running the set.seed function with a positive integer value did give me consistent gene modules for any given resolution. In terms of the uniform color issue while plotting gene modules in UMAP, adjusting minimum expression (min_expr=) to set a lower limit of expression solved this issue for me. I hope that helps! Best, Tessa From: huanz4 @.> Reply-To: cole-trapnell-lab/monocle3 @.> Date: Tuesday, March 30, 2021 at 2:40 AM To: cole-trapnell-lab/monocle3 @.> Cc: "Dignum, Tessa M" @.>, Author @.***> Subject: Re: [cole-trapnell-lab/monocle3] Monocle3 find_gene_modules issue (#425https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_cole-2Dtrapnell-2Dlab_monocle3_issues_425&d=DwMFaQ&c=eRAMFD45gAfqt84VtBcfhQ&r=78DK-CiM6MPKV_9ta0Qb2X3tcOMrjG57CmAKXOgjoDU&m=N1FkvgmMWpJnsLp7NJlygr0MI2IDrbxXe6oUb2MwzHw&s=eLTe6JfFZBMmABGLybsvhZmZA3M-1OatDf26wdG6rI8&e=) Hi, Could you try running find_gene_modules() with the random_seed parameter set to a positive, non-zero, value? Does that give in consistent gene modules? Hi, Thanks for your reply. The number of modules I get is consistent, but the color is very uniform when I plot them on umap(see attached). This problem remains even when I set random_seed to a positive non-zero value. Best, Huan [Image removed by sender. bbb]https://urldefense.proofpoint.com/v2/url?u=https-3A__user-2Dimages.githubusercontent.com_75174439_112968481-2Da2373080-2D914c-2D11eb-2D8c06-2D95858dc61d5e.png&d=DwMCaQ&c=eRAMFD45gAfqt84VtBcfhQ&r=78DK-CiM6MPKV_9ta0Qb2X3tcOMrjG57CmAKXOgjoDU&m=_0etAfPObxxBQNYxaAB59mrep5V0QQ1G7YvoOH2XZCc&s=Jwg-hKaryOdEi42NLmEkJ9vYbdLpsQSQOE_R9SZW-H8&e= — You are receiving this because you authored the thread. Reply to this email directly, view it on GitHubhttps://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_cole-2Dtrapnell-2Dlab_monocle3_issues_425-23issuecomment-2D810076101&d=DwMCaQ&c=eRAMFD45gAfqt84VtBcfhQ&r=78DK-CiM6MPKV_9ta0Qb2X3tcOMrjG57CmAKXOgjoDU&m=_0etAfPObxxBQNYxaAB59mrep5V0QQ1G7YvoOH2XZCc&s=2sTjvLmClO12vERAYwb0tu_tesgTeNdKjOvJYx7P6P4&e=, or unsubscribehttps://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_notifications_unsubscribe-2Dauth_ARCWFGQRYPCGUUGWPYGTTQ3TGGMB3ANCNFSM4RUXAJXQ&d=DwMCaQ&c=eRAMFD45gAfqt84VtBcfhQ&r=78DK-CiM6MPKV_9ta0Qb2X3tcOMrjG57CmAKXOgjoDU&m=_0etAfPObxxBQNYxaAB59mrep5V0QQ1G7YvoOH2XZCc&s=_qjorA2Qe9P5hFDY6Sw9manNqxU8uI6COFC3OrYR1d0&e=.
Hi @tdignumhttps://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_tdignum&d=DwMFaQ&c=eRAMFD45gAfqt84VtBcfhQ&r=78DK-CiM6MPKV_9ta0Qb2X3tcOMrjG57CmAKXOgjoDU&m=N1FkvgmMWpJnsLp7NJlygr0MI2IDrbxXe6oUb2MwzHw&s=HVgg8LOUOnDwcCq-YjP5s_8IiXqlzjGIA5Cd3IvZi_4&e= @brgewhttps://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_brgew&d=DwMFaQ&c=eRAMFD45gAfqt84VtBcfhQ&r=78DK-CiM6MPKV_9ta0Qb2X3tcOMrjG57CmAKXOgjoDU&m=N1FkvgmMWpJnsLp7NJlygr0MI2IDrbxXe6oUb2MwzHw&s=jZc3Wzb7-U584kvVXVNCTj-xR0cE7tR9jW21s5tjvh4&e=
Thanks for your response. I tried to set a minimum expression value of 10, and did get some more defined gene modules, but then some modules have no expression at all and some still have this high uniform expression. I would like to find gene modules more highly expressed in the bottom left population, but I am not sure how I should approach this.
Sorry, I am new to coding and single cell RNA seq analysis. Please see below my code and plot. I don't seem to have the problem with consistent gene modules after setting random seed to a non zero positive value.
gene_module_df = find_gene_modules(Day5_8_healthy_subset[pr_deg_ids,],weight=TRUE,cores = 12, partition_qval = 0.01,resolution = c(10^seq(-6,-2)),random_seed = 10) plot_cells(Day5_8_healthy_subset, genes=gene_module_df, show_trajectory_graph=FALSE, label_cell_groups=FALSE,min_expr = 10)
Thanks a lot for your help
Best, Huan
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHubhttps://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_cole-2Dtrapnell-2Dlab_monocle3_issues_425-23issuecomment-2D812464157&d=DwMFaQ&c=eRAMFD45gAfqt84VtBcfhQ&r=78DK-CiM6MPKV_9ta0Qb2X3tcOMrjG57CmAKXOgjoDU&m=N1FkvgmMWpJnsLp7NJlygr0MI2IDrbxXe6oUb2MwzHw&s=fehoHrmCHGq5OiCEnVSLBsvrFHlfERzMN-NOoxN3tV4&e=, or unsubscribehttps://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_notifications_unsubscribe-2Dauth_ARCWFGXB7O4UAHUNDGZ6P6DTGWJAPANCNFSM4RUXAJXQ&d=DwMFaQ&c=eRAMFD45gAfqt84VtBcfhQ&r=78DK-CiM6MPKV_9ta0Qb2X3tcOMrjG57CmAKXOgjoDU&m=N1FkvgmMWpJnsLp7NJlygr0MI2IDrbxXe6oUb2MwzHw&s=rblpCzpTVu8n6rMAnRXl8JAwUgO6KZhmt1cYVAO5hWI&e=.
scale_color_gradientn(colors=c("gray80", "royalblue3", "royalblue3"))
@tdignum Thanks for your suggestion. I think the problem might be the expression score not plotted as z-score? I tried to set minimum expression value and it worked for certain modules. I had to set different minimum expression value for different modules to get this nice plot (see below), so I couldn't plot them on the same graph. I guess we wouldn't have to do that if the expression score was on z-scale?
@tdignum @brgew @ctrapnell @kou Do you know how to plot this on a z-scale? I think before I upgraded to the new version of monocle, it was on z-scale.
Best, Huan
plot_cells(Day5_8_healthy_subset, genes=gene_module_df %>% filter(module %in% c(12,13,49,50,46)), group_cells_by="cluster", color_cells_by="cluster", show_trajectory_graph=FALSE, min_expr = 10)
plot_cells(Day5_8_healthy_subset, genes=gene_module_df %>% filter(module %in% c(32,44,30)), group_cells_by="cluster", color_cells_by="cluster", show_trajectory_graph=FALSE, min_expr = 15)
When I run gene module analysis on my data, I get gene modules with highly uniform expression score values throughout UMAP space and only non-zero expression (see photos). So does another colleague in my lab using entirely different data. I think that this could be an issue with how the expression scores are calculated or plotted in the newer versions of monocle3 or ggplot, because my PI was previously able to get gene modules that reflected more subtle gene expression variations using the same data and code but with an older version of monocle3 and ggplot. His gene module plots made with the older versions of monocle3 and ggplot also had a different scale for expression score. When he updated his versions of monocle3 and ggplot, he now gets the same uniform expression score values that I am getting, and his plots have a different scale for the gene module plot expression scores (see photos). I tried using the detect_genes function to set a minimum expression value to exclude very low-expressing genes from my data, but this did not fix the issue. Neither does adjusting the q value and resolution for the gene module function code. We would appreciate any input that you can provide on this issue. Thank you!
Package versions: Abnormal gene modules were obtained using monocle3 versions 0.2.1 and 0.2.1.5, and ggplot2 version 3.3.0. We do not know which earlier version of these packages my PI used to obtain the desirable gene modules.
Example gene module plots obtained by my PI using older monocle3 and ggplot versions:
Example gene module plots obtained with updated monocle3 and ggplot:
Session(info) sessionInfo() R version 3.6.1 (2019-07-05) Platform: x86_64-apple-darwin15.6.0 (64-bit) Running under: macOS High Sierra 10.13.6
Matrix products: default BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib LAPACK: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRlapack.dylib
locale: [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages: [1] stats4 parallel stats graphics grDevices utils datasets methods base
other attached packages: [1] shiny_1.4.0.2 ggplot2_3.3.0 magrittr_1.5 dplyr_0.8.5 tibble_3.0.0
[6] stringr_1.4.0 viridis_0.5.1 viridisLite_0.3.0 monocle3_0.2.1 SingleCellExperiment_1.6.0 [11] SummarizedExperiment_1.14.1 DelayedArray_0.10.0 BiocParallel_1.18.1 matrixStats_0.56.0 GenomicRanges_1.36.1
[16] GenomeInfoDb_1.20.0 IRanges_2.18.3 S4Vectors_0.22.1 Biobase_2.44.0 BiocGenerics_0.30.0
loaded via a namespace (and not attached): [1] ggbeeswarm_0.6.0 colorspace_1.4-1 grr_0.9.5 deldir_0.1-25 ellipsis_0.3.0
[6] class_7.3-16 XVector_0.24.0 BiocNeighbors_1.2.0 rstudioapi_0.11 proxy_0.4-23
[11] farver_2.0.3 ggrepel_0.8.2 RSpectra_0.16-0 fansi_0.4.1 codetools_0.2-16
[16] splines_3.6.1 knitr_1.28 scater_1.12.2 speedglm_0.3-2 jsonlite_1.6.1
[21] pheatmap_1.0.12 uwot_0.1.8 compiler_3.6.1 assertthat_0.2.1 Matrix_1.2-18
[26] fastmap_1.0.1 cli_2.0.2 later_1.0.0 BiocSingular_1.0.0 htmltools_0.4.0
[31] tools_3.6.1 rsvd_1.0.3 igraph_1.2.5 coda_0.19-3 gtable_0.3.0
[36] glue_1.4.0 GenomeInfoDbData_1.2.1 RANN_2.6.1 reshape2_1.4.4 gmodels_2.18.1
[41] Rcpp_1.0.4.6 slam_0.1-47 raster_3.0-12 vctrs_0.2.4 spdep_1.1-3
[46] gdata_2.18.0 nlme_3.1-145 DelayedMatrixStats_1.6.1 xfun_0.12 mime_0.9
[51] lifecycle_0.2.0 irlba_2.3.3 gtools_3.8.2 LearnBayes_2.15.1 zlibbioc_1.30.0
[56] MASS_7.3-51.5 scales_1.1.0 promises_1.1.0 expm_0.999-4 RColorBrewer_1.1-2
[61] leidenbase_0.1.0 gridExtra_2.3 Matrix.utils_0.9.8 stringi_1.4.6 e1071_1.7-3
[66] boot_1.3-24 spData_0.3.5 rlang_0.4.5 pkgconfig_2.0.3 bitops_1.0-6
[71] lattice_0.20-41 purrr_0.3.3 sf_0.9-1 labeling_0.3 tidyselect_1.0.0
[76] RcppAnnoy_0.0.16 plyr_1.8.6 R6_2.4.1 DBI_1.1.0 pillar_1.4.3
[81] withr_2.1.2 units_0.6-6 RCurl_1.98-1.1 sp_1.4-1 batchelor_1.0.1
[86] crayon_1.3.4 KernSmooth_2.23-16 grid_3.6.1 digest_0.6.25 classInt_0.4-3
[91] pbmcapply_1.5.0 xtable_1.8-4 tidyr_1.0.2 httpuv_1.5.2 munsell_0.5.0
[96] beeswarm_0.2.3 vipor_0.4.5
Here is my code: rm(list= ls()) library(monocle3) library(viridis) library(stringr) library(tibble) library(dplyr) library(magrittr) library(ggplot2)
set resident directory
RES_DIR <-file.path("~/scrnaseq/10X_outs")
load cds
E9cds<-load_cellranger_data("~/scrnaseq/10X_outs/E9/A404S2_SI-GA-B7") colData(E9cds)$identity = "E9" E9.5cds<-load_cellranger_data("~/scrnaseq/10X_outs/E9.5") colData(E9.5cds)$identity = "E9.5"
combine cds
E9toE9.5cds <- combine_cds(list(E9cds,E9.5cds))
pre-process cds
E9toE9.5cds <- preprocess_cds(E9toE9.5cds) plot_pc_variance_explained(E9toE9.5cds)
align cds to remove batch effects and reduce dimensions
E9toE9.5cds = align_cds(E9toE9.5cds, alignment_group = "sample") E9toE9.5cds = reduce_dimension(E9toE9.5cds) plot_cells(E9toE9.5cds, color_cells_by = "identity", label_cell_groups = FALSE, cell_size = .4)
cluster cells
E9toE9.5cds = cluster_cells(E9toE9.5cds) plot_cells(E9toE9.5cds, cell_size = .4)
subset cds
E9toE9.5_subset <- choose_cells(E9toE9.5cds) plot_cells(E9toE9.5_subset, color_cells_by = "identity", label_cell_groups = FALSE)
cluster cells
E9toE9.5_subset = cluster_cells(E9toE9.5_subset) plot_cells(E9toE9.5_subset, label_cell_groups = FALSE)
make pseudotime trajectory graph
E9toE9.5_subset <- learn_graph(E9toE9.5_subset) E9toE9.5_subset <- order_cells(E9toE9.5_subset) plot_cells(E9toE9.5_subset, color_cells_by = "pseudotime", label_cell_groups = FALSE, label_leaves = FALSE, label_branch_points = FALSE, graph_label_size = 1, cell_size = 0.4)
find gene modules
subset_pr_test_res <- graph_test(E9toE9.5_subset, neighbor_graph="principal_graph", cores=2) pr_deg_ids <- row.names(subset(subset_pr_test_res, q_value < 0.000001)) gene_module_df <- find_gene_modules(E9toE9.5_subset[pr_deg_ids,], resolution=0.00013)
organize gene modules
agg_mat <- aggregate_gene_expression(E9toE9.5_subset, gene_module_df) module_dendro <- hclust(dist(agg_mat)) gene_module_df$module <- factor(gene_module_df$module, levels = row.names(agg_mat)[module_dendro$order])
plot gene modules in UMAP
plot_cells(E9toE9.5_subset, cell_size = 0.4, genes=gene_module_df, label_cell_groups=FALSE, show_trajectory_graph=FALSE)