cole-trapnell-lab / monocle-release

276 stars 116 forks source link

plot_genes_in_pseudotime query #168

Open julalipa opened 6 years ago

julalipa commented 6 years ago

Hi, I am using Monocle to look at differential gene expression across a pseudotime trajectory. I have successfully ordered my cells and I am now trying to use plot_genes_in_pseudotime as well as plot_pseudotime_heatmap on the top differential genes.

My question and concern is that sometime the smooth expression curves that are generated do not really fit the data. Please see example below, where one gene looks fine (mt-Cytb) but the other spline (Ighm) is slightly puzzling to me. I have tried to change the df in the trend_formula argument but this does not really improve the result.

Are the smooth expression curves generated in plot_genes_in_pseudotime the same as those used in plot_pseudotime_heatmap? If so, how can I avoid this type of curve fitting? Have I missed something in my workflow?

Any help would be greatly appreciated! :) Thanks a lot!

image

Code:

monocle <- importCDS(seurat.object, import_all = TRUE) monocle <- estimateSizeFactors(monocle) monocle <- suppressWarnings(estimateDispersions(monocle)) monocle <- detectGenes(monocle, min_expr = 0.1) expressed_genes <- row.names(subset(fData(monocle), num_cells_expressed >= 10)) diff_test_res <- differentialGeneTest(monocle[expressed_genes,], fullModelFormulaStr = "~orig.ident") ordering_genes <- row.names (subset(diff_test_res, qval < 0.01)) monocle <- setOrderingFilter(monocle, ordering_genes) monocle <- reduceDimension(monocle, max_components=2, method = 'DDRTree') monocle <- orderCells(monocle) plot_cell_trajectory(monocle, color_by = "orig.ident") plot_cell_trajectory(monocle, color_by = "State")

GM_state <- function(cds){ if (length(unique(pData(cds)$State)) > 1){ T0_counts <- table(pData(cds)$State, pData(cds)$orig.ident)[,”D0”] return(as.numeric(names(T0_counts)[which (T0_counts == max(T0_counts))])) } else { return (1) } }

monocle <- orderCells(monocle, root_state = GM_state(monocle)) plot_cell_trajectory(monocle, color_by = "Pseudotime")

diff_test_res <- differentialGeneTest(monocle[expressed_genes,], fullModelFormulaStr = "~sm.ns(Pseudotime, df=3)") sig_gene_names <- rownames(diff_test_res[head(order(diff_test_res$qval),100),]) plot_genes_in_pseudotime(monocle[sig_gene_names[3:4],], ncol=2)

Xiaojieqiu commented 6 years ago

Hi @julalipa it seems like the Ighm has very low gene expression (see the vertical axis--- it ranges from 1e-35 to 1e-2). For very low gene expression like this, the spline line fitting may just break. I guess you need to filter genes based on mean gene expression to look for reasonable expression trend.