cole-trapnell-lab / monocle3

Problem with UMAP clustering #472

Open earleya9 opened 3 years ago

earleya9 commented 3 years ago

The UMAP I get, colored by cell type, looks different from the one in the "Constructing single-cell trajectories" vignette. Can you please let me know how I can fix this?

This is my info:

R version 3.6.2 (2019-12-12)

library(monocle3)
Loading required package: Biobase
Loading required package: BiocGenerics
Loading required package: parallel

Attaching package: ‘BiocGenerics’

The following objects are masked from ‘package:parallel’:
    clusterApply, clusterApplyLB, clusterCall, clusterEvalQ, clusterExport, clusterMap, parApply, parCapply, parLapply, parLapplyLB, parRapply, parSapply, parSapplyLB

The following objects are masked from ‘package:stats’:
    IQR, mad, sd, var, xtabs

The following objects are masked from ‘package:base’:
    anyDuplicated, append, as.data.frame, basename, cbind, colnames, dirname, do.call, duplicated, eval, evalq, Filter, Find, get, grep, grepl, intersect, is.unsorted, lapply, Map, mapply, match, mget, order, paste, pmax, pmax.int, pmin, pmin.int, Position, rank, rbind, Reduce, rownames, sapply, setdiff, sort, table, tapply, union, unique, unsplit, which, which.max, which.min

Welcome to Bioconductor
Vignettes contain introductory material; view with 'browseVignettes()'. To cite Bioconductor, see 'citation("Biobase")', and for packages 'citation("pkgname")'.

Loading required package: SingleCellExperiment
Loading required package: SummarizedExperiment
Loading required package: GenomicRanges
Loading required package: stats4
Loading required package: S4Vectors
package ‘S4Vectors’ was built under R version 3.6.3

Attaching package: ‘S4Vectors’

The following object is masked from ‘package:base’:
    expand.grid

Loading required package: IRanges
Loading required package: GenomeInfoDb
package ‘GenomeInfoDb’ was built under R version 3.6.3
Loading required package: DelayedArray
package ‘DelayedArray’ was built under R version 3.6.3
Loading required package: matrixStats

Attaching package: ‘matrixStats’

The following objects are masked from ‘package:Biobase’:
    anyMissing, rowMedians

Loading required package: BiocParallel

Attaching package: ‘DelayedArray’

The following objects are masked from ‘package:matrixStats’:
    colMaxs, colMins, colRanges, rowMaxs, rowMins, rowRanges

The following objects are masked from ‘package:base’:
    aperm, apply, rowsum

Registered S3 method overwritten by 'dplyr':
  method           from
  print.rowwise_df

Attaching package: ‘monocle3’

The following objects are masked from ‘package:Biobase’:
    exprs, fData, fData<-, pData, pData<-

expression_matrix <- readRDS(url("http://staff.washington.edu/hpliner/data/packer_embryo_expression.rds"))
cell_metadata <- readRDS(url("http://staff.washington.edu/hpliner/data/packer_embryo_colData.rds"))
gene_annotation <- readRDS(url("http://staff.washington.edu/hpliner/data/packer_embryo_rowData.rds"))

cds <- new_cell_data_set(expression_matrix,
                         cell_metadata = cell_metadata,
                         gene_metadata = gene_annotation)

cds <- preprocess_cds(cds, num_dim = 50)

cds <- align_cds(cds, alignment_group = "batch", residual_model_formula_str = "~ bg.300.loading + bg.400.loading + bg.500.1.loading + bg.500.2.loading + bg.r17.loading + bg.b01.loading + bg.b02.loading")
Aligning cells from different batches using Batchelor.
Please remember to cite:
    Haghverdi L, Lun ATL, Morgan MD, Marioni JC (2018). 'Batch effects in single-cell RNA-sequencing data are corrected by matching mutual nearest neighbors.' Nat. Biotechnol., 36(5), 421-427. doi: 10.1038/nbt.4091

cds <- reduce_dimension(cds)
No preprocess_method specified, and aligned coordinates have been computed previously. Using preprocess_method = 'Aligned'

plot_cells(cds, label_groups_by_cluster=FALSE, color_cells_by = "cell.type")
No trajectory to plot. Has learn_graph() been called yet?

[Image: Screen Shot 2021-01-25 at 2 06 43 PM]

brgew commented 3 years ago

Hi,

I believe the uwot::umap function that Monocle3 uses is not deterministic, so its output can vary with several factors, including the uwot version, the parameters passed to the umap call, and possibly the random number generator in use. Even so, I would expect most runs to look broadly similar. Does this help?
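
One way to check whether run-to-run variation explains the difference is to fix R's random seed and run UMAP single-threaded. Below is a minimal sketch that calls uwot directly on the built-in iris data rather than on the Packer embryo data. The helper name run_umap and the seed values 42 and 7 are arbitrary choices for illustration and are not part of Monocle3 or uwot.

```r
library(uwot)

# Hypothetical helper (not part of uwot or Monocle3): seed R's RNG,
# then run UMAP with a single thread for both the neighbor search
# and the SGD optimization.
run_umap <- function(x, seed = 42) {
  set.seed(seed)
  uwot::umap(x, n_neighbors = 15, min_dist = 0.1,
             n_threads = 1, n_sgd_threads = 0)
}

x <- as.matrix(iris[, 1:4])

emb_a <- run_umap(x)            # first run, seed 42
emb_b <- run_umap(x)            # second run, same seed
emb_c <- run_umap(x, seed = 7)  # same data, different seed

# With the same seed and settings the two embeddings should agree.
same_seed_match <- isTRUE(all.equal(emb_a, emb_b))

# With a different seed the layout may shift although the data are identical.
diff_seed_match <- isTRUE(all.equal(emb_a, emb_c))

print(c(same_seed = same_seed_match, different_seed = diff_seed_match))
```

In Monocle3 the analogous step would be to call set.seed() immediately before reduce_dimension(). As far as I know, reduce_dimension() also accepts umap.fast_sgd and cores arguments, and keeping them at FALSE and 1 is the setting intended for repeatable output. Even with a fixed seed, a different uwot version can still produce a different layout, so an exact match with the vignette figure is not guaranteed.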