Closed sinvaya closed 4 years ago
This issue has been brought up on the Seurat issues page as well, where a user there has posted a script that will work (although I personally had to make some modifications to get it to work for my purposes, namely changing the cluster source and carrying over the RNA assay matrix for differential analysis rather than the integrated one, as the UMAP is preserved either way).
Hi Robert,
I am also having some issues with the script
Would you be able to share the modifications to it you made?
Thanks for any help!
So it would seem that there's a major issue with porting Seurat objects into Monocle, namely that the integration anchor data takes the place of PCA loadings, which causes issues downstream with pseudotime analysis as graphtest only has access to PCA loadings. Therefore I would suggest that you NOT use integration with monocle at this time. You might choose to try the Align function in Monocle which hypothetically serves a similar purpose, or to just cluster in seurat which, in my case gave me a cleaner result (although in my case I'm examining transcriptional change with the addition of certain chemicals-something align/integrate seemed to muddle somewhat).
Here's what I ran with some personal inputs expunged, you're going to need to find the resolution variables and whatnot yourself. I've still used the first steps as I prefer seurat's filtering features, but I CAN NOT RECOMMEND using the partition onward. Frankly, I've forgotten what most of the changes I made are and why, so take caution when using this.)
# ### Re-dimension reduction for 3D rendering
#
# if (Dim = "3D") {
#
# print ("Running UMAP 3D")
#
# seurat <- RunUMAP(object = seurat, reduction = "pca", dims = 1:20, n.components = 3)
#
# print("Clustering 3D")
#
# seurat <- FindNeighbors(object=seurat, dims=1:20)
# seurat <- FindClusters(object=seurat, resolution=0.5)
# seurat[[sprintf("ClusterNames_%.1f_%dPC", 0.5, 20)]] <- Idents(object = seurat)
#
# }
#
### Building the necessary parts for a basic cds
# part one, gene annotations
seurat <- seuratobj
library(Seurat)
library(monocle3)
library(htmlwidgets)
seurat <- readRDS("WHEREVER/YOU/KEEP/YOURS")
gene_annotation <- as.data.frame(rownames(seurat@reductions[["pca"]]@feature.loadings), row.names = rownames(seurat@reductions[["pca"]]@feature.loadings))
colnames(gene_annotation) <- "gene_short_name"
# part two, cell information
cell_metadata <- as.data.frame(seurat@assays[["RNA"]]@counts@Dimnames[[2]], row.names = seurat@assays[["RNA"]]@counts@Dimnames[[2]])
colnames(cell_metadata) <- "barcode"
# part three, counts sparse matrix
New_matrix <- seurat@assays[["RNA"]]@counts
New_matrix <- New_matrix[rownames(seurat@reductions[["pca"]]@feature.loadings), ]
expression_matrix <- New_matrix
### Construct the basic cds object
cds_from_seurat <- new_cell_data_set(expression_matrix,
cell_metadata = cell_metadata,
gene_metadata = gene_annotation)
### Construct and assign the made up partition
###### I DO NOT ADVISE THIS
recreate.partition <- c(rep(1, length(cds_from_seurat@colData@rownames)))
names(recreate.partition) <- cds_from_seurat@colData@rownames
recreate.partition <- as.factor(recreate.partition)
cds_from_seurat@clusters@listData[["UMAP"]][["partitions"]] <- recreate.partition
### Assign the cluster info
list_cluster <- seurat@meta.data[[sprintf("seurat_clusters")]]
#list_cluster <- seurat@meta.data[[sprintf("ClusterNames_%s_%sPC", 0.5, 20)]]
names(list_cluster) <- seurat@assays[["RNA"]]@data@Dimnames[[2]]
cds_from_seurat@clusters@listData[["UMAP"]][["clusters"]] <- list_cluster
### Could be a space-holder, but essentially fills out louvain parameters
cds_from_seurat@clusters@listData[["UMAP"]][["louvain_res"]] <- "NA"
### Assign UMAP coordinate
cds_from_seurat@reducedDims@listData[["UMAP"]] <-seurat@reductions[["umap"]]@cell.embeddings
### Assign feature loading for downstream module analysis
cds_from_seurat@preprocess_aux$gene_loadings <- seurat@reductions[["pca"]]@feature.loadings
### Learn graph, this step usually takes a significant period of time for larger samples
print("Learning graph, which can take a while depends on the sample")
cds_from_seurat <- learn_graph(cds_from_seurat, use_partition = T)
plot_cells(cds_from_seurat)
####Here I chose to save the gene_metadata, rds, etc.`
Great, thanks for the help!
Best,
Joshua
Hello,
This issue appears to be an issue about our new package, Monocle 3. Please post any issues for Monocle 3 to the monocle3 repository at https://github.com/cole-trapnell-lab/monocle3.
Monocle 3 Development team
So it would seem that there's a major issue with porting Seurat objects into Monocle, namely that the integration anchor data takes the place of PCA loadings, which causes issues downstream with pseudotime analysis as graphtest only has access to PCA loadings. Therefore I would suggest that you NOT use integration with monocle at this time. You might choose to try the Align function in Monocle which hypothetically serves a similar purpose, or to just cluster in seurat which, in my case gave me a cleaner result (although in my case I'm examining transcriptional change with the addition of certain chemicals-something align/integrate seemed to muddle somewhat).
Here's what I ran with some personal inputs expunged, you're going to need to find the resolution variables and whatnot yourself. I've still used the first steps as I prefer seurat's filtering features, but I CAN NOT RECOMMEND using the partition onward. Frankly, I've forgotten what most of the changes I made are and why, so take caution when using this.)
# ### Re-dimension reduction for 3D rendering # # if (Dim = "3D") { # # print ("Running UMAP 3D") # # seurat <- RunUMAP(object = seurat, reduction = "pca", dims = 1:20, n.components = 3) # # print("Clustering 3D") # # seurat <- FindNeighbors(object=seurat, dims=1:20) # seurat <- FindClusters(object=seurat, resolution=0.5) # seurat[[sprintf("ClusterNames_%.1f_%dPC", 0.5, 20)]] <- Idents(object = seurat) # # } # ### Building the necessary parts for a basic cds # part one, gene annotations seurat <- seuratobj library(Seurat) library(monocle3) library(htmlwidgets) seurat <- readRDS("WHEREVER/YOU/KEEP/YOURS") gene_annotation <- as.data.frame(rownames(seurat@reductions[["pca"]]@feature.loadings), row.names = rownames(seurat@reductions[["pca"]]@feature.loadings)) colnames(gene_annotation) <- "gene_short_name" # part two, cell information cell_metadata <- as.data.frame(seurat@assays[["RNA"]]@counts@Dimnames[[2]], row.names = seurat@assays[["RNA"]]@counts@Dimnames[[2]]) colnames(cell_metadata) <- "barcode" # part three, counts sparse matrix New_matrix <- seurat@assays[["RNA"]]@counts New_matrix <- New_matrix[rownames(seurat@reductions[["pca"]]@feature.loadings), ] expression_matrix <- New_matrix ### Construct the basic cds object cds_from_seurat <- new_cell_data_set(expression_matrix, cell_metadata = cell_metadata, gene_metadata = gene_annotation) ### Construct and assign the made up partition ###### I DO NOT ADVISE THIS recreate.partition <- c(rep(1, length(cds_from_seurat@colData@rownames))) names(recreate.partition) <- cds_from_seurat@colData@rownames recreate.partition <- as.factor(recreate.partition) cds_from_seurat@clusters@listData[["UMAP"]][["partitions"]] <- recreate.partition ### Assign the cluster info list_cluster <- seurat@meta.data[[sprintf("seurat_clusters")]] #list_cluster <- seurat@meta.data[[sprintf("ClusterNames_%s_%sPC", 0.5, 20)]] names(list_cluster) <- seurat@assays[["RNA"]]@data@Dimnames[[2]] cds_from_seurat@clusters@listData[["UMAP"]][["clusters"]] <- list_cluster ### Could be a space-holder, but essentially fills out louvain parameters cds_from_seurat@clusters@listData[["UMAP"]][["louvain_res"]] <- "NA" ### Assign UMAP coordinate cds_from_seurat@reducedDims@listData[["UMAP"]] <-seurat@reductions[["umap"]]@cell.embeddings ### Assign feature loading for downstream module analysis cds_from_seurat@preprocess_aux$gene_loadings <- seurat@reductions[["pca"]]@feature.loadings ### Learn graph, this step usually takes a significant period of time for larger samples print("Learning graph, which can take a while depends on the sample") cds_from_seurat <- learn_graph(cds_from_seurat, use_partition = T) plot_cells(cds_from_seurat) ####Here I chose to save the gene_metadata, rds, etc.`
Hello, I followed your script and I still get the error: Error: No dimensionality reduction for UMAP calculated. Please run reduce_dimensions with reduction_method = UMAP and cluster_cells before running learn_graph.
Any idea how to solve this? It seems like it doesn't recognize the dimensionality reduction from Seurat in order to create trajectories..or this is what I understand.
Thank you in advance!
This has worked for me - good luck!
#### Create a Monocle CDS Object
# Project PC dimensions to whole data set
my.so <- ProjectDim(my.so, reduction = "pca")
# Create an expression matrix
expression_matrix <- my.so@assays$RNA@counts
# Get cell metadata
cell_metadata <- my.so@meta.data
if (all.equal(colnames(expression_matrix), rownames(cell_metadata))) {
print(sprintf("Cell identifiers match"))
} else {
print(sprintf("Cell identifier mismatch - %i cells in expression matrix, %i cells in metadata",
ncol(expression_matrix), nrow(cell_metadata)))
print("If the counts are equal, sort differences will throw this error")
}
# get gene annotations
gene_annotation <- data.frame(gene_short_name = rownames(my.so@assays$RNA), row.names = rownames(my.so@assays$RNA))
if (all.equal(rownames(expression_matrix), rownames(gene_annotation))) {
print(sprintf("Gene identifiers all match"))
} else {
print(sprintf("Gene identifier mismatch - %i genes in expression matrix, %i gene in gene annotation",
nrow(expression_matrix), nrow(gene_annotation)))
print("If the counts are equal, sort differences will throw this error")
}
# Seurat-derived CDS
my.cds <- new_cell_data_set(expression_matrix,
cell_metadata = cell_metadata,
gene_metadata = gene_annotation)
# Transfer Seurat embeddings
# Note that these may be calculated on the Integrated object, not the counts
# and thus will involve fewer genes
reducedDim(my.cds, type = "PCA") <- my.so@reductions$pca@cell.embeddings
my.cds@preprocess_aux$prop_var_expl <- my.so@reductions$pca@stdev
plot_pc_variance_explained(my.cds)
# Transfer Seurat UMAP embeddings
my.cds@int_colData@listData$reducedDims$UMAP <- my.so@reductions$umap@cell.embeddings
# plot_cells(my.cds)
# Copy cluster info from Seurat
my.cds@clusters$UMAP_so$clusters <- my.so@meta.data$gt_tp_cell_type_integrated_.0.9
my.cds <- cluster_cells(my.cds, reduction_method = "UMAP", resolution = 1e-3)
# Fix from https://gitmemory.com/cole-trapnell-lab
rownames(my.cds@principal_graph_aux$UMAP$dp_mst) <- NULL
colnames(my.cds@int_colData@listData$reducedDims$UMAP) <- NULL
DimPlot(so, reduction = "umap")
DimPlot(my.so, reduction = "umap")
plot_cells(my.cds, color_cells_by = "partition", group_label_size = 3.5)
plot_cells(my.cds, color_cells_by = cellids, show_trajectory_graph = FALSE, group_label_size = 3.5)
Added a row/column name clean up so that you can run pseudo time... Apologies - it was strung across two scripts.
Hi,
We want to use monocle3 for pseudotime analyze. But it generate a totally different UMAP than Seurat and it split into too many clusters. Monocle3 generates pseudotime based on UMAP. Can we use the UMAP of Seurat?
Thanks!