satijalab / seurat

R toolkit for single cell genomics
http://www.satijalab.org/seurat

Which metrics can show that a WNN clustering result is better? #7791

Closed lvmt closed 1 year ago

lvmt commented 1 year ago

Hi, the WNN method is a good idea.

I'm currently working through the Seurat tutorial on the WNN method. As the tutorial says, WNN can sometimes give better clustering, and I'd like to know whether there are metrics that can support this.

As far as I know, the silhouette score is one such metric when the data is unlabeled, and Seurat can support that analysis.

However, its upstream input is a PCA result, whereas in WNN analysis each modality runs PCA independently, so I don't know how to get silhouette scores for multimodal single-cell data.

I'd appreciate your help. Thanks again.

rsatija commented 1 year ago

WNN analysis computes a weighted nearest neighbor graph. In principle, you could use graph-based distance metrics to compute silhouette scores, but we don't support this natively.

In the WNN paper (Hao et al., Cell 2021), we do propose a series of metrics, especially for CITE-seq data, that can help to show clustering improvements based on a weighted combination of two modalities. However, while benchmarking approaches are valuable and needed, for any individual dataset they cannot fully replace biological analysis and interpretation.
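
For reference, the silhouette width mentioned above depends only on pairwise dissimilarities, which is why a graph-derived distance could in principle stand in for a PCA-based one. The standard definition is sketched below; the symbols $d$, $C_i$, $a(i)$, $b(i)$ and $s(i)$ are the usual textbook notation and are not taken from the thread.

```latex
a(i) = \frac{1}{|C_i| - 1} \sum_{j \in C_i,\, j \neq i} d(i, j), \qquad
b(i) = \min_{C \neq C_i} \frac{1}{|C|} \sum_{j \in C} d(i, j), \qquad
s(i) = \frac{b(i) - a(i)}{\max\{a(i),\, b(i)\}}
```

Here $C_i$ is the cluster containing cell $i$ and $d(i, j)$ is the dissimilarity between cells $i$ and $j$. The score lies in $[-1, 1]$ and is negative exactly when $a(i) > b(i)$, that is, when a cell is on average closer to another cluster than to its own. The definition needs $d(i, j)$ for every pair of cells, not only for nearest neighbors.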

nandini0910 commented 7 months ago

Hello @rsatija! I want to use graph-based distance metrics to compute silhouette scores for my WNN analysis. After running FindMultiModalNeighbors(), I used Distances() to get the matrix. To calculate the silhouette score, I converted the matrix to a square matrix. However, the silhouette scores are all negative, which should not be the case. How would you recommend going about this?

Your advice would be greatly appreciated.

## Multi Modal
set.seed(42)
standard_WNN <- FindMultiModalNeighbors(standard_WNN, reduction.list = list("pca", "apca"), dims.list = list(1:25, 1:25), modality.weight.name = "RNA.weight", return.intermediate = TRUE)

## Find distance matrix from seurat object
distance_matrix <- Distances(standard_WNN[["weighted.nn"]])

## convert the matrix to a square matrix
n_elements <- nrow(distance_matrix)
distance_matrix_square <- matrix(0, n_elements, n_elements)
upper_tri_indices <- upper.tri(distance_matrix_square)
distance_matrix_square[upper_tri_indices] <- distance_matrix
distance_matrix_square <- t(distance_matrix_square)
distance_matrix_square[upper_tri_indices] <- distance_matrix
dim(distance_matrix_square)
length(distance_matrix_square)
distance_matrix <- distance_matrix_square

## Calculate the score
Idents(standard_WNN) <- 'seurat_clusters'
clusters <- standard_WNN@meta.data$seurat_clusters
silhouette <- cluster::silhouette(as.numeric(clusters), dist = distance_matrix)
standard_WNN@meta.data$silhouette_score <- silhouette[,3]
mean_silhouette_score <- mean(standard_WNN@meta.data$silhouette_score)
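
One possible cause, offered as an assumption and not as a confirmed diagnosis: a nearest-neighbor object stores one row per cell and one column per neighbor, so its distance matrix is n x k and cannot be poured directly into the upper triangle of an n x n matrix. The sketch below shows one way to expand k-nearest-neighbor indices and distances into a dense, symmetric dissimilarity matrix before calling `cluster::silhouette()`. The helper `knn_to_dense()` is hypothetical and not part of Seurat, and filling non-neighbor pairs with the largest observed neighbor distance is a crude assumption made only so that every pair has a value. It runs on toy data so that it is self-contained.

```r
library(cluster)  # provides silhouette()

# Hypothetical helper (not part of Seurat): expand k-nearest-neighbor output
# into a dense, symmetric cell-by-cell dissimilarity matrix.
#   idx : n x k integer matrix; idx[i, j] is the row index of the j-th neighbor of cell i
#   dst : n x k numeric matrix; dst[i, j] is the distance from cell i to that neighbor
# Pairs that are not neighbors have no recorded distance. As an assumption they
# are set to `fill` (default: the largest observed neighbor distance).
knn_to_dense <- function(idx, dst, fill = max(dst)) {
  n <- nrow(idx)
  d <- matrix(fill, n, n)
  # as.vector() unrolls column by column, so the row index repeats 1..n for each column
  rows <- rep(seq_len(n), times = ncol(idx))
  d[cbind(rows, as.vector(idx))] <- as.vector(dst)
  # Symmetrize: a distance recorded in either direction wins over the fill value
  d <- pmin(d, t(d))
  diag(d) <- 0
  d
}

set.seed(42)
# Toy data: two well-separated groups of 20 points each
x <- rbind(matrix(rnorm(40, mean = 0), ncol = 2),
           matrix(rnorm(40, mean = 10), ncol = 2))
clusters <- rep(1:2, each = 20)

# Build k-nearest-neighbor index and distance matrices from the full distances
k <- 5
full <- as.matrix(dist(x))
idx <- t(apply(full, 1, function(r) order(r)[2:(k + 1)]))  # position 1 is the cell itself
dst <- t(sapply(seq_len(nrow(full)), function(i) full[i, idx[i, ]]))

d <- knn_to_dense(idx, dst)
sil <- silhouette(clusters, dmatrix = d)
mean_sil <- mean(sil[, "sil_width"])

# With a Seurat object, the inputs would presumably come from the Neighbor object:
#   nn  <- standard_WNN[["weighted.nn"]]
#   idx <- Indices(nn)
#   dst <- Distances(nn)
```

Because every non-neighbor pair receives the same fill value, the resulting scores mostly reflect how many of a cell's neighbors fall in its own cluster, so they are best read as a rough diagnostic. The dense matrix also needs memory proportional to n squared, which may be impractical for large datasets without subsampling.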