SCA-IRCM / SingleCellSignalR_v1

R package

relabelling seurat clusters #10

Open moeedakbar opened 4 years ago

moeedakbar commented 4 years ago

Hi

I have a Seurat object, and your package works great!

However, it only lets me find the cluster interactions when the clusters are numeric, e.g.
cluster = as.numeric(seuratobject@active.ident). The clusters then get renamed cluster 1, cluster 2, etc. But my clusters are labelled with the cell type. How do I keep those labels, i.e. "Tcells", "macrophages", etc.?

Many thanks, Moeed

SCA-IRCM commented 4 years ago

Hi, if I understand correctly, you can use the c.names argument, which is present in most SingleCellSignalR functions. cluster is a numeric vector that assigns each cell to a cluster number (from 1 to max(cluster)), and c.names holds the names of these clusters (so that length(c.names) = max(cluster)). Hope this helps; if not, don't hesitate to follow up on this issue.
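As a minimal sketch of the idea above: when the identities are stored as a factor of cell-type labels, the numeric codes and the matching names can both be taken from that factor. The `idents` vector here is made-up data standing in for `Idents(seuratobject)`; it is an assumption for illustration, not part of the package.

```r
# Made-up per-cell labels, standing in for Idents(seuratobject) (assumption).
idents <- factor(c("Tcells", "macrophages", "Tcells", "Bcells", "macrophages"))

# One integer code per cell, running from 1 to nlevels(idents).
cluster <- as.numeric(idents)

# Level i is the label of code i, so the levels are already in c.names order.
c.names <- levels(idents)

# The constraint stated in the reply above.
stopifnot(length(c.names) == max(cluster))

# Sanity check: looking up each cell's code recovers its original label.
stopifnot(identical(c.names[cluster], as.character(idents)))
```

Because the names come from `levels()`, they stay aligned with the codes regardless of how R sorts the labels.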

Thanks for using SingleCellSignalR,

SCA

benjytan88 commented 4 years ago

Hi @SCA-IRCM, I'm a bit confused about how I should specify c.names, especially the order. When I define cluster using cluster <- as.numeric(Idents(so)) and check it with unique(cluster), the output is as below:

unique(cluster)
 [1]  1  8  2 14 15  3 11 13 12  7  6  9 10  5 16  4

So when I prepare the c.names, how should I specify the order? Should I go numerically from 1 to max or follow the unique(cluster) arrangement?

SCA-IRCM commented 4 years ago

Hi, the c.names argument must be ordered from 1 to max(cluster), meaning that the first element of c.names is the name of the first cluster (cluster = 1), the second element is the name of the second cluster (cluster = 2), and so on. Hope that's clear.
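To illustrate the ordering rule, here is a small sketch with made-up data. `unique(cluster)` lists clusters in order of first appearance, which is not the order c.names needs; indexing a lookup table by 1..max(cluster) gives the right order. The `labels` lookup is a hypothetical annotation introduced only for this example.

```r
# Made-up cluster assignments for six cells (assumption).
cluster <- c(1, 3, 2, 3, 1, 2)

# Order of first appearance, NOT the order c.names must follow.
first_seen <- unique(cluster)

# Hypothetical annotation, keyed by cluster number and written in any order.
labels <- c("3" = "Monocytes", "1" = "T cells", "2" = "NK cells")

# Position k of c.names must name cluster k, so index by 1..max(cluster).
c.names <- unname(labels[as.character(seq_len(max(cluster)))])

print(c.names)
stopifnot(length(c.names) == max(cluster))
```

Keying the lookup by cluster number means the annotation can be written down in whatever order is convenient.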

Thanks for using SingleCellSignalR.

SCA

benjytan88 commented 4 years ago

Hi, thank you for your prompt reply.

Meaning that if my clusters are as such:

cluster 0 = T cells
cluster 5 = B cells
cluster 4 = Monocytes
cluster 1 = NK cells
cluster 2 = Proliferating cells
cluster 3 = Unknown

my c.names should be c("T cells", "NK cells", "Proliferating cells", "Unknown", "Monocytes", "B cells"). The order of the cluster numbers doesn't matter here, am I right? And is it OK to have duplicate cluster names? If not, how can I handle duplicate names, given that some cell types are made up of several Seurat clusters?

In addition, is it OK if some numbers are missing from the clusters? For example, I had 22 Seurat clusters, but I subsetted them to remove unwanted cells, so some cluster numbers are now missing and I am left with 15 clusters. Will that affect the numbering?

SCA-IRCM commented 4 years ago

Hi, I confirm that c.names should be c("T cells", "NK cells", "Proliferating cells", "Unknown", "Monocytes", "B cells"). However, it is not OK to have duplicates in the cluster names; I advise you to number the repeated names (for example, T-cells.1, T-cells.2, etc.). You also need to renumber your cluster vector to avoid gaps. I use this type of code:

cluster <- c(1,3,3,2,3,3,2,1,1,5,5,1,5,2,3,5) # there is no cluster 4
# shift every cluster number above the gap down by one
cluster[cluster > 4] <- cluster[cluster > 4] - 1

And so on, so that length(c.names)==max(cluster).
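When several numbers are missing, shifting them one gap at a time gets tedious. The sketch below is one possible base-R alternative, not the maintainers' own method: `match()` against the sorted unique values renumbers every cluster to 1..k in one step, and `make.unique()` adds suffixes to repeated names. The example names are made up.

```r
# Cluster vector with a gap (no cluster 4), as in the reply above.
cluster <- c(1, 3, 3, 2, 3, 3, 2, 1, 1, 5, 5, 1, 5, 2, 3, 5)

# Map each value to its rank among the values present: 1, 2, 3, 5 -> 1, 2, 3, 4.
compact <- match(cluster, sort(unique(cluster)))

# Made-up names containing a duplicate; make.unique() appends .1, .2, ...
raw.names <- c("T cells", "T cells", "Monocytes", "B cells")
c.names <- make.unique(raw.names)

print(compact)
print(c.names)
stopifnot(length(c.names) == max(compact))
```

If the identities are a factor that was subsetted, calling `droplevels()` on it before `as.numeric()` should have the same gap-closing effect, since unused levels are what leave holes in the codes.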