Open denvercal1234GitHub opened 1 year ago
Feel free to try that approach and assess whether it leads to any improvement. However, I suggest first examining the FlowSOM_cluster column, which gives the SOM node allocation for each cell. If you find that the cells in the orange cluster are assigned to only a small number of SOM nodes, it may be worth increasing the grid size to capture more variability. Alternatively, you can try a different algorithm such as run.phenograph to see if it is more effective.
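A minimal sketch of that check, using a toy table in place of cell.dat. The column names FlowSOM_cluster and FlowSOM_metacluster are assumed here; substitute whatever you passed as clust.name and meta.clust.name. The values are invented purely for illustration.

```r
# Toy stand-in for cell.dat: one row per cell (values are invented).
cell.toy <- data.frame(
  FlowSOM_cluster     = c(1, 1, 2, 2, 2, 3, 4, 5, 6, 6),
  FlowSOM_metacluster = c("A", "A", "A", "A", "A", "B", "B", "B", "B", "B")
)

# Number of distinct SOM nodes feeding each metacluster.
nodes.per.meta <- tapply(
  cell.toy$FlowSOM_cluster,
  cell.toy$FlowSOM_metacluster,
  function(x) length(unique(x))
)

# Number of cells in each metacluster, for context.
cells.per.meta <- table(cell.toy$FlowSOM_metacluster)

print(nodes.per.meta)
print(cells.per.meta)
```

A metacluster that holds many cells but draws on very few nodes is the situation where a larger grid may help.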
Thank you @ghar1821 for your input!
Would you mind giving me more guidance on how to increase the grid size?
To use run.phenograph, do I simply run it in place of run.flowsom on my cell.dat object, with nothing else changed? Is the k parameter the maximum number of clusters?
######### CASE 0
cell.dat <- Spectre::run.phenograph(cell.dat, use.cols = strict_NoManual, clust.name = "Phenograph_strict_NoManual", k=45)
If I run run.flowsom multiple times, each with a different meta.k and different markers used for clustering, would I only need to run.umap as below, and would that then allow me to visualise the result of different run.flowsom on the right cell.dat.sub object's UMAP as below?
#### CASE 1
cell.dat <- Spectre::run.flowsom(cell.dat, strict_NoManual, meta.k = 'auto', meta.clust.name= "FlowSOM_metacluster_backbonestrict_NoManualAuto", clust.name = "FlowSOM_cluster_backbonestrict_NoManualAuto")
#### CASE 2
cell.dat <- Spectre::run.flowsom(cell.dat, strict_NoManual, meta.k = 'auto', max.meta = 40, meta.clust.name= "FlowSOM_metacluster_backbonestrict_NoManualAutoMax40", clust.name = "FlowSOM_cluster_backbonestrict_NoManualAutoMax40")
#### CASE 3
cell.dat <- Spectre::run.flowsom(cell.dat, relax_NoManual, meta.k = 'auto', meta.clust.name= "FlowSOM_metacluster_backbonerelax_NoManualAuto", clust.name = "FlowSOM_cluster_backbonerelax_NoManualAuto")
#### CASES 0, 1, AND 2
cell.dat.sub_strict_NoManual <- run.umap(cell.dat.sub, use.cols=strict_NoManual)
##### CASES 3
cell.dat.sub_relax_NoManual <- run.umap(cell.dat.sub, use.cols=relax_NoManual)
###### VISUALIZE CASE 0
make.colour.plot(cell.dat.sub_backbone, "UMAP_X", "UMAP_Y", col.axis = "Phenograph_strict_NoManual", col.type = 'factor')
###### VISUALIZE CASE 1
make.colour.plot(cell.dat.sub_backbone, "UMAP_X", "UMAP_Y", col.axis = "FlowSOM_metacluster_backbonestrict_NoManualAuto", col.type = 'factor')
###### VISUALIZE CASE 2
make.colour.plot(cell.dat.sub_backbone, "UMAP_X", "UMAP_Y", col.axis = "FlowSOM_metacluster_backbonestrict_NoManualAutoMax40", col.type = 'factor')
###### VISUALIZE CASE 3
make.colour.plot(cell.dat.sub_backbone, "UMAP_X", "UMAP_Y", col.axis = "FlowSOM_metacluster_backbonerelax_NoManualAuto", col.type = 'factor')
You can increase the grid size by increasing the xdim and ydim parameters. Do note, though, that the size of the grid used by FlowSOM is xdim * ydim. Hence, adding even just 1 to each of xdim and ydim adds xdim + ydim + 1 nodes, which can increase the grid size substantially.
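To make the growth concrete, here is a small sketch in base R. The helper grid.nodes and the wrapper run.flowsom.with.grid are hypothetical names introduced for illustration; the wrapper is only defined, not called, and assumes run.flowsom accepts xdim and ydim as described above. Check ?run.flowsom for the defaults on your version.

```r
# Number of SOM nodes for a grid (square by default).
grid.nodes <- function(xdim, ydim = xdim) xdim * ydim

sides <- c(10, 14, 15, 20)
print(data.frame(xdim = sides, ydim = sides, nodes = grid.nodes(sides)))

# Going from 14 x 14 to 15 x 15 adds 14 + 14 + 1 = 29 nodes.
print(grid.nodes(15) - grid.nodes(14))

# Hypothetical wrapper (defined only): re-run FlowSOM on a square grid of the
# given side, writing to column names that record the grid size.
run.flowsom.with.grid <- function(dat, markers, side) {
  Spectre::run.flowsom(
    dat, markers,
    xdim = side, ydim = side, meta.k = "auto",
    clust.name = paste0("FlowSOM_cluster_grid", side),
    meta.clust.name = paste0("FlowSOM_metacluster_grid", side)
  )
}
```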
The k parameter in phenograph does not represent the number of clusters. It affects the resolution of the clustering: the smaller the k, the smaller the clusters you will get (each cluster will contain fewer cells). Hence, if you want more clusters, reduce k; if you want fewer, increase it. Every dataset is different, and there is no single k value that works for all of them. I suggest you experiment with various values, perhaps starting at 5 (5 has tended to work well for me in the past).
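One way to run that experiment is to sweep a few k values and count the clusters each one yields. The helper sweep.k below is a hypothetical sketch, not part of Spectre; it assumes run.phenograph takes use.cols, clust.name and k as in the call earlier in this thread, and that it returns the data with the new cluster column added.

```r
# Hypothetical helper: cluster once per k and report how many clusters result.
# cluster.fun defaults to Spectre::run.phenograph but is only looked up when
# the function is actually called with that default.
sweep.k <- function(dat, markers, ks, cluster.fun = Spectre::run.phenograph) {
  n.clusters <- vapply(ks, function(k) {
    name <- paste0("Phenograph_k", k)
    res <- cluster.fun(dat, use.cols = markers, clust.name = name, k = k)
    length(unique(res[[name]]))
  }, integer(1))
  names(n.clusters) <- paste0("k", ks)
  n.clusters
}

# Example call (not run here; needs Spectre and your own data):
# sweep.k(cell.dat, strict_NoManual, ks = c(5, 15, 30, 45))
```

Phenograph can be slow on large datasets, so it may be worth sweeping on a subsample first.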
I'm not quite sure what you mean by "visualise the result of different run.flowsom on the right cell.dat.sub object's UMAP as below?". The code you wrote doesn't make sense to me, as I don't know what cell.dat.sub_backbone is, and you don't seem to use either cell.dat.sub_strict_NoManual or cell.dat.sub_relax_NoManual in make.colour.plot.
Thank you @ghar1821 for your response!
For question 3, basically I have 3 sets of markers (backbone markers, strict_markers, and relax_markers) that I want to use for clustering my cell.dat object. So, I first run run.flowsom on my cell.dat object once for each of these marker sets, with corresponding names for meta.clust.name and clust.name for each set.
My question 3 was then: how should I use run.umap to visualise the clusters from these 3 runs? Do I need to call run.umap 3 times, each time with the respective use.cols?
Oh, I see what you mean now. I suppose that is one way of doing it: repeat run.flowsom and run.umap 3 times, each with a different set of markers. If you do this, though, bear in mind that the UMAP plots will look different, as the coordinates are calculated from different sets of markers.
I guess my next question is: what are you trying to find from these UMAPs? Are you trying to compare how the clusters differ when given different sets of markers? If that is the case, it may make more sense to run UMAP once and visualise the clusters 3 times (1 colour plot per set of markers). If you do this, you will have to decide which markers to feed into the UMAP so that it best visualises all 3 results. Perhaps a combination of all 3 sets of markers? Or maybe just the markers common to all 3 sets.
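As a rough sketch of the "run UMAP once, plot three times" idea: the marker names below are invented placeholders, and umap_once_plot_each is a hypothetical helper that is defined but not called. It assumes the subsampled data already carries the cluster columns from each run.flowsom call, i.e. that the subsample was taken after clustering.

```r
# Hypothetical marker sets; replace with your own column names.
backbone_markers <- c("CD3", "CD4", "CD8")
strict_markers   <- c("CD3", "CD4", "CD8", "CD45RA")
relax_markers    <- c("CD3", "CD4", "CD8", "CD45RA", "CCR7", "CD27")

marker.sets <- list(backbone_markers, strict_markers, relax_markers)

# Option 1: every marker that appears in any of the sets.
umap.cols.union <- Reduce(union, marker.sets)

# Option 2: only the markers shared by all of the sets.
umap.cols.common <- Reduce(intersect, marker.sets)

print(umap.cols.union)
print(umap.cols.common)

# Hypothetical helper (defined only): compute one UMAP, then draw one colour
# plot per clustering column on the same coordinates.
umap_once_plot_each <- function(dat.sub, umap.cols, cluster.cols) {
  dat.sub <- Spectre::run.umap(dat.sub, use.cols = umap.cols)
  for (col in cluster.cols) {
    Spectre::make.colour.plot(dat.sub, "UMAP_X", "UMAP_Y",
                              col.axis = col, col.type = "factor")
  }
  invisible(dat.sub)
}
```

Because every plot shares the same UMAP_X and UMAP_Y, differences between the plots then reflect the clustering rather than the embedding.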
Hi @denvercal1234GitHub , we also have a workflow for 'multi-level' clustering (see Figure 4 here: https://onlinelibrary.wiley.com/doi/10.1002/cyto.a.24350). Essentially we do a first round of clustering to identify major groups of cells (e.g. CD4, CD8, B cells etc) and then on each lineage, we do another level to gain more detail (e.g. Naive, Central Memory, Effector Memory, etc). We don't have a script for this online, but I can send you what I use if it would be helpful?
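Until that script is shared, the general subset-then-recluster pattern might look something like the sketch below. This is not the script mentioned above: it uses toy data, and stats::kmeans stands in for run.flowsom so the example runs without Spectre. The column names Level1 and Level2 and the marker values are invented for illustration. With a real cell.dat, the same pattern should apply with run.flowsom called on the subset.

```r
set.seed(42)  # kmeans uses random starting centres

# Toy stand-in for cell.dat: two markers, with a first-level label already
# assigned to every cell (in practice, an annotated metacluster).
cell.toy <- data.frame(
  CD45RA = c(rnorm(50, 0), rnorm(50, 4), rnorm(50, 2)),
  CCR7   = c(rnorm(50, 0), rnorm(50, 4), rnorm(50, 2)),
  Level1 = rep(c("CD4", "CD4", "CD8"), each = 50)
)

# Level 2: subset one lineage and cluster only those cells.
# kmeans is a stand-in here for a second run.flowsom call.
cd4 <- cell.toy[cell.toy$Level1 == "CD4", ]
km <- kmeans(cd4[, c("CD45RA", "CCR7")], centers = 2)
cd4$Level2 <- paste0("CD4_", km$cluster)

# Write the finer labels back; lineages that were not sub-clustered keep
# their level-1 label.
cell.toy$Level2 <- cell.toy$Level1
cell.toy$Level2[cell.toy$Level1 == "CD4"] <- cd4$Level2

print(table(cell.toy$Level1, cell.toy$Level2))
```

Giving the second-level labels a lineage prefix keeps them distinct from any first-level cluster IDs when the results are merged back.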
Hi @tomashhurst -- That would be very useful if I could have the script for the multi-level clustering when you get a chance?
Also @ghar1821 and Thomas, in CATALYST (https://bioconductor.org/packages/release/bioc/vignettes/CATALYST/inst/doc/differential.html), they have a delta_area() function that can help determine the optimal number of clusters, and a plotNRS() function to help select the markers for clustering. Do we have anything similar, or would you mind letting me know how we might still be able to use these two functions in a Spectre workflow?
Thanks again very much!
Hi @tomashhurst - I hope all is well, and thanks for your help earlier. I wonder if you would not mind emailing me the script you mentioned for the multi-level clustering? quang.n.nguyen@alumni.duke.edu Thanks so much again!
Hi there,
Thanks for the tool!
From #161, run.flowsom seemed unable to break up the big orange cluster, but the UMAP seemed to suggest there are potentially more clusters within it. Increasing meta.k did not do it either.
I was wondering if you could give some advice on how to sub-cluster? Do I simply subset cell.dat and then run run.flowsom again directly?
Thanks again!