HelenaLC / CATALYST

Cytometry dATa anALYsis Tools
Clarification for arguments features and color_by in plotNRS() #355

Closed denvercal1234GitHub closed 1 year ago

denvercal1234GitHub commented 1 year ago

Hi there,

Thank you again for the package. It contains very useful QC functions!

I was hoping to get some clarification on the function plotNRS().

  1. When I use a subset of markers for clustering and set features = markers_used_for_clustering, I get NRS scores for those clustering markers. But if I cluster on a subset of markers and then set features = NULL, do the NRS scores that plotNRS() produces for the markers not used for clustering still represent those markers' contribution to the resulting clusters?

  2. When I set color_by = "cluster_id", plotNRS() produces NRS scores coloured by the cluster IDs of the 100 initial clusters. Would you mind giving me some suggestions on how to plot NRS scores for the clusters at the metaclustering level instead?

Thank you for your help!

HelenaLC commented 1 year ago

  1. The NRS does not tell you each marker's "contribution" to the clusters. Markers with higher scores explain a larger portion of the variability in a given sample. We assume that variability to be biologically meaningful, and hence recommend using high-NRS features for clustering. You can read more about this in the original paper here. The features = ... argument simply specifies which markers' NRS to compute and plot. Scores won't change, however, regardless of what you include or not (the cluster assignments are not being used and don't need to exist at this point!).
  2. All visualization functions in CATALYST have an argument k to specify which clustering to use. So you'd want to set k = "meta20", for example.
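
To make point 1 concrete, here is a minimal base-R sketch of a PCA-based non-redundancy score. It assumes the commonly described definition (each marker's absolute loadings on the first few principal components, weighted by the variance of those components); the helper name `nrs_sketch`, the marker names and the simulated data are hypothetical, and CATALYST's internal implementation may differ. The thing to notice is that cluster labels never enter the computation.

```r
# Hypothetical helper: PCA-based non-redundancy score for a single sample.
# x: cells x markers expression matrix; ncomp: number of leading PCs to use.
nrs_sketch <- function(x, ncomp = 3) {
  pca <- prcomp(x, center = TRUE, scale. = FALSE)
  k <- seq_len(min(ncomp, ncol(x)))
  loadings <- abs(pca$rotation[, k, drop = FALSE])
  variances <- pca$sdev[k]^2
  # Sum each marker's absolute loadings, weighted by the variance of each PC.
  drop(loadings %*% variances)
}

set.seed(1)
# Simulated sample: 500 cells, 4 markers with increasing spread.
x <- cbind(
  CD3  = rnorm(500, sd = 0.2),
  CD4  = rnorm(500, sd = 0.5),
  CD8  = rnorm(500, sd = 1.0),
  CD19 = rnorm(500, sd = 2.0)
)

# Only the expression matrix is needed; no clustering has been run.
scores_all <- nrs_sketch(x)
print(round(scores_all, 3))
```

In this toy sample the most variable marker receives the highest score, which matches the reading of NRS as "variability explained within a sample" rather than "contribution to clusters".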

denvercal1234GitHub commented 1 year ago

Thank you very much for your response! It is very useful. For Q2, just so I understand it correctly: k = "meta20" is available for all plotting functions except plotNRS(), because plotNRS() only has color_by and not k.

So if I want to plot NRS scores for a selected group of clusters after clustering, I will need to subset those clusters first, then run plotNRS() on the resulting subsetted sce object, right?

Thank you again for your help.

HelenaLC commented 1 year ago

Yes, that’s all correct!
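
For readers landing here later, the subset-then-plot approach confirmed above might look like the sketch below. It assumes the PBMC example data shipped with CATALYST and that `filterSCE()` accepts a `cluster_id` filter together with `k`; the cluster IDs 1 to 3 are arbitrary placeholders, so check the calls against the documentation of your installed version.

```r
library(CATALYST)

# Example data shipped with CATALYST (an assumption; use your own sce instead).
data(PBMC_fs, PBMC_panel, PBMC_md)
sce <- prepData(PBMC_fs, PBMC_panel, PBMC_md)

# Cluster on the "type" markers: 10 x 10 = 100 initial clusters,
# plus metaclusterings up to meta20.
sce <- cluster(sce, features = "type", xdim = 10, ydim = 10, maxK = 20, seed = 1)

# Keep only cells from selected meta20 clusters (IDs are arbitrary placeholders).
sce_sub <- filterSCE(sce, cluster_id %in% c(1, 2, 3), k = "meta20")

# Compute and plot NRS on the subset; plotNRS() takes color_by, not k.
plotNRS(sce_sub, features = "type", color_by = "condition")
```

Because the scores are computed per sample from the expression values, running this on a subset yields NRS for the retained cells only.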